pyts.datasets.fetch_ucr_dataset
pyts.datasets.fetch_ucr_dataset(dataset, use_cache=True, data_home=None, return_X_y=False)
Fetch dataset from UCR TSC Archive by name.
Fetched data sets are automatically saved in the pyts/datasets/_cached_datasets folder. To avoid downloading the same data set several times, it is highly recommended not to change the default values of use_cache and data_home.
Parameters:
- dataset : str
Name of the dataset.
- use_cache : bool (default = True)
If True, check whether the data set has already been fetched and, if so, load the cached version. If False, download the data set from the UCR Time Series Classification Archive.
- data_home : None or str (default = None)
The path of the folder containing the cached data set. If None, the pyts/datasets/cached_datasets/UCR/ folder is used. If the data set is not found, it is downloaded and cached in this path.
- return_X_y : bool (default = False)
If True, returns (data_train, data_test, target_train, target_test) instead of a Bunch object. See below for more information about the data and target objects.
Returns:
- data : Bunch
Dictionary-like object, with attributes:
- data_train : array of floats
The time series in the training set.
- data_test : array of floats
The time series in the test set.
- target_train : array of integers
The classification labels in the training set.
- target_test : array of integers
The classification labels in the test set.
- DESCR : str
The full description of the dataset.
- url : str
The URL of the dataset.
- (data_train, data_test, target_train, target_test) : tuple if return_X_y is True
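Examples
The two return forms described above can be sketched as follows. This is an illustrative sketch, not part of the library: summarize_split is a hypothetical helper, and "GunPoint" is assumed to be a valid name in the archive. The fetch itself requires pyts to be installed and, on the first call, network access.

```python
from collections import Counter


def summarize_split(X, y):
    """Return (n_samples, n_timestamps, class_counts) for one split.

    Hypothetical helper for illustration; works on any 2D array-like X
    and 1D array-like y of integer labels.
    """
    n_samples = len(X)
    n_timestamps = len(X[0]) if n_samples else 0
    class_counts = dict(Counter(int(label) for label in y))
    return n_samples, n_timestamps, class_counts


if __name__ == "__main__":
    # Imported here so the helper above stays usable without pyts.
    from pyts.datasets import fetch_ucr_dataset

    # Default form: a Bunch with the attributes listed under Returns.
    # "GunPoint" is an assumed dataset name.
    bunch = fetch_ucr_dataset("GunPoint")
    print(summarize_split(bunch.data_train, bunch.target_train))

    # Tuple form: the order follows the documented return value.
    X_train, X_test, y_train, y_test = fetch_ucr_dataset(
        "GunPoint", return_X_y=True
    )
    print(summarize_split(X_test, y_test))
```

With the default use_cache=True, the second call should load the cached copy rather than download the data set again.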
Notes
Missing values are represented as NaN.
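Because missing values arrive as NaN, it can help to check for them before fitting a model. The sketch below uses only the standard library; count_missing is a hypothetical helper, not a pyts function, and it assumes each series is an iterable of floats.

```python
import math


def count_missing(X):
    """Count NaN entries in each series of a 2D array-like of floats.

    Hypothetical helper for illustration only.
    """
    return [sum(1 for value in series if math.isnan(value)) for series in X]


if __name__ == "__main__":
    # Small synthetic example; real input would be data_train or data_test.
    X = [[1.0, float("nan"), 3.0], [4.0, 5.0, 6.0]]
    print(count_missing(X))
```

On NumPy arrays, np.isnan(X).sum(axis=1) gives the same per-series counts.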
References
[1] H. A. Dau et al., “The UCR Time Series Archive”. arXiv:1810.07758 [cs, stat], 2018.
[2] A. Bagnall et al., “The UEA & UCR Time Series Classification Repository”, www.timeseriesclassification.com.