11. Dataset loading utilities
The UEA & UCR Time Series Classification Repository hosts many datasets for time series classification. pyts ships with a copy of a few of them and provides functions to download the others.
11.1. Simulated datasets
The make_cylinder_bell_funnel() function generates a synthetic dataset of univariate time series with three classes: cylinder, bell and funnel. This dataset was introduced by N. Saito in his PhD thesis “Local feature extraction and its application using a library of bases”.
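The time series are generated from the following distributions:

$$
\begin{aligned}
c(t) &= (6 + \eta) \cdot \mathbb{1}_{[a, b]}(t) + \epsilon(t) \\
b(t) &= (6 + \eta) \cdot \mathbb{1}_{[a, b]}(t) \cdot \frac{t - a}{b - a} + \epsilon(t) \\
f(t) &= (6 + \eta) \cdot \mathbb{1}_{[a, b]}(t) \cdot \frac{b - t}{b - a} + \epsilon(t)
\end{aligned}
$$

where:
- $t = 1, \ldots, 128$,
- $a$ is an integer-valued uniform random variable on the interval $[16, 32]$,
- $b - a$ is an integer-valued uniform random variable on the interval $[32, 96]$,
- $\eta$ and $\epsilon(t)$ are standard normal variables,
- $\mathbb{1}_{[a, b]}$ is the characteristic function on the interval $[a, b]$.

$c$, $b$, and $f$ stand for “cylinder”, “bell”, and “funnel” respectively.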
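To make the generating process concrete, here is a minimal NumPy sketch of the standard cylinder-bell-funnel definition (128 timestamps, a onset in [16, 32], a duration in [32, 96]). It is an independent illustration, not the code used by make_cylinder_bell_funnel(); the helper name cbf_series and its arguments are hypothetical.

```python
import numpy as np


def cbf_series(kind, rng, n_timestamps=128):
    """Draw one cylinder ("c"), bell ("b") or funnel ("f") time series.

    Hypothetical helper for illustration only; not part of pyts.
    """
    t = np.arange(1, n_timestamps + 1)
    a = rng.integers(16, 33)        # onset: uniform on {16, ..., 32}
    b = a + rng.integers(32, 97)    # duration b - a: uniform on {32, ..., 96}
    eta = rng.standard_normal()     # one amplitude perturbation per series
    eps = rng.standard_normal(n_timestamps)  # independent noise per timestamp
    window = ((t >= a) & (t <= b)).astype(float)  # characteristic function of [a, b]

    if kind == "c":
        shape = window                        # flat plateau
    elif kind == "b":
        shape = window * (t - a) / (b - a)    # linear ramp up
    elif kind == "f":
        shape = window * (b - t) / (b - a)    # linear ramp down
    else:
        raise ValueError("kind must be 'c', 'b' or 'f'")
    return (6 + eta) * shape + eps


rng = np.random.default_rng(0)
# Ten series per class, stacked in the order cylinder, bell, funnel.
X = np.stack([cbf_series(kind, rng) for kind in "cbf" for _ in range(10)])
y = np.repeat(np.arange(3), 10)
print(X.shape, y.shape)  # (30, 128) (30,)
```

Outside the window [a, b] each series is pure standard normal noise, so the three classes differ only in the shape of the raised segment.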
11.2. Univariate time series: UCR repository
pyts comes with a copy of three univariate time series datasets:
- load_coffee(): load the Coffee dataset,
- load_gunpoint(): load the GunPoint dataset,
- load_pig_central_venous_pressure(): load the Pig Central Venous Pressure dataset.
The characteristics of these datasets are summarized in the following table:
Type | Name | Train | Test | Class | Length |
---|---|---|---|---|---|
SPECTRO | Coffee | 28 | 28 | 2 | 286 |
MOTION | GunPoint | 50 | 150 | 2 | 150 |
HEMODYNAMICS | PigCVP | 104 | 208 | 52 | 2000 |
Three functions are made available to fetch other datasets from this repository:
- ucr_dataset_list(): return the list of available datasets,
- ucr_dataset_info(): return a dictionary with the characteristics of each dataset,
- fetch_ucr_dataset(): fetch a dataset given its name.
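The bundled loaders and the fetch functions can be combined so that a download only happens when no local copy exists. The sketch below is one possible way to do this, not a pyts feature: get_ucr_dataset and demo are hypothetical helpers, and the dictionary keys "Coffee", "GunPoint" and "PigCVP" are assumed to match the names used by the repository. It relies on the return_X_y=True option of the loaders and of fetch_ucr_dataset(), which returns the training and test splits as a tuple.

```python
def get_ucr_dataset(name, bundled, available, fetch):
    """Return (X_train, X_test, y_train, y_test) for a UCR dataset.

    Hypothetical helper for illustration only; not part of pyts.

    name      : dataset name
    bundled   : dict mapping names to loaders that need no download
    available : collection of names that can be downloaded
    fetch     : function downloading a dataset given its name
    """
    if name in bundled:
        return bundled[name](return_X_y=True)  # local copy, no network access
    if name not in available:
        raise ValueError(f"Unknown UCR dataset: {name!r}")
    return fetch(name, return_X_y=True)


def demo():
    """Use the helper with the pyts functions (requires pyts)."""
    from pyts.datasets import (
        fetch_ucr_dataset,
        load_coffee,
        load_gunpoint,
        load_pig_central_venous_pressure,
        ucr_dataset_list,
    )

    # Assumption: these keys match the dataset names used by the repository.
    bundled = {
        "Coffee": load_coffee,
        "GunPoint": load_gunpoint,
        "PigCVP": load_pig_central_venous_pressure,
    }
    X_train, X_test, y_train, y_test = get_ucr_dataset(
        "GunPoint", bundled, ucr_dataset_list(), fetch_ucr_dataset
    )
    print(X_train.shape, X_test.shape)
```

Calling demo() with pyts installed prints the shapes of the GunPoint training and test sets. Passing the loaders and the fetch function as arguments keeps the helper easy to test without network access.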
11.3. Multivariate time series: UEA repository
pyts comes with a copy of one multivariate time series dataset:
- load_basic_motions(): load the Basic Motions dataset.
Three functions are made available to fetch other datasets from this repository:
- uea_dataset_list(): return the list of available datasets,
- uea_dataset_info(): return a dictionary with the characteristics of each dataset,
- fetch_uea_dataset(): fetch a dataset given its name.
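Multivariate datasets add a feature axis to the data. The following sketch summarizes such a dataset, assuming the array layout (n_samples, n_features, n_timestamps) and the return_X_y=True option of load_basic_motions(). The helper describe_multivariate, its dictionary keys and the demo function are hypothetical and only serve as an illustration.

```python
import numpy as np


def describe_multivariate(X, y):
    """Summarize a multivariate time series dataset.

    Hypothetical helper for illustration only; not part of pyts.
    Assumes X has shape (n_samples, n_features, n_timestamps).
    """
    X = np.asarray(X)
    if X.ndim != 3:
        raise ValueError("X must be 3-dimensional")
    n_samples, n_features, n_timestamps = X.shape
    return {
        "n_samples": n_samples,
        "n_features": n_features,
        "n_timestamps": n_timestamps,
        "n_classes": len(np.unique(y)),  # number of distinct labels
    }


def demo():
    """Describe the bundled Basic Motions training set (requires pyts)."""
    from pyts.datasets import load_basic_motions

    X_train, X_test, y_train, y_test = load_basic_motions(return_X_y=True)
    print(describe_multivariate(X_train, y_train))
```

With this layout, X[:, j, :] selects the j-th feature of every sample as a two-dimensional array, which can then be handled like a univariate dataset.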