User guide

Introduction

A time series is a sequence of values indexed in time order. Given their nature, they are very common in many real-world applications. With the high availability of sensors and the development of Internet of things devices, the amount of time series data and the number of applications become bigger and bigger. Traditional domains using this type of data include finance and econometrics, and these domains have recently been joined by smart grid, earthquake prediction and weather forecasting.

One specific analysis is time series classification: given a time series and a set of classes, one would like to classify this time series. Real-world problems include disease detection using electrocardiogram data, household device classification to reduce carbon footprint, and image classification. Standard machine learning classification is not always well suited for time series because of the possibly high correlation between back-to-back time points. One typical example is the Naive Bayes algorithm, which assumes a conditional independence between each feature given the class. For this reason, algorithms dedicated to time series classification have been developed.

As the Python programming language is becoming more and more popular in the fields of machine learning and data science, the objective of the pyts Python package is to make time series classification easily accessible by providing preprocessing and utility tools, and implementations of state-of-the-art algorithms.

In the following sections we will present the algorithms implemented in pyts. For more information about the algorithms, please have a look at the references and the Introductory examples section. For more information about their implementations in pyts, please have a look at the API Documentation section.

Preprocessing

In machine learning, it is standard to perform some preprocessing on raw data. The pyts.preprocessing module consists of tools to preprocess raw time series.

pyts provides most of the preprocessing tools from scikit-learn, but instead of applying them column-wise (i.e. independently to each feature), they are applied row-wise (i.e. independently to each sample). Available tools are pyts.preprocessing.StandardScaler, pyts.preprocessing.MinMaxScaler, pyts.preprocessing.MaxAbsScaler, pyts.preprocessing.RobustScaler, pyts.preprocessing.PowerTransformer, pyts.preprocessing.QuantileTransformer and pyts.preprocessing.KBinsDiscretizer.

pyts also provides a tool to impute missing values using interpolation: pyts.preprocessing.InterpolationImputer.

Approximation

Time series can be of huge size or be very noisy. It can be useful to sum up the most important information of each time series. Implemented algorithms to approximate a time series can be found in the pyts.approximation module.

The first algorithm is Piecewise Aggregate Approximation (PAA). The main idea of this algorithm is to apply windows along a time series and to take the mean value in each window. It is implemented as pyts.approximation.PiecewiseAggregateApproximation.

The second algorithm is Symbolic Aggregate approXimation (SAX). For each time series, bins are computed using a given strategy. Then each datapoint is replaced by the index of the bin it belongs to. It is implemented as pyts.approximation.SymbolicAggregateApproximation. It is similar to pyts.preprocessing.KBinsDiscretizer.

The third algorithm is Discrete Fourier Transform (DFT). The idea is to approximate a time series with a subsample of its Fourier coefficients. The selected Fourier coefficients are either the first ones (as they represent the trend of the time series) or the ones that discriminate the different classes the most if a vector of class labels is provided. It is implemented as pyts.approximation.DiscreteFourierTransform.

The fourth algorithm is Multiple Coefficient Binning (MCB). The idea is very similar to SAX and the difference is that the discretization is done column-wise instead of row-wise. It is implemented as pyts.approximation.MultipleCoefficientBinning.

The fifth algorithm is Symbolic Fourier Approximation (SFA). It performs DFT followed MCB, i.e. the selected Fourier coefficients of each time series are discretized. It is implemented as pyts.approximation.SymbolicFourierApproximation.

References

  • Eamonn J. Keogh and Michael J. Pazzani. A simple dimensionality reduction technique for fast similarity search in large time series databases. Knowledge Discovery and Data Mining ,2000.
  • Christos Faloutsos, M. Ranganathan and Yannis Manolopoulos. Fast Subsequence Matching in Time-Series Databases. ACM SIGMOD Record, 2000.
  • Jessica Lin, Eamonn Keogh, Li Wei, and Stefano Lonardi. Experiencing SAX: a Novel Symbolic Representation of Time Series. Data Mining and Knowledge Discovery, 2007.
  • Patrick Schäfer and Mikael Högqvist. SFA: A Symbolic Fourier Approximation and Index for Similarity Search in High Dimensional Datasets. ACM International Conference Proceeding Series, 2012.

Bag of Words

Now that you know how you can transform a time series of real numbers into a sequence of symbols, it’s time to create bag of words. These algorithms are can be found in the pyts.bag_of_words module.

The only algorithm is Bag of Words (BOW). It applies a sliding window of fixed length along the sequence of symbols to create words. It is implemented as pyts.bag_of_words.BagOfWords.

Metrics

It is often of interest to be able to compare time series. However, standard metrics like the Euclidean distance are not always well-suited for time series. To tackle this issue, metrics specific to time series have been developed. pyts provides implementations for some of them in the pyts.metrics module.

The most famous metric is Dynamic Time Warping (DTW). It computes the Euclidean distance on the optimal path between two time series. This metric is computationally expensive, thus several variants of DTW have been developed. The ones available in pyts are DTW with a region constraint (Sakoe-Chiba band, Itakura parallelogram), MultiscaleDTW and FastDTW, as well as the classic DTW. Classic DTW and its variants can all be used with a single function: pyts.metrics.dtw().

Another metric available in this package is the BOSS metric. This metric has been introduced with the BOSS algorithm (see below) and computes the Euclidean distance between two time series, but only using the indices where the first time series is not equal to zero. This metric is usually not applied on time series directly, but after the transformation from the BOSS algorithm, where each time series is replaced with its histogram of words.

References

  • Meinard Müller. Dynamic Time Warping (DTW). Information Retrieval for Music and Motion, 2007.
  • Patrick Schäfer. The BOSS is concerned with time series classification in the presence of noise. Data Mining and Knowledge Discovery, 2015.

Transformation

The pyts.transformation module consists of more complex algorithms that transform a dataset of raw time series with shape (n_samples, n_timestamps) into a more standard dataset of features with shape (n_samples, n_features) that can be used as input data for a standard machine learning classification algorithm.

The first algorithm is Bag-of-SFA Symbols (BOSS). Each time series is first transformed into a bag of words using SFA and BOW. Then the frequencies of each word are computed. It is implemented as pyts.transformation.BOSS.

The second algorithm is Word ExtrAction for time SEries cLassification (WEASEL). The idea is similar to BOSS: first it transforms each time series into a bag of words, then it computes the frequencies of each word. WEASEL is more sophisticated in the sense that the selected Fourier coefficients are the most discriminative ones (based on the one-way ANOVA test), several lengths for the sliding window are used and the most discriminative features (i.e. words) are kept (based on the chi-2 test). It is implemented as pyts.transformation.WEASEL.

References

  • Patrick Schäfer. The BOSS is concerned with time series classification in the presence of noise. Data Mining and Knowledge Discovery, 2015.
  • Patrick Schäfer and Ulf Leser. Fast and Accurate Time Series Classification with WEASEL. CoRR, 2017.

Classification

The pyts.classification module consists of several classification algorithms.

The first algorithm implemented is K-Nearest Neighbors. For time series classification it is the go-to algorithm for a good baseline. The most common metrics used for time series classification are the Euclidean distance and the Dynamic Time Warping metric. It extends the implementation from scikit-learn with the metrics available in the pyts.metrics module. It is implemented as pyts.classification.KNeighborsClassifier.

The second algorithm implemented is SAX-VSM. The outline of this algorithm is to first transform raw time series into bags of words using SAX and BOW, then merge, for each class label, all the bags of words for this class label into only one bag of words, and finally compute tf-idf statistics for each bag of words. This leads to a tf-idf vector for each class label. To predict an unlabeled time series, this time series if first transformed into a term frequency vector, then the predicted label is the one giving the highest cosine similarity among the tf-idf vectors learned in the training phase. It is implemented as pyts.classification.SAXVSM.

The third algorithm implemented is Bag-of-SFA Symbols in Vector Space (BOSSVS). The outline of this algorithm is quite similar to the one of SAX-VSM but words are created using SFA instead of SAX. It is implemented as pyts.classification.BOSSVS.

References

  • Senin Pavel and Malinchik Sergey. SAX-VSM: Interpretable Time Series Classification Using SAX and Vector Space Model. Data Mining (ICDM), 2013 IEEE 13th International Conference on, pp.1175,1180, 2013.
  • Patrick Schäfer. Scalable Time Series Classification. DMKD and ECML/PKDD, 2016.

Image

Instead of transforming a time series into a bag of words, it is also possible to transform it into an image. The pyts.image module consists of several algorithms that perform that kind of transformation.

The first algorithm implemented is Recurrence Plot. It transforms a time series into a matrix where each value corresponds to the distance between two trajectories (a trajectory is a subseries, i.e. a subsequence of values of a time series). The matrix can be binarized using a threshold. It is implemented as pyts.image.RecurrencePlot.

The second algorithm implemented is Gramian Angular Field (GAF). First a time series is represented as polar coordinates. Then the time series can be transformed into a Gramian Angular Summation Field (GASF) when the cosine of the sum of the angular coordinates is computed or a Gramian Angular Difference Field (GADF) when the sine of the difference of the angular coordinates is computed. It is implemented as pyts.image.GramianAngularField

The third algorithm implemented is Markov Transition Field (MTF). The outline of the algorithm is to first quantize a time series using SAX, then to compute the Markov transition matrix (the quantized time series is seen as a Markov chain) and finally to compute the Markov transition field from the transition matrix. It is implemented as pyts.image.MarkovTransitionField.

References

  • Jean-Pierre Eckmann, Sylvie Oliffson Kamphorst and David Ruelle. Recurrence Plots of Dynamical Systems. Europhysics Letters, 1987.
  • Nima Hatami, Yann Gavet and Johan Debayle. Classification of Time-Series Images Using Deep Convolutional Neural Networks. arXiv:1710.00886 [cs], 2017.
  • Zhiguang Wang and Tim Oates. Imaging time-series to improve classification and imputation. Proceedings of the 24th International Conference on Artificial Intelligence, 2015.

Decomposition

The pyts.decomposition module consists of algorithms that decompose a time series into several time series. The idea is to distinguish the different parts of time series, such as the trend, the noise, etc.

The only algorithm implemented currently is Singular Spectrum Analysis (SSA). The outline of the algorithm is to first compute a matrix from a time series using lagged vectors, then compute the eigenvalues and eigenvectors of this matrix multiplied by its transpose, then compute the eigenmatrices and finally compute the time series for each eigenmatrice. It is implemented as pyts.decomposition.SingularSpectrumAnalysis.

References

  • Nina Golyandina and Anatoly Zhigljavsky. Singular Spectrum Analysis for Time Series. 2013