.. _transformation:

====================================
Extracting features from time series
====================================

.. currentmodule:: pyts.transformation

Standard machine learning algorithms are not always well suited for raw
time series because they cannot capture the high correlation between
consecutive time points: treating time points as features may not be optimal.
Therefore, algorithms that extract features from time series have been
developed. These algorithms transform a dataset of time series with shape
``(n_samples, n_timestamps)`` into a dataset of features with shape
``(n_samples, n_features)`` that can be used to fit a standard classifier.
They can be found in the :mod:`pyts.transformation` module.
The following sections describe the algorithms made available.


ShapeletTransform
-----------------

:class:`ShapeletTransform` is a shapelet-based approach to extract features.
A shapelet is defined as a contiguous subsequence of a time series.
The distance between a shapelet and a time series is defined as the minimum
of the distances between this shapelet and all the subsequences of identical
length extracted from this time series. :class:`ShapeletTransform` extracts
the ``n_shapelets`` most discriminative shapelets given a criterion (mutual
information or F-scores) from a dataset of time series when ``fit`` is called.
The indices of the selected shapelets are made available via the ``indices_``
attribute.
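
The distance defined above can be sketched with NumPy. This is a minimal
illustration, not the implementation used by the class: the helper name
``shapelet_distance`` is hypothetical, and the Euclidean distance is an
assumption (the metric used internally may be scaled differently)::

    import numpy as np
    from numpy.lib.stride_tricks import sliding_window_view

    def shapelet_distance(shapelet, series):
        """Minimum distance between a shapelet and all same-length windows."""
        shapelet = np.asarray(shapelet, dtype=float)
        series = np.asarray(series, dtype=float)
        # All contiguous subsequences of the series with the shapelet's length
        windows = sliding_window_view(series, shapelet.size)
        # Euclidean distance to each window (assumed metric), then the minimum
        return np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min()

    shapelet = [3, 4, 3]
    # This time series contains the shapelet exactly
    d_match = shapelet_distance(shapelet, [0, 2, 3, 4, 3, 2, 1])
    # This one does not
    d_other = shapelet_distance(shapelet, [2, 1, 0, 2, 1, 5, 4])

A time series that contains the shapelet exactly has a distance of zero,
while larger values indicate that no window closely resembles the shapelet.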

.. figure:: ../auto_examples/transformation/images/sphx_glr_plot_shapelet_transform_001.png
   :target: ../auto_examples/transformation/plot_shapelet_transform.html
   :align: center
   :scale: 80%

:class:`ShapeletTransform` derives the distances between the selected shapelets
and a dataset of time series when ``transform`` is called. ``fit_transform``
is an optimized version of ``fit`` followed by ``transform``: the distances
between the shapelets and the time series are already computed when ``fit`` is
called, so they do not need to be computed a second time::

    >>> from pyts.transformation import ShapeletTransform
    >>> X = [[0, 2, 3, 4, 3, 2, 1],
    ...      [0, 1, 3, 4, 3, 4, 5],
    ...      [2, 1, 0, 2, 1, 5, 4],
    ...      [1, 2, 2, 1, 0, 3, 5]]
    >>> y = [0, 0, 1, 1]
    >>> st = ShapeletTransform(n_shapelets=2, window_sizes=[3])
    >>> X_new = st.fit_transform(X, y)
    >>> X_new.shape
    (4, 2)

Classification can be performed with any standard classifier. In the example
below, we use a Support Vector Machine with a linear kernel::

    >>> import numpy as np
    >>> from pyts.transformation import ShapeletTransform
    >>> from pyts.datasets import load_gunpoint
    >>> from sklearn.pipeline import make_pipeline
    >>> from sklearn.svm import LinearSVC
    >>> X_train, X_test, y_train, y_test = load_gunpoint(return_X_y=True)
    >>> shapelet = ShapeletTransform(window_sizes=np.arange(10, 130, 3), random_state=42)
    >>> svc = LinearSVC()
    >>> clf = make_pipeline(shapelet, svc)
    >>> clf.fit(X_train, y_train) # doctest: +ELLIPSIS
    Pipeline(...)
    >>> clf.score(X_test, y_test) # doctest: +ELLIPSIS
    0.966...

.. topic:: References

    * J. Lines, L. M. Davis, J. Hills and A. Bagnall, "A Shapelet Transform for
      Time Series Classification". Data Mining and Knowledge Discovery,
      289-297 (2012).


BOSS
----

BOSS stands for **B**\ ag **O**\ f **S**\ ymbolic-Fourier-Approximation
**S**\ ymbols. :class:`BOSS` extracts words from time series using the
:ref:`approximation_sfa` algorithm and derives their frequencies for each time
series.
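
The counting step can be illustrated without the SFA transformation. In the
sketch below, the words are made up for illustration (they are not produced by
:class:`BOSS`), ``vocabulary`` and ``bag_of_words`` are hypothetical names, and
any refinement applied by the actual algorithm is ignored::

    import numpy as np

    # Hypothetical words extracted from two time series
    words_per_series = [
        ["aa", "ab", "aa", "bb"],
        ["bb", "bb", "ba", "ab"],
    ]

    # Map each feature index to a word, in sorted order
    all_words = sorted({w for words in words_per_series for w in words})
    vocabulary = dict(enumerate(all_words))
    word_to_index = {word: idx for idx, word in vocabulary.items()}

    # Count how often each word occurs in each time series
    bag_of_words = np.zeros((len(words_per_series), len(vocabulary)), dtype=int)
    for i, words in enumerate(words_per_series):
        for word in words:
            bag_of_words[i, word_to_index[word]] += 1

Each row of ``bag_of_words`` is the feature vector of one time series, and
each column corresponds to one word of the vocabulary.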

.. figure:: ../auto_examples/transformation/images/sphx_glr_plot_boss_001.png
   :target: ../auto_examples/transformation/plot_boss.html
   :align: center
   :scale: 80%

The ``vocabulary_`` attribute is a mapping from the feature indices to the
corresponding words::

    >>> from pyts.datasets import load_gunpoint
    >>> from pyts.transformation import BOSS
    >>> X_train, X_test, _, _ = load_gunpoint(return_X_y=True)
    >>> boss = BOSS(word_size=2, n_bins=2, sparse=False)
    >>> boss.fit(X_train) # doctest: +ELLIPSIS
    BOSS(...)
    >>> sorted(boss.vocabulary_.values())
    ['aa', 'ab', 'ba', 'bb']
    >>> boss.transform(X_test) # doctest: +ELLIPSIS
    array(...)

Classification can be performed with any standard classifier. In the example
below, we use a k-nearest neighbors classifier with the
:func:`pyts.metrics.boss` metric::

    >>> from pyts.datasets import load_gunpoint
    >>> from pyts.transformation import BOSS
    >>> from pyts.classification import KNeighborsClassifier
    >>> from sklearn.pipeline import make_pipeline
    >>> X_train, X_test, y_train, y_test = load_gunpoint(return_X_y=True)
    >>> boss = BOSS(word_size=8, window_size=40, norm_mean=True, drop_sum=True, sparse=False)
    >>> knn = KNeighborsClassifier(metric='boss')
    >>> clf = make_pipeline(boss, knn)
    >>> clf.fit(X_train, y_train) # doctest: +ELLIPSIS
    Pipeline(...)
    >>> clf.score(X_test, y_test)
    1.0
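
The BOSS metric compares two word histograms using only the words that occur
in the first one, which makes it non-symmetric. The NumPy sketch below
illustrates this idea and is not the library's code: ``boss_distance`` is a
hypothetical name, and taking the square root is an assumption that does not
change the ranking of the nearest neighbors::

    import numpy as np

    def boss_distance(x, y):
        """BOSS distance between two histograms of word counts."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        # Only the words that occur in the first histogram contribute
        present = x > 0
        return np.sqrt(np.sum((x[present] - y[present]) ** 2))

    x = [2, 1, 0, 1]
    y = [0, 1, 1, 2]
    d_xy = boss_distance(x, y)  # uses the words at indices 0, 1 and 3
    d_yx = boss_distance(y, x)  # uses the words at indices 1, 2 and 3

Swapping the arguments generally changes the result, because a different set
of words is taken into account.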

.. topic:: References

    * P. Schäfer, "The BOSS is concerned with time series classification
      in the presence of noise". Data Mining and Knowledge Discovery,
      29(6), 1505-1530 (2015).

.. _transformation_weasel:

WEASEL
------

WEASEL stands for **W**\ ord **E**\ xtr\ **A**\ ction for time **SE**\ ries
c\ **L**\ assification. While :class:`BOSS` extracts words with a single sliding
window, :class:`WEASEL` extracts words with several sliding windows of different
sizes, and selects the most discriminative words according to the chi-squared
test. The ``vocabulary_`` attribute is a mapping from the feature indices to the
corresponding words.
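
The selection step can be sketched on a small matrix of word counts. The
counts, the labels and the threshold of 2 below are illustrative assumptions,
and ``chi2_scores`` is a hypothetical helper that computes the chi-squared
statistic for non-negative count features; it is not the code used by
:class:`WEASEL`::

    import numpy as np

    def chi2_scores(X, y):
        """Chi-squared statistic of each count feature with respect to y."""
        X = np.asarray(X, dtype=float)
        classes = np.unique(y)
        # One-hot encoding of the labels, shape (n_samples, n_classes)
        Y = (np.asarray(y)[:, None] == classes).astype(float)
        # Observed counts per class and word
        observed = Y.T @ X
        # Counts expected if words were independent of the class
        expected = np.outer(Y.mean(axis=0), X.sum(axis=0))
        # Assumes every word occurs at least once (no zero expected count)
        return ((observed - expected) ** 2 / expected).sum(axis=0)

    # Rows are time series, columns are word frequencies (made-up values)
    X_counts = [[3, 1, 0],
                [2, 1, 0],
                [0, 1, 2],
                [0, 1, 3]]
    y = [0, 0, 1, 1]
    scores = chi2_scores(X_counts, y)
    # Keep the words whose score exceeds the illustrative threshold
    selected = np.flatnonzero(scores > 2)

The second word occurs equally often in both classes, so its score is zero
and it is discarded, while the first and third words are kept.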

.. figure:: ../auto_examples/transformation/images/sphx_glr_plot_weasel_001.png
   :target: ../auto_examples/transformation/plot_weasel.html
   :align: center
   :scale: 80%

For new input data, the frequencies of each selected word are derived::

    >>> from pyts.datasets import load_gunpoint
    >>> from pyts.transformation import WEASEL
    >>> X_train, X_test, y_train, _ = load_gunpoint(return_X_y=True)
    >>> weasel = WEASEL(sparse=False)
    >>> weasel.fit(X_train, y_train) # doctest: +ELLIPSIS
    WEASEL(...)
    >>> len(weasel.vocabulary_)
    73
    >>> weasel.transform(X_test).shape
    (150, 73)

Classification can be performed with any standard classifier. In the example
below, we use logistic regression::

    >>> import numpy as np
    >>> from pyts.transformation import WEASEL
    >>> from pyts.datasets import load_gunpoint
    >>> from sklearn.pipeline import make_pipeline
    >>> from sklearn.linear_model import LogisticRegression
    >>> X_train, X_test, y_train, y_test = load_gunpoint(return_X_y=True)
    >>> weasel = WEASEL(word_size=4, window_sizes=np.arange(5, 149))
    >>> logistic = LogisticRegression(solver='liblinear')
    >>> clf = make_pipeline(weasel, logistic)
    >>> clf.fit(X_train, y_train) # doctest: +ELLIPSIS
    Pipeline(...)
    >>> clf.score(X_test, y_test)
    0.96

.. topic:: References

    * P. Schäfer and U. Leser, "Fast and Accurate Time Series Classification
      with WEASEL". Conference on Information and Knowledge Management,
      637-646 (2017).