pyts.classification.LearningShapelets

class pyts.classification.LearningShapelets(n_shapelets_per_size=0.2, min_shapelet_length=0.1, shapelet_scale=3, penalty='l2', tol=0.001, C=1000, learning_rate=1.0, max_iter=1000, multi_class='multinomial', alpha=-100, fit_intercept=True, intercept_scaling=1.0, class_weight=None, verbose=0, random_state=None, n_jobs=None)

Learning Shapelets algorithm.

This estimator consists of two steps: computing the distances between the shapelets and the time series, then computing a logistic regression using these distances as features. This algorithm learns the shapelets as well as the coefficients of the logistic regression.
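The two steps can be sketched in plain Python. This is only an illustration under stated assumptions: the shapelets and coefficients below are hypothetical hand-picked values (the estimator learns both by gradient descent), the distance is taken as the mean squared difference over sliding windows as in [1], and a hard minimum is used for readability where the estimator uses a soft minimum controlled by alpha.

```python
import math


def shapelet_distance(series, shapelet):
    """Step 1: smallest mean squared difference between the shapelet
    and any window of the same length in the time series."""
    length = len(shapelet)
    distances = []
    for start in range(len(series) - length + 1):
        window = series[start:start + length]
        squared = sum((w - s) ** 2 for w, s in zip(window, shapelet))
        distances.append(squared / length)
    return min(distances)


def predict_probability(series, shapelets, weights, bias):
    """Step 2: logistic regression on the shapelet distances (binary case)."""
    features = [shapelet_distance(series, s) for s in shapelets]
    score = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical shapelets and coefficients, chosen by hand for illustration.
shapelets = [[2, 3], [0, 2, 0]]
series = [0, 1, 2, 2, 1, 2, 2]

features = [shapelet_distance(series, s) for s in shapelets]
proba = predict_probability(series, shapelets, weights=[-1.0, 1.0], bias=0.0)
print(features, proba)
```

Each time series is reduced to one feature per shapelet, so the logistic regression has as many coefficients as there are shapelets.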

Parameters:
n_shapelets_per_size : int or float (default = 0.2)

Number of shapelets per size. If float, it represents a fraction of the number of timestamps and the number of shapelets per size is equal to ceil(n_shapelets_per_size * n_timestamps).

min_shapelet_length : int or float (default = 0.1)

Minimum length of the shapelets. If float, it represents a fraction of the number of timestamps and the minimum length of the shapelets is equal to ceil(min_shapelet_length * n_timestamps).

shapelet_scale : int (default = 3)

The different scales for the lengths of the shapelets. The lengths of the shapelets are equal to min_shapelet_length * np.arange(1, shapelet_scale + 1). The total number of shapelets (and features) is equal to n_shapelets_per_size * shapelet_scale.
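As a worked example of the arithmetic described for n_shapelets_per_size, min_shapelet_length and shapelet_scale, the sketch below resolves the three parameters for a given number of timestamps. The helper name shapelet_layout is hypothetical and not part of pyts; it only mirrors the formulas stated above.

```python
import math


def shapelet_layout(n_timestamps, n_shapelets_per_size=0.2,
                    min_shapelet_length=0.1, shapelet_scale=3):
    """Return (shapelet lengths, total number of shapelets)."""
    # Floats are fractions of the number of timestamps, rounded up.
    if isinstance(n_shapelets_per_size, float):
        n_per_size = math.ceil(n_shapelets_per_size * n_timestamps)
    else:
        n_per_size = n_shapelets_per_size
    if isinstance(min_shapelet_length, float):
        min_length = math.ceil(min_shapelet_length * n_timestamps)
    else:
        min_length = min_shapelet_length
    # Same values as min_length * np.arange(1, shapelet_scale + 1).
    lengths = [min_length * k for k in range(1, shapelet_scale + 1)]
    return lengths, n_per_size * shapelet_scale


lengths, n_features = shapelet_layout(n_timestamps=7)
print(lengths, n_features)
```

With the default parameters and 7 timestamps, this gives lengths 1, 2 and 3 and 6 shapelets in total, which is consistent with the coef_ shape of (1, 6) shown in the Examples section.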

penalty : ‘l1’ or ‘l2’ (default = ‘l2’)

Used to specify the norm used in the penalization.

tol : float (default = 1e-3)

Tolerance for stopping criterion.

C : float (default = 1000)

Inverse of regularization strength. It must be a positive float. Smaller values specify stronger regularization.

learning_rate : float (default = 1.)

Learning rate for gradient descent optimization. It must be a positive float. Note that the learning rate will be automatically decreased if the loss function is not decreasing.

max_iter : int (default = 1000)

Maximum number of iterations for gradient descent algorithm.

multi_class : {‘multinomial’, ‘ovr’, ‘ovo’} (default = ‘multinomial’)

Strategy for multiclass classification. ‘multinomial’ stands for multinomial cross-entropy loss. ‘ovr’ stands for one-vs-rest strategy. ‘ovo’ stands for one-vs-one strategy. Ignored if the classification task is binary.

alpha : float (default = -100)

Scaling term in the softmin function. The lower the value, the more precise the soft minimum will be. The default value should be suitable for standardized time series.
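The effect of alpha can be illustrated with the soft-minimum form used in [1], a weighted average of the distances with weights exp(alpha * d). The function below is a sketch under the assumption that the estimator uses this same form; soft_min is a hypothetical helper, not a pyts function.

```python
import math


def soft_min(distances, alpha=-100.0):
    """Differentiable approximation of min(distances) for alpha < 0."""
    smallest = min(distances)
    # Subtracting the smallest distance keeps every exponent <= 0,
    # which avoids overflow without changing the result.
    weights = [math.exp(alpha * (d - smallest)) for d in distances]
    weighted_sum = sum(d * w for d, w in zip(distances, weights))
    return weighted_sum / sum(weights)


distances = [0.5, 0.2, 0.9]
sharp = soft_min(distances, alpha=-100.0)  # very close to the true minimum
loose = soft_min(distances, alpha=-1.0)    # pulled toward the larger values
print(sharp, loose)
```

A strongly negative alpha concentrates almost all the weight on the smallest distance, while a value closer to zero moves the result toward the plain average.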

fit_intercept : bool (default = True)

Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.

intercept_scaling : float (default = 1.)

Scaling of the intercept. Only used if fit_intercept=True.

class_weight : dict, None or ‘balanced’ (default = None)

Weights associated with classes in the form {class_label: weight}. If not given, all classes are assumed to have unit weight. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).
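The “balanced” formula can be checked by hand. The sketch below reproduces n_samples / (n_classes * np.bincount(y)) with the standard library only; balanced_class_weight is a hypothetical helper written for this illustration.

```python
from collections import Counter


def balanced_class_weight(y):
    """Weights inversely proportional to class frequencies."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in sorted(counts.items())}


weights = balanced_class_weight([0, 1, 0])
print(weights)
```

For the labels [0, 1, 0], class 0 appears twice and class 1 once, so the rarer class 1 receives twice the weight of class 0.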

verbose : int (default = 0)

Controls the verbosity. It must be a non-negative integer. If positive, the loss at each iteration is printed.

random_state : None, int or RandomState instance (default = None)

The seed of the pseudo random number generator to use when shuffling the data. If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator. If None, the random number generator is the RandomState instance used by np.random.

n_jobs : None or int (default = None)

The number of jobs to use for the computation. Only used if multi_class is ‘ovr’ or ‘ovo’.

Notes

The number of tasks (n_tasks) depends on the value of multi_class and the number of classes. If there are two classes, the number of tasks is equal to 1. If there are more than two classes, the number of tasks is equal to:

  • 1 if multi_class='multinomial'
  • n_classes if multi_class='ovr'
  • n_classes * (n_classes - 1) / 2 if multi_class='ovo'
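The rule above can be written as a small function. The name n_tasks follows the notation of this note; the function itself is an illustrative sketch and not part of the pyts API.

```python
def n_tasks(n_classes, multi_class="multinomial"):
    """Number of tasks for a given number of classes and strategy."""
    # multi_class is ignored for binary problems.
    if n_classes == 2 or multi_class == "multinomial":
        return 1
    if multi_class == "ovr":
        return n_classes
    if multi_class == "ovo":
        return n_classes * (n_classes - 1) // 2
    raise ValueError("multi_class must be 'multinomial', 'ovr' or 'ovo'")


for strategy in ("multinomial", "ovr", "ovo"):
    print(strategy, n_tasks(4, strategy))
```

With four classes, the three strategies give 1, 4 and 6 tasks respectively, which is also the first dimension of n_iter_.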

References

[1] J. Grabocka, N. Schilling, M. Wistuba and L. Schmidt-Thieme, “Learning Time-Series Shapelets”. International Conference on Data Mining, 14, 392-401 (2014).

Examples

>>> from pyts.classification import LearningShapelets
>>> X = [[1, 2, 2, 1, 2, 3, 2],
...      [0, 2, 0, 2, 0, 2, 3],
...      [0, 1, 2, 2, 1, 2, 2]]
>>> y = [0, 1, 0]
>>> clf = LearningShapelets(random_state=42, tol=0.01)
>>> clf.fit(X, y)
LearningShapelets(...)
>>> clf.coef_.shape
(1, 6)
Attributes:
classes_ : array, shape = (n_classes,)

An array of class labels known to the classifier.

shapelets_ : array, shape = (n_tasks, n_shapelets)

Learned shapelets. Each element of this array is a learned shapelet.

coef_ : array, shape = (n_tasks, n_shapelets) or (n_classes, n_shapelets)

Coefficients for each shapelet in the decision function.

intercept_ : array, shape = (n_tasks,) or (n_classes,)

Intercepts (a.k.a. biases) added to the decision function. If fit_intercept=False, the intercepts are set to zero.

n_iter_ : array, shape = (n_tasks,)

Actual number of iterations.

Methods

__init__([n_shapelets_per_size, …]) Initialize self.
decision_function(X) Decision function scores.
fit(X, y[, sample_weight]) Fit the model according to the given training data.
get_params([deep]) Get parameters for this estimator.
predict(X) Predict the class labels for the provided data.
predict_proba(X) Probability estimates.
score(X, y[, sample_weight]) Return the mean accuracy on the given test data and labels.
set_params(**params) Set the parameters of this estimator.
__init__(n_shapelets_per_size=0.2, min_shapelet_length=0.1, shapelet_scale=3, penalty='l2', tol=0.001, C=1000, learning_rate=1.0, max_iter=1000, multi_class='multinomial', alpha=-100, fit_intercept=True, intercept_scaling=1.0, class_weight=None, verbose=0, random_state=None, n_jobs=None)

Initialize self. See help(type(self)) for accurate signature.

decision_function(X)

Decision function scores.

Parameters:
X : array-like, shape = (n_samples, n_timestamps)

Test samples.

Returns:
T : array, shape = (n_samples,) or (n_samples, n_classes)

Decision function scores for each sample for each class in the model, where classes are ordered as they are in self.classes_.

fit(X, y, sample_weight=None)

Fit the model according to the given training data.

Parameters:
X : array-like, shape = (n_samples, n_timestamps)

Training vector.

y : array-like, shape = (n_samples,)

Class labels for each data sample.

sample_weight : None or array-like, shape = (n_samples,) (default = None)

Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.

Returns:
self : object
get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

predict(X)

Predict the class labels for the provided data.

Parameters:
X : array-like, shape = (n_samples, n_timestamps)

Test samples.

Returns:
y_pred : array-like, shape = (n_samples,)

Class labels for each data sample.

predict_proba(X)

Probability estimates.

Parameters:
X : array-like, shape = (n_samples, n_timestamps)

Test samples.

Returns:
T : array, shape = (n_samples, n_classes)

Probability of the samples for each class in the model, where classes are ordered as they are in self.classes_.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

Parameters:
X : array-like, shape = (n_samples, n_timestamps)

Univariate time series.

y : array-like, shape = (n_samples,)

True labels for X.

sample_weight : None or array-like, shape = (n_samples,) (default = None)

Sample weights.

Returns:
score : float

Mean accuracy of self.predict(X) with respect to y.
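Mean accuracy is the (optionally weighted) fraction of samples whose predicted label equals the true label. The sketch below shows that computation on already-predicted labels; mean_accuracy is a hypothetical helper, and the predictions are made-up values rather than output of the estimator.

```python
def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of predictions that match the true labels."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)  # unit weight per sample
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight)
                  if t == p)
    return correct / sum(sample_weight)


y_true = [0, 1, 0]
y_pred = [0, 1, 1]  # made-up predictions: the last sample is wrong
unweighted = mean_accuracy(y_true, y_pred)
weighted = mean_accuracy(y_true, y_pred, sample_weight=[1, 1, 2])
print(unweighted, weighted)
```

Giving the misclassified sample a larger weight lowers the score, here from 2/3 to 0.5.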

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

Examples using pyts.classification.LearningShapelets

Learning Time-Series Shapelets