`pyts.transformation`.WEASEL¶

class pyts.transformation.WEASEL(word_size=4, n_bins=4, window_sizes=[0.1, 0.3, 0.5, 0.7, 0.9], window_steps=None, anova=True, drop_sum=True, norm_mean=True, norm_std=True, strategy='entropy', chi2_threshold=2, sparse=True, alphabet=None)[source]¶

Word ExtrAction for time SEries cLassification.

Parameters:

word_size : int (default = 4)

Size of each word.

n_bins : int (default = 4)

The number of bins to produce. It must be between 2 and 26.

window_sizes : array-like (default = [0.1, 0.3, 0.5, 0.7, 0.9])

Size of the sliding windows. All the elements must be either integers or floats. In the latter case, each element represents the percentage of the size of each time series and must be between 0 and 1; the size of the sliding windows will be computed as np.ceil(window_sizes * n_timestamps).

window_steps : None or array-like (default = None)

Step of the sliding windows. If None, each window_step is equal to window_size so that the windows are non-overlapping. Otherwise, all the elements must be either integers or floats. In the latter case, each element represents the percentage of the size of each time series and must be between 0 and 1; the step of the sliding windows will be computed as np.ceil(window_steps * n_timestamps).

anova : bool (default = True)

If True, the Fourier coefficient selection is done via a one-way ANOVA test. If False, the first Fourier coefficients are selected.

drop_sum : bool (default = True)

If True, the first Fourier coefficient (i.e. the sum of the subseries) is dropped. Otherwise, it is kept.

norm_mean : bool (default = True)

If True, center each subseries before scaling.

norm_std : bool (default = True)

If True, scale each subseries to unit variance.

strategy : str (default = ‘entropy’)

Strategy used to define the widths of the bins:

‘uniform’: All bins in each sample have identical widths
‘quantile’: All bins in each sample have the same number of points
‘normal’: Bin edges are quantiles from a standard normal distribution
‘entropy’: Bin edges are computed using information gain

chi2_threshold : int or float (default = 2)

The threshold used to perform feature selection. Only the words with a chi2 statistic above this threshold will be kept.

sparse : bool (default = True)

Return a sparse matrix if True, else return an array.

alphabet : None, ‘ordinal’ or array-like, shape = (n_bins,)

Alphabet to use. If None, the first n_bins letters of the Latin alphabet are used.

References

[1]	P. Schäfer, and U. Leser, “Fast and Accurate Time Series Classification with WEASEL”. Conference on Information and Knowledge Management, 637-646 (2017).

Examples

>>> from pyts.datasets import load_gunpoint
>>> from pyts.transformation import WEASEL
>>> X_train, _, y_train, _ = load_gunpoint(return_X_y=True)
>>> weasel = WEASEL(sparse=False)
>>> weasel.fit(X_train, y_train)
WEASEL(...)
>>> weasel.transform(X_train)
array(...)

Attributes:	vocabulary_ : dict A mapping of features indices to terms.

Methods

`__init__`([word_size, n_bins, window_sizes, …])	Initialize self.
`fit`(X, y)	Fit the model according to the given training data.
`fit_transform`(X, y)	Fit the data then transform it.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(X)	Transform the provided data.

__init__(word_size=4, n_bins=4, window_sizes=[0.1, 0.3, 0.5, 0.7, 0.9], window_steps=None, anova=True, drop_sum=True, norm_mean=True, norm_std=True, strategy='entropy', chi2_threshold=2, sparse=True, alphabet=None)[source]¶: Initialize self. See help(type(self)) for accurate signature.

fit(X, y)[source]¶

Fit the model according to the given training data.

Parameters:	X : array-like, shape = (n_samples, n_timestamps) Training vector. y : array-like, shape = (n_samples,) Class labels for each data sample.
Returns:	self : object

fit_transform(X, y)[source]¶

Fit the data then transform it.

Parameters:	X : array-like, shape = (n_samples, n_timestamps) Train samples. y : array-like, shape = (n_samples,) Class labels for each data sample.
Returns:	X_new : array, shape (n_samples, n_words) Document-term matrix.

get_metadata_routing()¶

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:	routing : MetadataRequest A `MetadataRequest` encapsulating routing information.

get_params(deep=True)¶

Get parameters for this estimator.

Parameters:	deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:	params : dict Parameter names mapped to their values.

set_params(**params)¶

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:	**params : dict Estimator parameters.
Returns:	self : estimator instance Estimator instance.

transform(X)[source]¶

Transform the provided data.

Parameters:	X : array-like, shape (n_samples, n_timestamps) Test samples.
Returns:	X_new : sparse matrix, shape = (n_samples, n_features) Document-term matrix with relevant features only.

Examples using `pyts.transformation.WEASEL`¶

Word ExtrAction for time SEries cLassification (WEASEL)

`pyts.transformation`.WEASEL¶

Examples using `pyts.transformation.WEASEL`¶

Navigation

Related Topics

pyts.transformation.WEASEL¶

Examples using pyts.transformation.WEASEL¶

`pyts.transformation`.WEASEL¶

Examples using `pyts.transformation.WEASEL`¶