`pyts.bag_of_words`.WordExtractor

class pyts.bag_of_words.WordExtractor(window_size=0.1, window_step=1, numerosity_reduction=True)

Transform discretized time series into sequences of words.

Parameters:

window_size : int or float (default = 0.1): Size of the sliding window (i.e. the size of each word). If float, it represents a fraction of the length of each time series and must be between 0 and 1. The window size is then computed as ceil(window_size * n_timestamps).
window_step : int or float (default = 1): Step of the sliding window. If float, it represents a fraction of the length of each time series and must be between 0 and 1. The window step is then computed as ceil(window_step * n_timestamps).
numerosity_reduction : bool (default = True): If True, for each sample, keep only one occurrence of each run of back-to-back identical words.
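
The way these three parameters interact can be sketched in plain Python. The helper names `resolve_size` and `extract_words` are hypothetical and not part of pyts; the sketch only mirrors the behaviour described above (float values resolved with ceil, a sliding window over each series, optional removal of back-to-back duplicates) and should not be read as the library's implementation.

```python
from math import ceil


def resolve_size(value, n_timestamps):
    """Convert an int-or-float window parameter to a number of timestamps.

    Hypothetical helper: floats are treated as a fraction of the series length.
    """
    if isinstance(value, float):
        return ceil(value * n_timestamps)
    return value


def extract_words(series, window_size=0.1, window_step=1,
                  numerosity_reduction=True):
    """Slide a window over one discretized series and join the words."""
    n_timestamps = len(series)
    size = resolve_size(window_size, n_timestamps)
    step = resolve_size(window_step, n_timestamps)

    # One word per window position.
    words = ["".join(series[start:start + size])
             for start in range(0, n_timestamps - size + 1, step)]

    if numerosity_reduction:
        # Drop a word when it is identical to the word just before it.
        words = [word for i, word in enumerate(words)
                 if i == 0 or word != words[i - 1]]

    return " ".join(words)


series = ['a', 'a', 'b', 'a', 'b', 'b', 'b', 'b', 'a']
print(extract_words(series, window_size=2))
# aa ab ba ab bb ba
print(extract_words(series, window_size=2, numerosity_reduction=False))
# aa ab ba ab bb bb bb ba
```

Note that only consecutive repeats are removed: 'ab' still appears twice in the first output because the two occurrences are separated by 'ba'.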

Examples

>>> from pyts.bag_of_words import WordExtractor
>>> X = [['a', 'a', 'b', 'a', 'b', 'b', 'b', 'b', 'a'],
...      ['a', 'b', 'c', 'c', 'c', 'c', 'a', 'a', 'c']]
>>> word = WordExtractor(window_size=2)
>>> print(word.transform(X))
['aa ab ba ab bb ba' 'ab bc cc ca aa ac']
>>> word = WordExtractor(window_size=2, numerosity_reduction=False)
>>> print(word.transform(X))
['aa ab ba ab bb bb bb ba' 'ab bc cc cc cc ca aa ac']

Methods

`__init__`([window_size, window_step, …])	Initialize self.
`fit`([X, y])	Pass.
`fit_transform`(X[, y])	Fit to data, then transform it.
`get_params`([deep])	Get parameters for this estimator.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(X)	Transform time series into sequences of words.

__init__(window_size=0.1, window_step=1, numerosity_reduction=True): Initialize self. See help(type(self)) for accurate signature.

fit(X=None, y=None)

Pass (no fitting is performed).

Parameters:	X : Ignored. y : Ignored.
Returns:	self : object

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:

X : array-like, shape = (n_samples, n_timestamps): Univariate time series.
y : None or array-like, shape = (n_samples,) (default = None): Target values (None for unsupervised transformations).
**fit_params : dict: Additional fit parameters.

Returns:	X_new : array: Transformed array.

get_params(deep=True)

Get parameters for this estimator.

Parameters:	deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:	params : dict Parameter names mapped to their values.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:	**params : dict Estimator parameters.
Returns:	self : estimator instance Estimator instance.

transform(X)

Transform time series into sequences of words.

Parameters:	X : array-like, shape = (n_samples, n_timestamps)
Returns:	X_new : array, shape = (n_samples,) Transformed data. Each row is a string consisting of words separated by a whitespace.
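
Because each row of the result is a single whitespace-separated string, downstream code may need to split it back into individual words. A minimal sketch using only the standard library, with the first output string from the Examples section copied in by hand (the variable names are illustrative):

```python
from collections import Counter

# First output string from the Examples section (window_size=2).
sentence = "aa ab ba ab bb ba"

# Words are separated by single spaces, so str.split() recovers them.
words = sentence.split()

# Count how often each word occurs in this sample.
counts = Counter(words)

print(words)         # ['aa', 'ab', 'ba', 'ab', 'bb', 'ba']
print(counts["ab"])  # 2
```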

Examples using `pyts.bag_of_words.WordExtractor`

Word Extractor
