pyts.datasets.load_coffee

pyts.datasets.load_coffee(return_X_y=False)

Load and return the Coffee dataset.

Food spectrographs are used in chemometrics to classify food types, a task with obvious applications in food safety and quality assurance. The Coffee dataset is a two-class problem: distinguishing between Robusta and Arabica coffee beans.

Training samples: 28
Test samples: 28
Timestamps: 286
Classes: 2

Parameters:
return_X_y : bool (default = False)

If True, return (data_train, data_test, target_train, target_test) instead of a Bunch object.

Returns:
data : Bunch

Dictionary-like object, with attributes:

data_train : array of floats

The time series in the training set.

data_test : array of floats

The time series in the test set.

target_train : array of integers

The classification labels in the training set.

target_test : array of integers

The classification labels in the test set.

DESCR : str

The full description of the dataset.

url : str

The url of the dataset.

(data_train, data_test, target_train, target_test) : tuple if return_X_y is True

References

[1] R. Briandet, E.K. Kemsley, and R.H. Wilson, “Discrimination of Arabica and Robusta in Instant Coffee by Fourier Transform Infrared Spectroscopy and Chemometrics”. Journal of Agricultural and Food Chemistry (1996).
[2] A. Bagnall, L. Davis, J. Hills, and J. Lines, “Transformation Based Ensembles for Time Series Classification”. SDM (2012).
[3] UCR archive entry for the Coffee dataset

Examples

>>> from pyts.datasets import load_coffee
>>> bunch = load_coffee()
>>> bunch.data_train.shape
(28, 286)
>>> X_train, X_test, y_train, y_test = load_coffee(return_X_y=True)
>>> X_train.shape
(28, 286)