MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification

arXiv:2012.08791 (preprint)

Until recently, the most accurate methods for time series classification were limited by high computational complexity. ROCKET achieves state-of-the-art accuracy with a fraction of the computational expense of most existing methods by transforming input time series using random convolutional kernels, and using the transformed features to train a linear classifier. We reformulate ROCKET into a new method, MINIROCKET, making it up to 75 times faster on larger datasets, and making it almost deterministic (and optionally, with additional computational expense, fully deterministic), while maintaining essentially the same accuracy. Using this method, it is possible to train and test a classifier on all of 109 datasets from the UCR archive to state-of-the-art accuracy in less than 10 minutes. MINIROCKET is significantly faster than any other method of comparable accuracy (including ROCKET), and significantly more accurate than any other method of even roughly-similar computational expense. As such, we suggest that MINIROCKET should now be considered and used as the default variant of ROCKET.

Please cite as:

@article{dempster_etal_2020,
  author  = {Dempster, Angus and Schmidt, Daniel F and Webb, Geoffrey I},
  title   = {{MINIROCKET}: A Very Fast (Almost) Deterministic Transform for Time Series Classification},
  year    = {2020},
  journal = {arXiv:2012.08791}
}

sktime* / Multivariate

MINIROCKET (including a basic multivariate implementation) is also available through sktime. See the examples.

* for larger datasets (10,000+ training examples), integrate the sktime methods with SGD or similar, as in softmax.py: replace the calls to fit(...) and transform(...) from minirocket.py with calls to the corresponding sktime methods

Results

* num_training_examples does not include the validation set of 2,048 training examples, but the transform time for the validation set is included in time_training_seconds

Requirements*

  • Python, NumPy, pandas
  • Numba (0.50+)
  • scikit-learn or similar
  • PyTorch or similar (for larger datasets)

* all pre-packaged with or otherwise available through Anaconda

Code

minirocket.py

minirocket_dv.py (MINIROCKETDV)

softmax.py (PyTorch / 10,000+ Training Examples)

minirocket_multivariate.py (equivalent to sktime/MiniRocketMultivariate)

minirocket_variable.py (variable-length input; experimental)

Important Notes

Compilation

The functions in minirocket.py and minirocket_dv.py are compiled by Numba on import, which may take some time. By default, the compiled functions are now cached, so this should only happen once (i.e., on the first import).

Input Data Type

Input data should be of type np.float32. Alternatively, you can change the Numba signatures to accept, e.g., np.float64.
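
As a minimal sketch of the conversion (the array X and its contents are hypothetical, and only NumPy is assumed), data that was loaded as np.float64 can be cast before it is passed to fit(...) and transform(...):

```python
import numpy as np

# Hypothetical data: 4 time series of length 100.
# NumPy produces float64 by default, as do most CSV loaders.
X = np.random.default_rng(0).normal(size=(4, 100))

# Cast to np.float32 to match the default Numba signatures.
X = X.astype(np.float32)
```

Casting once, immediately after loading, avoids repeating the conversion for every call.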

Normalisation

Unlike ROCKET, MINIROCKET does not require the input time series to be normalised. (However, whether or not it makes sense to normalise the input time series may depend on your particular application.)
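
If normalisation does make sense for your application, one common choice is per-series z-normalisation. The following is only a sketch under that assumption: the helper normalise(...) and its eps parameter are hypothetical and are not part of minirocket.py.

```python
import numpy as np

def normalise(X, eps=1e-8):
    """Z-normalise each time series (row) to zero mean and unit standard deviation."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    # eps guards against division by zero for constant series
    return ((X - mean) / (std + eps)).astype(np.float32)

# Hypothetical data: one varying series and one constant series.
X = np.array([[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]], dtype=np.float32)
X_normalised = normalise(X)
```

A constant series maps to all zeros rather than producing NaN values.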

Examples

MINIROCKET

import numpy as np
from minirocket import fit, transform
from sklearn.linear_model import RidgeClassifierCV

[...] # load data, etc.

# note:
# * input time series do *not* need to be normalised
# * input data should be np.float32

parameters = fit(X_training)

X_training_transform = transform(X_training, parameters)

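# note: the normalize argument was removed in scikit-learn 1.2; with newer
# versions, omit it and standardise the features first (e.g., StandardScaler)
classifier = RidgeClassifierCV(alphas = np.logspace(-3, 3, 10), normalize = True)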
classifier.fit(X_training_transform, Y_training)

X_test_transform = transform(X_test, parameters)

predictions = classifier.predict(X_test_transform)

MINIROCKETDV

import numpy as np
from minirocket_dv import fit_transform
from minirocket import transform
from sklearn.linear_model import RidgeClassifierCV

[...] # load data, etc.

# note:
# * input time series do *not* need to be normalised
# * input data should be np.float32

parameters, X_training_transform = fit_transform(X_training)

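# note: the normalize argument was removed in scikit-learn 1.2; with newer
# versions, omit it and standardise the features first (e.g., StandardScaler)
classifier = RidgeClassifierCV(alphas = np.logspace(-3, 3, 10), normalize = True)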
classifier.fit(X_training_transform, Y_training)

X_test_transform = transform(X_test, parameters)

predictions = classifier.predict(X_test_transform)

PyTorch / 10,000+ Training Examples

from softmax import train, predict

model_etc = train("InsectSound_TRAIN_shuffled.csv", num_classes = 10, training_size = 22952)
# note: 22,952 = 25,000 - 2,048 (validation)

predictions, accuracy = predict("InsectSound_TEST.csv", *model_etc)

Variable-Length Input (Experimental)

import numpy as np
from minirocket_variable import fit, transform, filter_by_length
from sklearn.linear_model import RidgeClassifierCV

[...] # load data, etc.

# note:
# * input time series do *not* need to be normalised
# * input data should be np.float32

# special instructions for variable-length input:
# * concatenate variable-length input time series into a single 1d numpy array
# * provide another 1d array with the lengths of each of the input time series
# * input data should be np.float32 (as above); lengths should be np.int32

# optionally, use a different reference length when setting dilation (default is
# the length of the longest time series), and use fit(...) with time series of
# at least this length, e.g.:
# >>> reference_length = X_training_lengths.mean()
# >>> X_training_1d_filtered, X_training_lengths_filtered = \
# >>> filter_by_length(X_training_1d, X_training_lengths, reference_length)
# >>> parameters = fit(X_training_1d_filtered, X_training_lengths_filtered, reference_length)

parameters = fit(X_training_1d, X_training_lengths)

X_training_transform = transform(X_training_1d, X_training_lengths, parameters)

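# note: the normalize argument was removed in scikit-learn 1.2; with newer
# versions, omit it and standardise the features first (e.g., StandardScaler)
classifier = RidgeClassifierCV(alphas = np.logspace(-3, 3, 10), normalize = True)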
classifier.fit(X_training_transform, Y_training)

X_test_transform = transform(X_test_1d, X_test_lengths, parameters)

predictions = classifier.predict(X_test_transform)
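
The special instructions for variable-length input above (one concatenated 1d array plus an array of lengths) can be sketched as follows. The list named series and its contents are hypothetical; only NumPy is assumed.

```python
import numpy as np

# Hypothetical input: three time series of different lengths.
series = [np.arange(n, dtype=np.float64) for n in (12, 9, 15)]

# Concatenate all series into a single 1d array of np.float32.
X_training_1d = np.concatenate(series).astype(np.float32)

# Record the length of each series, in the same order, as np.int32.
X_training_lengths = np.array([len(s) for s in series], dtype=np.int32)

# The offsets recover individual series from the packed array,
# e.g., the second series (index 1).
offsets = np.concatenate(([0], np.cumsum(X_training_lengths)))
second = X_training_1d[offsets[1]:offsets[2]]
```

The resulting X_training_1d and X_training_lengths have the form expected by fit(...) and transform(...) in the example above.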

Acknowledgements

We thank Professor Eamonn Keogh and all the people who have contributed to the UCR time series classification archive. Figures in our paper showing mean ranks were produced using code from Ismail Fawaz et al. (2019).

🚀 🚀 🚀