A Python implementation of GRAIL, a generic framework to learn compact time series representations.

Overview

GRAIL

A Python implementation of GRAIL, a generic framework to learn compact time series representations.

Requirements

  • Python 3.6+
  • numpy
  • scipy
  • tslearn

Installation

Installation using pip:

pip install grailts

To install from the source:

python setup.py install

Usage

Full Example

Here is an example where we load a UCR dataset and run approximate k-nearest neighbors on its GRAIL representations:

from GRAIL.TimeSeries import TimeSeries
from GRAIL.Representation import GRAIL
from GRAIL.kNN import kNN

TRAIN, train_labels = TimeSeries.load("ECG200_TRAIN", "UCR")
TEST, test_labels = TimeSeries.load("ECG200_TEST", "UCR")

representation = GRAIL(kernel="SINK", d = 100, gamma = 5)
repTRAIN, repTEST = representation.get_rep_train_test(TRAIN, TEST, exact=True)
neighbors, _, _ = kNN(repTRAIN, repTEST, method="ED", k=5, representation=None,
                              pq_method='opq')

print(neighbors)

Loading Datasets

To load UCR type datasets:

TRAIN, train_labels = TimeSeries.load("ECG200_TRAIN", "UCR")
TEST, test_labels = TimeSeries.load("ECG200_TEST", "UCR")

In this package, we assume that each row of the datasets is a time series.

Fetch GRAIL Representations

To fetch exact GRAIL representations of a training and a test dataset:

representation = GRAIL(kernel="SINK", d = 100, gamma = 5)
repTRAIN, repTEST = representation.get_rep_train_test(TRAIN, TEST, exact=True)

Here d specifies the number of landmark series, and gamma specifies the hyperparameter used for the SINK kernel. If gamma is not specified, it will be tuned by the algorithm.

If a single dataset is used instead:

repX = representation.get_representation(X)

Get Approximate k-Nearest-Neighbors

To get the approximate k-Nearest-Neighbors of TEST in TRAIN use:

neighbors, correlations, return_time = kNN(repTRAIN, repTEST, method="ED", k=5, representation=None,
                              pq_method='opq')

Note that Euclidean Distance in the GRAIL representation space estimates the SINK correlation in the original space.

Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.

Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.

Amplo 10 May 15, 2022
Machine Learning from Scratch

Machine Learning from Scratch Author: Shengxuan Wang From: Oregon State University Content: Building Machine Learning model from Scratch, without usin

ShawnWang 0 Jul 05, 2022
Implementation of different ML Algorithms from scratch, written in Python 3.x

Implementation of different ML Algorithms from scratch, written in Python 3.x

Gautam J 393 Nov 29, 2022
A Multipurpose Library for Synthetic Time Series Generation in Python

TimeSynth Multipurpose Library for Synthetic Time Series Please cite as: J. R. Maat, A. Malali, and P. Protopapas, “TimeSynth: A Multipurpose Library

278 Dec 26, 2022
scikit-learn is a python module for machine learning built on top of numpy / scipy

About scikit-learn is a python module for machine learning built on top of numpy / scipy. The purpose of the scikit-learn-tutorial subproject is to le

Gael Varoquaux 122 Dec 12, 2022
A modular active learning framework for Python

Modular Active Learning framework for Python3 Page contents Introduction Active learning from bird's-eye view modAL in action From zero to one in a fe

modAL 1.9k Dec 31, 2022
Confidence intervals for scikit-learn forest algorithms

forest-confidence-interval: Confidence intervals for Forest algorithms Forest algorithms are powerful ensemble methods for classification and regressi

272 Dec 01, 2022
monolish: MONOlithic Liner equation Solvers for Highly-parallel architecture

monolish is a linear equation solver library that monolithically fuses variable data type, matrix structures, matrix data format, vendor specific data transfer APIs, and vendor specific numerical alg

RICOS Co. Ltd. 179 Dec 21, 2022
Provide an input CSV and a target field to predict, generate a model + code to run it.

automl-gs Give an input CSV file and a target field you want to predict to automl-gs, and get a trained high-performing machine learning or deep learn

Max Woolf 1.8k Jan 04, 2023
Python module for performing linear regression for data with measurement errors and intrinsic scatter

Linear regression for data with measurement errors and intrinsic scatter (BCES) Python module for performing robust linear regression on (X,Y) data po

Rodrigo Nemmen 56 Sep 27, 2022
Winning solution for the Galaxy Challenge on Kaggle

Winning solution for the Galaxy Challenge on Kaggle

Sander Dieleman 483 Jan 02, 2023
Pydantic based mock data generation

This library offers powerful mock data generation capabilities for pydantic based models. It can also be used with other libraries that use pydantic as a foundation, for example SQLModel, Beanie and

Na'aman Hirschfeld 396 Dec 28, 2022
Mortality risk prediction for COVID-19 patients using XGBoost models

Mortality risk prediction for COVID-19 patients using XGBoost models Using demographic and lab test data received from the HM Hospitales in Spain, I b

1 Jan 19, 2022
A webpage that utilizes machine learning to extract sentiments from tweets.

Tweets_Classification_Webpage The goal of this project is to be able to predict what rating customers on social media platforms would give to products

Ayaz Nakhuda 1 Dec 30, 2021
Predicting India’s COVID-19 Third Wave with LSTM

Predicting India’s COVID-19 Third Wave with LSTM Complete project of predicting new COVID-19 cases in the next 90 days with LSTM India is seeing a ste

Samrat Dutta 4 Jan 27, 2022
PyHarmonize: Adding harmony lines to recorded melodies in Python

PyHarmonize: Adding harmony lines to recorded melodies in Python About To use this module, the user provides a wav file containing a melody, the key i

Julian Kappler 2 May 20, 2022
A classification model capable of accurately predicting the price of secondhand cars

The purpose of this project is create a classification model capable of accurately predicting the price of secondhand cars. The data used for model building is open source and has been added to this

Akarsh Singh 2 Sep 13, 2022
Feature-engine is a Python library with multiple transformers to engineer and select features for use in machine learning models.

Feature-engine is a Python library with multiple transformers to engineer and select features for use in machine learning models. Feature-engine's transformers follow scikit-learn's functionality wit

Soledad Galli 33 Dec 27, 2022
SynapseML - an open source library to simplify the creation of scalable machine learning pipelines

Synapse Machine Learning SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. Sy

Microsoft 3.9k Dec 30, 2022
Scikit-Garden or skgarden is a garden for Scikit-Learn compatible decision trees and forests.

Scikit-Garden or skgarden (pronounced as skarden) is a garden for Scikit-Learn compatible decision trees and forests.

260 Dec 21, 2022