ETNA is an easy-to-use time series forecasting framework.

Overview

ETNA Time Series Library

Homepage | Documentation | Tutorials | Contribution Guide | Release Notes

ETNA is an easy-to-use time series forecasting framework. It includes built-in toolkits for time series preprocessing and feature generation, a variety of predictive models with a unified interface (from classic machine learning to SOTA neural networks), model combination methods, and smart backtesting. ETNA is designed to make working with time series simple, productive, and fun.

ETNA is the first Python open-source framework from the Tinkoff.ru Artificial Intelligence Center. The library started as an internal product in our company. We now use it in more than 10 projects, so we release updates often. Contributions are welcome: check our Contribution Guide.

Installation

ETNA is on PyPI, so you can use pip to install it.

pip install --upgrade pip
pip install etna-ts

Get started

Here's some example code for a quick start.

import pandas as pd
from etna.datasets.tsdataset import TSDataset
from etna.models import ProphetModel

# Read the data
df = pd.read_csv("example_dataset.csv")
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Create a TSDataset
df = TSDataset.to_dataset(df)
ts = TSDataset(df, freq="1d")

# Choose a horizon
HORIZON = 8

# Fit the model
model = ProphetModel()
model.fit(ts)

# Make the forecast
future_ts = ts.make_future(HORIZON)
forecast_ts = model.forecast(future_ts)

Tutorials

We have also prepared a set of tutorials for an easy introduction:

01. Get started

  • Creating TSDataset and time series plotting
  • Forecast single time series - Simple forecast, Prophet, CatBoost
  • Forecast multiple time series

02. Backtest

  • What is backtest and how it works
  • How to run a validation
  • Validation visualisation

03. EDA

  • Visualization
    • Plot
    • Partial autocorrelation
    • Cross-correlation
    • Distribution
  • Outliers
    • Median method
    • Density method

Documentation

ETNA documentation is available here.

Acknowledgments

ETNA.Team

Alekseev Andrey, Shenshina Julia, Gabdushev Martin, Kolesnikov Sergey, Bunin Dmitriy, Chikov Aleksandr, Barinov Nikita, Romantsov Nikolay, Makhin Artem, Denisov Vladislav, Mitskovets Ivan, Munirova Albina

ETNA.Contributors

Levashov Artem, Podkidyshev Aleksey

License

Feel free to use our library in your commercial and private applications.

ETNA is covered by Apache 2.0. Read more about this license here

Comments
  • Notebook with forecasting strategies

    Before submitting (must do checklist)

    • [x] Did you read the contribution guide?
    • [x] Did you update the docs? We use Numpy format for all the methods and classes.
    • [x] Did you write any new necessary tests?
    • [ ] Did you update the CHANGELOG?

    Proposed Changes

    Closing issues

    #825

    opened by scanhex12 46
  • Update Notebooks with new EDA methods

    Before submitting (must do checklist)

    • [x] Did you read the contribution guide?
    • [ ] Did you update the docs? We use Numpy format for all the methods and classes.
    • [ ] Did you write any new necessary tests?
    • [ ] Did you update the CHANGELOG?

    Proposed Changes

    Closing issues

    closes #711

    opened by DBcreator 12
  • Fix notebooks in inference track

    Before submitting (must do checklist)

    • [x] Did you read the contribution guide?
    • [x] Did you update the docs? We use Numpy format for all the methods and classes.
    • [x] Did you write any new necessary tests?
    • [x] Did you update the CHANGELOG?

    Proposed Changes

    Look #973.

    Closing issues

    Closes #973.

    opened by Mr-Geekman 12
  • Improve sample_acf and sample_pacf plots

    Before submitting (must do checklist)

    • [x] Did you read the contribution guide?
    • [x] Did you update the docs? We use Numpy format for all the methods and classes.
    • [x] Did you write any new necessary tests?
    • [x] Did you update the CHANGELOG?

    Proposed Changes

    Closing issues

    closes #682

    opened by DBcreator 6
  • Classification notebook

    Before submitting (must do checklist)

    • [ ] Did you read the contribution guide?
    • [ ] Did you update the docs? We use Numpy format for all the methods and classes.
    • [ ] Did you write any new necessary tests?
    • [ ] Did you update the CHANGELOG?

    Proposed Changes

    Closing issues

    opened by alex-hse-repository 6
  • Poc: base classes for deep models and rnn and deepstate with examples

    Before submitting (must do checklist)

    • [ ] Did you read the contribution guide?
    • [x] Did you update the docs? We use Numpy format for all the methods and classes.
    • [x] Did you write any new necessary tests?
    • [x] Did you update the CHANGELOG?

    Proposed Changes

    Closing issues

    opened by martins0n 6
  • Enhance `TSDataset` to work with hierarchical series

    Before submitting (must do checklist)

    • [ ] Did you read the contribution guide?
    • [ ] Did you update the docs? We use Numpy format for all the methods and classes.
    • [ ] Did you write any new necessary tests?
    • [ ] Did you update the CHANGELOG?

    Proposed Changes

    Closing issues

    closes #1028

    opened by alex-hse-repository 5
  • Speed up columns slices: `etna.datasets.utils.select_columns`

    🚀 Feature Request

    In a lot of places we use df.loc[:, pd.IndexSlice[segments, column]] to select column from all the segments. It appears to be very slow on a lot of segments.

    We should find places where we use it and make sure that it can be replaced with df.loc[:, pd.IndexSlice[:, column]] without problems.

    There was a problem with the second option: #188. We should investigate whether it still exists and under which conditions:

    1. Is it applicable only when selecting a single column? (SklearnTransform selects many.)
    2. Can it be avoided by some trick when taking slices (sorting columns, for example)?

    Proposal

    1. Find all places with the slow slice df.loc[:, pd.IndexSlice[segments, column]] where column is a scalar. Replace them with a function (you can add it to etna.datasets.utils). Try to replace the slow slice in the function with the fast slice: df.loc[:, pd.IndexSlice[:, column]]. Make sure that in that case columns are not reordered in different pandas versions.
    2. Do the same for a list of values in column (e.g. SklearnTransform) and investigate the reordering issue during testing. We want to avoid it without putting all the segments into the slice.
    3. Benchmark the changed transforms (or other calls) to confirm that they became faster. Add the benchmarking code and its results to the comments of the PR. E.g. you can take a dataframe with 50000 segments, 100 timestamps, 5 additional int columns, 5 additional float columns, and 5 additional category columns.
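
The two slices discussed in the proposal can be compared on a small wide dataframe. This is a minimal sketch using only pandas, not the library's implementation: the helper names select_column_slow and select_column_fast are hypothetical, and the toy frame assumes columns that are already lexsorted with level names "segment" and "feature". It checks only that both slices return the same columns in the same order; it does not measure speed.

```python
import numpy as np
import pandas as pd


def select_column_slow(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Hypothetical helper: select one feature by listing every segment explicitly."""
    segments = df.columns.get_level_values("segment").unique().tolist()
    return df.loc[:, pd.IndexSlice[segments, column]]


def select_column_fast(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Hypothetical helper: select one feature with a full slice over segments."""
    return df.loc[:, pd.IndexSlice[:, column]]


# Toy wide dataframe: 3 segments x 2 features, columns sorted on both levels.
segments = ["segment_0", "segment_1", "segment_2"]
features = ["exog", "target"]
columns = pd.MultiIndex.from_product([segments, features], names=["segment", "feature"])
index = pd.date_range("2020-01-01", periods=4, freq="D")
df = pd.DataFrame(np.arange(24, dtype=float).reshape(4, 6), index=index, columns=columns)

slow = select_column_slow(df, "target")
fast = select_column_fast(df, "target")

# Both slices should agree on values and on column order for this sorted frame.
print(slow.equals(fast))
```

Whether the two stay equivalent on unsorted columns or across pandas versions is exactly what the issue asks to investigate, so a check like this would need to run on those cases too.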

    Test cases

    1. Make sure that current tests pass for scalar case.
    2. Make sure that current tests pass for list case.
    3. Add tests on function for selection of one column.
    4. Add tests on function for selection of multiple columns (in SklearnTransform we had some tests on reordering, it can be useful).

    Additional context

    No response

    enhancement important 
    opened by Mr-Geekman 5
  • Create assemble_pipelines 717

    Before submitting (must do checklist)

    • [ ] Did you read the contribution guide?
    • [ ] Did you update the docs? We use Numpy format for all the methods and classes.
    • [ ] Did you write any new necessary tests?
    • [ ] Did you update the CHANGELOG?

    Proposed Changes

    Closing issues

    closes #717

    opened by scanhex12 5
  • Fix bugs and documentation for `plot_backtest` and `plot_backtest_interactive`

    IMPORTANT: Please do not create a Pull Request without creating an issue first.

    Before submitting (must do checklist)

    • [x] Did you read the contribution guide?
    • [x] Did you update the docs? We use Numpy format for all the methods and classes.
    • [x] Did you write any new necessary tests?
    • [x] Did you update the CHANGELOG?

    Type of Change

    • [ ] Examples / docs / tutorials / contributors update
    • [x] Bug fix (non-breaking change which fixes an issue)
    • [ ] Improvement (non-breaking change which improves an existing feature)
    • [ ] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Proposed Changes

    Look #664.

    Related Issue

    #664.

    Closing issues

    Closes #664.

    bug documentation 
    opened by Mr-Geekman 5
  • add flake8-bugbear

    IMPORTANT: Please do not create a Pull Request without creating an issue first.

    Before submitting (must do checklist)

    • [x] Did you read the contribution guide?
    • [ ] Did you update the docs? We use Numpy format for all the methods and classes.
    • [ ] Did you write any new necessary tests?
    • [ ] Did you update the CHANGELOG?

    Type of Change

    • [x] Examples / docs / tutorials / contributors update
    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] Improvement (non-breaking change which improves an existing feature)
    • [ ] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Proposed Changes

    Related Issue

    Closing issues

    opened by iKintosh 5
  • Create `BottomUpReconciliator`

    Before submitting (must do checklist)

    • [ ] Did you read the contribution guide?
    • [ ] Did you update the docs? We use Numpy format for all the methods and classes.
    • [ ] Did you write any new necessary tests?
    • [ ] Did you update the CHANGELOG?

    Proposed Changes

    Closing issues

    closes #1037

    opened by brsnw250 2
  • Create example notebook about hierarchical pipeline

    🚀 Feature Request

    Create example notebook explaining how to work with time series in etna

    Proposal

    • The notebook should explain the following:
    1. What hierarchical time series are
    2. How to store hierarchical time series in etna (hierarchical long format) and how to convert them to the etna wide format with to_hierarchical_dataset
    3. How HierarchicalStructure works and how it can be created
    4. How to create a TSDataset with a hierarchical structure and how exog data works in the case of a hierarchical dataset
    5. What methods exist to forecast hierarchical time series, which of them we have in the library, and how to use them
    6. Compare HierarchicalPipeline and Pipeline for the top-down and bottom-up cases

    Test cases

    No response

    Additional context

    No response

    enhancement notebook 
    opened by alex-hse-repository 0
  • Create `generate_hierarchical_df` method

    🚀 Feature Request

    Create method to generate random hierarchical dataset

    Proposal

    1. In etna/datasets/datasets_generation.py create method:
    def generate_hierarchical_df(periods: int, n_segments: List[int], freq: str = "D", start_time: str = "2000-01-01", ar_coef: Optional[list] = None, sigma: float = 1, random_seed: int = 1) -> pd.DataFrame

    Parameters:

    • n_segments -- number of segments on each level
    • Other parameters are the same as in generate_ar_df

    Description:

    • Validate n_segments: the number of segments on each level should be lower than on the next level
    • Generate segments on the last level using generate_ar_df
    • Generate a random tree with the configuration of nodes on levels from n_segments
    • In the dataframe, replace the segment column with columns describing the structure of the tree (one column for each level)
    • Node names in the levels should be generated as "level_<level_id>_<segment_id>" -- you can come up with better ideas for naming
    • On the bottom level, leave the default segment names from generate_ar_df
    2. Add an example of creating a dataset with hierarchical structure using generate_hierarchical_df to the docs of TSDataset here
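
The description above could be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: the function name generate_hierarchical_df_sketch and the level column names "level_<i>" are hypothetical, the bottom-level names "segment_<i>" are assumed to match the defaults of generate_ar_df, and a Gaussian random walk stands in for generate_ar_df so that the example needs only numpy and pandas.

```python
from typing import List

import numpy as np
import pandas as pd


def generate_hierarchical_df_sketch(
    periods: int,
    n_segments: List[int],
    freq: str = "D",
    start_time: str = "2000-01-01",
    sigma: float = 1.0,
    random_seed: int = 1,
) -> pd.DataFrame:
    """Hypothetical sketch: long dataframe with one column per hierarchy level."""
    # Validate: every level must have fewer segments than the next one.
    if any(upper >= lower for upper, lower in zip(n_segments, n_segments[1:])):
        raise ValueError("Each level should have fewer segments than the next level")

    rng = np.random.default_rng(random_seed)
    n_levels = len(n_segments)
    n_bottom = n_segments[-1]

    # Node names: "level_<level_id>_<segment_id>" above the bottom level,
    # "segment_<id>" on the bottom level (assumed generate_ar_df default).
    names = [[f"level_{lvl}_{i}" for i in range(n)] for lvl, n in enumerate(n_segments[:-1])]
    names.append([f"segment_{i}" for i in range(n_bottom)])

    # ancestors[lvl][j] is the index of the level-lvl node above bottom segment j.
    ancestors = [np.arange(n_bottom)] * n_levels
    for lvl in range(n_levels - 2, -1, -1):
        n_parents, n_children = n_segments[lvl], n_segments[lvl + 1]
        # Give every parent at least one child, then assign the rest at random.
        parents = np.concatenate(
            [np.arange(n_parents), rng.integers(0, n_parents, n_children - n_parents)]
        )
        rng.shuffle(parents)
        ancestors[lvl] = parents[ancestors[lvl + 1]]

    timestamps = pd.date_range(start_time, periods=periods, freq=freq)
    frames = []
    for j in range(n_bottom):
        frame = pd.DataFrame({"timestamp": timestamps})
        for lvl in range(n_levels):
            frame[f"level_{lvl}"] = names[lvl][ancestors[lvl][j]]
        # Stand-in for generate_ar_df: a Gaussian random walk.
        frame["target"] = rng.normal(scale=sigma, size=periods).cumsum()
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


df = generate_hierarchical_df_sketch(periods=5, n_segments=[1, 2, 4])
print(df.head())
```

Seeding each parent list with one copy of every parent index is one simple way to guarantee that no node on an upper level is left without children; other tree generators would satisfy the description equally well.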

    Test cases

    1. The method generates a dataframe with correct properties:
    • number of segments
    • number of periods
    • columns (timestamp, level columns, target)
    • level columns contain correct values (for some corner cases where randomness has no influence, like n_segments=[1, 2])
    2. Check that we can convert this dataframe to the wide format using to_hierarchical_dataset

    Additional context

    No response

    enhancement 
    opened by alex-hse-repository 0
  • [BUG] AttributeError: 'NaiveModel'

    🐛 Bug Report

    While going through the getting-started guide, I get the error AttributeError: 'NaiveModel' object has no attribute 'context_size'

    Expected behavior

    future_ts = train_ts.make_future(future_steps=HORIZON, tail_steps=model.context_size)

    How can I fix this error?

    How To Reproduce

    HORIZON = 8
    from etna.models import NaiveModel

    # Fit the model
    model = NaiveModel(lag=12)
    model.fit(train_ts)

    # Make the forecast
    future_ts = train_ts.make_future(future_steps=HORIZON, tail_steps=model.context_size)
    forecast_ts = model.forecast(future_ts, prediction_size=HORIZON)

    Environment

    python 3.9
    etna: 1.13.0

    Additional context

    No response

    Checklist

    • [x] Bug appears at the latest library version
    bug 
    opened by vukeep 0
  • Create `BottomUpReconciler`

    🚀 Feature Request

    Create reconciler implementing Bottom-Up approach

    Proposal

    1. Create class BottomUpReconciler(BaseReconciler)
      • Check that source_level is lower than target_level
    2. Implement method fit:
      • Receive a dataset on a level that is lower than or equal to both target_level and source_level
      • Aggregate the dataset to the source_level
      • Set the mapping matrix as the summing matrix from the source to the target level
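
To make the last step concrete, here is a minimal sketch of a summing matrix for two adjacent levels. It uses numpy only; the helper build_summing_matrix and the node names are hypothetical and are not part of the library. Each row corresponds to a target-level node and each column to a source-level node, with a 1 where the source node is a child of the target node.

```python
from typing import Dict, List

import numpy as np


def build_summing_matrix(
    parent_of: Dict[str, str], source_nodes: List[str], target_nodes: List[str]
) -> np.ndarray:
    """Hypothetical helper: matrix S such that target_values = S @ source_values."""
    row_of = {node: i for i, node in enumerate(target_nodes)}
    matrix = np.zeros((len(target_nodes), len(source_nodes)))
    for col, node in enumerate(source_nodes):
        matrix[row_of[parent_of[node]], col] = 1.0
    return matrix


# Toy hierarchy: X = a + b, Y = c.
source_nodes = ["a", "b", "c"]
target_nodes = ["X", "Y"]
parent_of = {"a": "X", "b": "X", "c": "Y"}

summing_matrix = build_summing_matrix(parent_of, source_nodes, target_nodes)
source_values = np.array([1.0, 2.0, 3.0])
target_values = summing_matrix @ source_values
print(target_values)  # [3. 3.]
```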

    Test cases

    • Test constructor works correctly with different source and target levels
    • Test method fit saves the correct matrix in mapping_matrix for different source/target levels
      • source=target
      • source<target
      • source>target

    Additional context

    No response

    enhancement 
    opened by alex-hse-repository 0
  • Create `HierarchicalPipeline`

    🚀 Feature Request

    Create pipeline to process hierarchical time series

    Proposal

    Create class:

    class HierarchicalPipeline(Pipeline):
        def __init__(
            self,
            reconciler: BaseReconciler,
            model: ModelType,
            transforms: Sequence[Transform] = (),
            horizon: int = 1,
        ):

    1. Implement method fit:
    • Fit the reconciler using the reconciler.fit method
    • Aggregate the dataset on the source_level using reconciler.aggregate
    • Call the fit method of the super class with the generated dataset
    2. Implement method raw_forecast():
    • Call the forecast method of the super class
    3. Implement method forecast():
    • Call raw_forecast
    • Generate the target dataset using reconciler.reconcile
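
The control flow of the three methods can be illustrated with toy stand-ins. Everything here is an assumption made for the sketch: ToyPipeline, ToyReconciler and ToyHierarchicalPipeline are hypothetical classes working on plain numpy arrays of shape (timestamps, segments), not the etna Pipeline, BaseReconciler or TSDataset, and the toy pipeline simply repeats the last observed value.

```python
import numpy as np


class ToyPipeline:
    """Stand-in for Pipeline: forecasts by repeating the last observed row."""

    def __init__(self, horizon: int = 1):
        self.horizon = horizon
        self.ts = None

    def fit(self, ts: np.ndarray) -> "ToyPipeline":
        self.ts = ts
        return self

    def forecast(self) -> np.ndarray:
        return np.repeat(self.ts[-1:], self.horizon, axis=0)


class ToyReconciler:
    """Stand-in reconciler that holds a fixed source-to-target summing matrix."""

    def __init__(self, mapping_matrix):
        self.mapping_matrix = np.asarray(mapping_matrix, dtype=float)

    def fit(self, ts: np.ndarray) -> "ToyReconciler":
        return self

    def aggregate(self, ts: np.ndarray) -> np.ndarray:
        # The toy data is assumed to be on the source level already.
        return ts

    def reconcile(self, ts: np.ndarray) -> np.ndarray:
        return ts @ self.mapping_matrix.T


class ToyHierarchicalPipeline(ToyPipeline):
    def __init__(self, reconciler: ToyReconciler, horizon: int = 1):
        super().__init__(horizon=horizon)
        self.reconciler = reconciler

    def fit(self, ts: np.ndarray) -> "ToyHierarchicalPipeline":
        self.reconciler.fit(ts)
        super().fit(self.reconciler.aggregate(ts))
        return self

    def raw_forecast(self) -> np.ndarray:
        # Forecast on the source level.
        return super().forecast()

    def forecast(self) -> np.ndarray:
        # Forecast on the target level.
        return self.reconciler.reconcile(self.raw_forecast())


source_ts = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
pipeline = ToyHierarchicalPipeline(ToyReconciler([[1, 1, 0], [0, 0, 1]]), horizon=2)
pipeline.fit(source_ts)
raw = pipeline.raw_forecast()
reconciled = pipeline.forecast()
print(reconciled)
```

Keeping raw_forecast separate from forecast mirrors the test cases below: one can be checked on the source level and the other on the target level.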

    Test cases

    1. Test that after fit pipeline saves correct ts on the source_level of reconciler
    2. Test that raw_forecast generates forecast on the source_level of reconciler
    3. Test that forecast generates forecast on the target_level of reconciler
    4. Test that backtest works and produce correct metrics(you can use constant dataset for example)

    All the tests should cover both top-down and bottom-up reconcilers with correct source and target levels

    Additional context

    blocked by #1037 #1038 #1044

    enhancement 
    opened by alex-hse-repository 0
Releases(1.14.0)
  • 1.14.0(Dec 16, 2022)

    Highlights:

    • Add python 3.10 support (#1005)
    • Add experimental module with TimeSeriesBinaryClassifier and PredictabilityAnalyzer (#985), see example notebook for the details (#997)
    • Inference track results: add predict method to pipelines, teach some models to work with context, change hierarchy of base models, update notebook examples (#979)

    Full changelog:

    Added

    • Add python 3.10 support (#1005)
    • Add SumTranform(#1021)
    • Add plot_change_points_interactive (#988)
    • Add experimental module with TimeSeriesBinaryClassifier and PredictabilityAnalyzer (#985)
    • Inference track results: add predict method to pipelines, teach some models to work with context, change hierarchy of base models, update notebook examples (#979)
    • Add get_ruptures_regularization into experimental module (#1001)
    • Add example classification notebook for experimental classification feature (#997)

    Changed

    • Change returned model in get_model of BATSModel, TBATSModel (#987)
    • Add acf_plot, deprecate sample_acf_plot, sample_pacf_plot (#1004)
    • Change returned model in get_model of HoltWintersModel, HoltModel, SimpleExpSmoothingModel (#986)

    Fixed

    • Fix MinMaxDifferenceTransform import (#1030)
    • Fix release docs and docker images cron job (#982)
    • Fix forecast first point with CatBoostPerSegmentModel (#1010)
    • Fix hanging EDA notebook (#1027)
    • Fix hanging EDA notebook v2 + cache clean script (#1034)
  • 1.13.0(Oct 10, 2022)

    Highlights:

    • etna.auto module for pipeline greedy search with default pipelines pool
    • wandb sweeps and optuna examples

    Full changelog:

    Added

    • Add greater_is_better property for Metric (#921)
    • etna.auto for greedy search, etna.auto.pool with default pipelines, etna.auto.optuna wrapper for optuna (#895)
    • Add MinMaxDifferenceTransform (#955)
    • Add wandb sweeps and optuna examples (#338)

    Changed

    • Make slicing faster in TSDataset._merge_exog, FilterFeaturesTransform, AddConstTransform, LambdaTransform, LagTransform, LogTransform, SklearnTransform, WindowStatisticsTransform; make CICD test different pandas versions (#900)
    • Mark some tests as long (#929)
    • Fix to_dict with nn models and add unsafe conversion for callbacks (#949)

    Fixed

    • Fix to_dict with function as parameter (#941)
    • Fix native networks to work with generated future equals to horizon (#936)
    • Fix SARIMAXModel to work with exogenous data on pmdarima>=2.0 (#940)
    • Teach catboost to work with encoders (#957)
  • 1.12.0(Sep 5, 2022)

    Highlights:

    • ETNA native MLPModel
    • to_dict method in all the etna objects
    • DirectEnsemble implementing the direct forecasting strategy
    • Notebook about forecasting strategies

    Full changelog:

    Added

    • Function to transform etna objects to dict(#818)
    • MLPModel(#860)
    • DeadlineMovingAverageModel (#827)
    • DirectEnsemble (#824)
    • CICD: untagged docker image cleaner (#856)
    • Notebook about forecasting strategies (#864)
    • Add ChangePointSegmentationTransform, RupturesChangePointsModel (#821)

    Changed

    • Teach AutoARIMAModel to work with out-sample predictions (#830)
    • Make TSDataset.to_flatten faster for big datasets (#848)

    Fixed

    • Type hints for external users by PEP 561 (#868)
    • Type hints for Pipeline.model match models.nn(#768)
    • Fix behavior of SARIMAXModel if simple_differencing=True is set (#837)
    • Bug python3.7 and TypedDict import (#867)
    • Fix deprecated pytorch lightning trainer flags (#866)
    • ProphetModel doesn't work with cap and floor regressors (#842)
    • Fix problem with encoding category types in OHE (#843)
    • Change Docker cuda image version from 11.1 to 11.6.2 (#838)
    • Optimize time complexity of determine_num_steps(#864)
    • All warning as errors(#880)
    • Update .gitignore with .DS_Store and checkpoints (#883)
    • Delete ROADMAP.md (#904)
    • Fix ci invalid cache (#896)
  • 1.11.1(Aug 3, 2022)

  • 1.11.0(Jul 25, 2022)

    Highlights:

    • ETNA native RNN and base classes for deep learning models
    • Lambda transform
    • Prophet 1.1 support without c++ compiler dependency
    • Prediction intervals for DeepAR and TFTModel
    • Add known_future parameter to CLI

    Full changelog:

    Added

    • LSTM based RNN and native deep models base classes (#776)
    • Lambda transform (#762)
    • assemble pipelines (#774)
    • Tests on in-sample, out-sample predictions with gap for all models (#785)

    Changed

    • Add columns and mode parameters in plot_correlation_matrix (#726)
    • Add CatBoostPerSegmentModel and CatBoostMultiSegmentModel classes, deprecate CatBoostModelPerSegment and CatBoostModelMultiSegment (#779)
    • Allow Prophet update to 1.1 (#799)
    • Make LagTransform, LogTransform, AddConstTransform vectorized (#756)
    • Improve the behavior of plot_feature_relevance visualizing p-values (#795)
    • Update poetry.core version (#780)
    • Make native prediction intervals for DeepAR (#761)
    • Make native prediction intervals for TFTModel (#770)
    • Test cases for testing inference of models (#794)
    • Wandb.log to WandbLogger (#816)

    Fixed

    • Fix missing prophet in docker images (#767)
    • Add known_future parameter to CLI (#758)
    • FutureWarning: The frame.append method is deprecated. Use pandas.concat instead (#764)
    • Correct ordering if multi-index in backtest (#771)
    • Raise errors in models.nn if they can't make in-sample and some cases out-sample predictions (#813)
    • Teach BATS/TBATS to work with in-sample, out-sample predictions correctly (#806)
    • Github actions cache issue with poetry update (#778)
  • 1.10.0(Jun 15, 2022)

    Highlights:

    • BATS, TBATS and AutoArima models
    • Fix of empirical prediction intervals

    Full changelog:

    Added

    • Add Sign metric (#730)
    • Add AutoARIMA model (#679)
    • Add parameters start, end to some eda methods (#665)
    • Add BATS and TBATS model adapters (#678)
    • Jupyter extension for black (#742)

    Changed

    • Change color of lines in plot_anomalies and plot_clusters, add grid to all plots, make trend line thicker in plot_trend (#705)
    • Change format of holidays for holiday_plot (#708)
    • Make feature selection transforms return columns in inverse_transform(#688)
    • Add xticks parameter for plot_periodogram, clip frequencies to be >= 1 (#706)
    • Make TSDataset method to_dataset work with copy of the passed dataframe (#741)

    Fixed

    • Fix bug when ts.plot does not save figure (#714)
    • Fix bug in plot_clusters (#675)
    • Fix bugs and documentation for cross_corr_plot (#691)
    • Fix bugs and documentation for plot_backtest and plot_backtest_interactive (#700)
    • Make STLTransform to work with NaNs at the beginning (#736)
    • Fix tiny prediction intervals (#722)
    • Fix deepcopy issue for fitted deepmodel (#735)
    • Fix making backtest if all segments start with NaNs (#728)
    • Fix logging issues with backtest when using empirical prediction intervals (#747)
  • 1.9.0(May 17, 2022)

    Added

    • Add plot_metric_per_segment (#658)
    • Add metric_per_segment_distribution_plot (#666)

    Changed

    • Remove parameter normalize in linear models (#686)

    Fixed

    • Add missed forecast_params in forecast CLI method (#671)
    • Add _per_segment_average method to the Metric class (#684)
    • Fix get_statistics_relevance_table working with NaNs and categoricals (#672)
    • Fix bugs and documentation for stl_plot (#685)
    • Fix cuda docker images (#694)
  • 1.8.0(Apr 28, 2022)

    Added

    • Width and Coverage metrics for prediction intervals (#638)
    • Masked backtest (#613)
    • Add seasonal_plot (#628)
    • Add plot_periodogram (#606)
    • Add support of quantiles in backtest (#652)
    • Add prediction_actual_scatter_plot (#610)
    • Add plot_holidays (#624)
    • Add instruction about documentation formatting to contribution guide (#648)
    • Seasonal strategy in TimeSeriesImputerTransform (#639)

    Changed

    • Add logging to Metric.__call__ (#643)
    • Add in_column to plot_anomalies, plot_anomalies_interactive (#618)
    • Add logging to TSDataset.inverse_transform (#642)

    Fixed

    • Passing non default params for default models STLTransform (#641)
    • Fixed bug in SARIMAX model with horizon=1 (#637)
    • Fixed bug in models get_model method (#623)
    • Fixed unsafe comparison in plots (#611)
    • Fixed plot_trend does not work with Linear and TheilSen transforms (#617)
    • Improve computation time for rolling window statistics (#625)
    • Don't fill first timestamps in TimeSeriesImputerTransform (#634)
    • Fix documentation formatting (#636)
    • Fix bug with exog features in AutoRegressivePipeline (#647)
    • Fix missed dependencies (#656)
    • Fix custom_transform_and_model notebook (#651)
    • Fix MyBinder bug with dependencies (#650)
  • 1.7.0(Mar 16, 2022)

    Highlights:

    • New plots (a lot!): imputation, trend, change points, residuals, qq-plot, feature relevance, stl.
    • New regressors logic in TSDatasets, Transforms and Models
    • Added jupyter notebook with regressors example
    • Prediction intervals visualization in plot_forecast
    • Detrending could be polynomial
    • Added installation instruction for M1
    • Fixed TSDataset when plot method does not plot all required segments
    • VotingEnsemble allows setting weights of estimators as weights of pipelines

    Full changelog:

    Added

    • Regressors logic to TSDatasets init (https://github.com/tinkoff-ai/etna/pull/357)
    • FutureMixin into some transforms (https://github.com/tinkoff-ai/etna/pull/361)
    • Regressors updating in TSDataset transform loops (https://github.com/tinkoff-ai/etna/pull/374)
    • Regressors handling in TSDataset make_future and train_test_split (https://github.com/tinkoff-ai/etna/pull/447)
    • Prediction intervals visualization in plot_forecast (https://github.com/tinkoff-ai/etna/pull/538)
    • Add plot_imputation (https://github.com/tinkoff-ai/etna/pull/598)
    • Add plot_time_series_with_change_points function (https://github.com/tinkoff-ai/etna/pull/534)
    • Add plot_trend (https://github.com/tinkoff-ai/etna/pull/565)
    • Add find_change_points function (https://github.com/tinkoff-ai/etna/pull/521)
    • Add option day_number_in_year to DateFlagsTransform (https://github.com/tinkoff-ai/etna/pull/552)
    • Add plot_residuals (https://github.com/tinkoff-ai/etna/pull/539)
    • Add get_residuals (https://github.com/tinkoff-ai/etna/pull/597)
    • Create PerSegmentBaseModel, PerSegmentPredictionIntervalModel (https://github.com/tinkoff-ai/etna/pull/537)
    • Create MultiSegmentModel (https://github.com/tinkoff-ai/etna/pull/551)
    • Add qq_plot (https://github.com/tinkoff-ai/etna/pull/604)
    • Add regressors example notebook (https://github.com/tinkoff-ai/etna/pull/577)
    • Create EnsembleMixin (https://github.com/tinkoff-ai/etna/pull/574)
    • Add option season_number to DateFlagsTransform (https://github.com/tinkoff-ai/etna/pull/567)
    • Create BasePipeline, add prediction intervals to all the pipelines, move parameter n_fold to forecast (https://github.com/tinkoff-ai/etna/pull/578)
    • Add stl_plot (https://github.com/tinkoff-ai/etna/pull/575)
    • Add plot_features_relevance (https://github.com/tinkoff-ai/etna/pull/579)
    • Add community section to README.md (https://github.com/tinkoff-ai/etna/pull/580)
    • Create AbstractPipeline (https://github.com/tinkoff-ai/etna/pull/573)
    • Option "auto" for the weights parameter of VotingEnsemble, which enables using feature importance as weights of base estimators (https://github.com/tinkoff-ai/etna/pull/587)

    Changed

    • Change the way ProphetModel works with regressors (https://github.com/tinkoff-ai/etna/pull/383)
    • Change the way SARIMAXModel works with regressors (https://github.com/tinkoff-ai/etna/pull/380)
    • Change the way Sklearn models work with regressors (https://github.com/tinkoff-ai/etna/pull/440)
    • Change the way FeatureSelectionTransform works with regressors, rename variables replacing the "regressor" to "feature" (https://github.com/tinkoff-ai/etna/pull/522)
    • Add table option to ConsoleLogger (https://github.com/tinkoff-ai/etna/pull/544)
    • Installation instruction (https://github.com/tinkoff-ai/etna/pull/526)
    • Update plot_forecast for multi-forecast mode (https://github.com/tinkoff-ai/etna/pull/584)
    • Trainer kwargs for deep models (https://github.com/tinkoff-ai/etna/pull/540)
    • Update CONTRIBUTING.md (https://github.com/tinkoff-ai/etna/pull/536)
    • Rename _CatBoostModel, _HoltWintersModel, _SklearnModel (https://github.com/tinkoff-ai/etna/pull/543)
    • Add logging to TSDataset.make_future, log repr of transform instead of class name (https://github.com/tinkoff-ai/etna/pull/555)
    • Rename _SARIMAXModel and _ProphetModel, make SARIMAXModel and ProphetModel inherit from PerSegmentPredictionIntervalModel (https://github.com/tinkoff-ai/etna/pull/549)
    • Update get_started section in README (https://github.com/tinkoff-ai/etna/pull/569)
    • Make detrending polynomial (https://github.com/tinkoff-ai/etna/pull/566)
    • Update documentation about transforms that generate regressors, update examples with them (https://github.com/tinkoff-ai/etna/pull/572)
    • Fix that segment is string (https://github.com/tinkoff-ai/etna/pull/602)
    • Make LabelEncoderTransform and OneHotEncoderTransform multi-segment (https://github.com/tinkoff-ai/etna/pull/554)

    Fixed

    • Fix TSDataset._update_regressors logic removing the regressors (https://github.com/tinkoff-ai/etna/pull/489)
    • Fix TSDataset.info, TSDataset.describe methods (https://github.com/tinkoff-ai/etna/pull/519)
    • Fix regressors handling for OneHotEncoderTransform and HolidayTransform (https://github.com/tinkoff-ai/etna/pull/518)
    • Fix wandb summary issue with custom plots (https://github.com/tinkoff-ai/etna/pull/535)
    • Small notebook fixes (https://github.com/tinkoff-ai/etna/pull/595)
    • Fix import Literal in plotters (https://github.com/tinkoff-ai/etna/pull/558)
    • Fix plot method bug when plot method does not plot all required segments (https://github.com/tinkoff-ai/etna/pull/596)
    • Fix dependencies for ARM (https://github.com/tinkoff-ai/etna/pull/599)
    • [BUG] nn models make forecast without inverse_transform (https://github.com/tinkoff-ai/etna/pull/541)
  • 1.6.3(Feb 14, 2022)

    Highlights:

    • Fix for version incompatibility of scipy and statsmodels

    Full changelog:

    Fixed

    • Fixed adding unnecessary lag=1 in statistics (#523)
    • Fixed wrong MeanTransform behaviour when using alpha parameter (#523)
    • Fix processing add_noise=True parameter in datasets generation (#520)
    • Fix scipy version (#525)
  • 1.6.2(Feb 9, 2022)

  • 1.6.1(Feb 3, 2022)

    Full changelog:

    Added

    • Allow choosing start and end in TSDataset.plot method (#488)

    Changed

    • Make TSDataset.to_flatten faster (#475)
    • Allow logger percentile metric aggregation to work with NaNs (#483)

    Fixed

    • Can't make forecasting with pipelines, data with nans, and Imputers (#473)
  • 1.6.0(Jan 28, 2022)

    Highlights:

    • New transforms for feature engineering: DifferencingTransform, OneHotEncoderTransform, LabelEncoderTransform, MADTransform.
    • New transform for feature selection: MRMRFeatureSelectionTransform.
    • Warnings in docstrings about possible look-ahead bias when using some transforms.
    • Version update of sklearn and pytorch-forecasting, with minor changes to the PytorchForecastingTransform API.
    • Fixes for SARIMAX non-default parameters.
    • TSDataset.describe method for high-level information about the provided time series: % of missing values, number of segments, first and last dates, etc.
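
To illustrate the idea behind a differencing transform such as DifferencingTransform, here is a minimal plain-Python sketch. It is not ETNA's implementation: the function names `difference` and `inverse_difference` and the `period` argument are assumptions made for this example only.

```python
from typing import List, Optional


def difference(values: List[float], period: int = 1) -> List[Optional[float]]:
    """Return values[t] - values[t - period]; the first `period` entries are None."""
    if period < 1:
        raise ValueError("period must be a positive integer")
    head: List[Optional[float]] = [None] * min(period, len(values))
    tail = [values[t] - values[t - period] for t in range(period, len(values))]
    return head + tail


def inverse_difference(
    diffs: List[Optional[float]], initial: List[float], period: int = 1
) -> List[float]:
    """Rebuild the series from its differences and its first `period` original values."""
    if len(initial) != period:
        raise ValueError("initial must contain exactly `period` values")
    restored = list(initial)
    for t in range(period, len(diffs)):
        # Each restored value is the value one period back plus the stored difference.
        restored.append(restored[t - period] + diffs[t])
    return restored


series = [10.0, 12.0, 15.0, 19.0, 24.0]
diffs = difference(series, period=1)
restored = inverse_difference(diffs, initial=series[:1], period=1)
```

A model trained on `diffs` predicts changes rather than levels, so its forecasts have to go through the inverse step before they can be compared with the original series.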

    Full changelog:

    Added

    • Method TSDataset.info (#409)
    • DifferencingTransform (#414)
    • OneHotEncoderTransform and LabelEncoderTransform (#431)
    • MADTransform (#441)
    • MRMRFeatureSelectionTransform (#439)
    • Possibility to change metric representation in backtest using Metric.name (#454)
    • Warning section in documentation about look-ahead bias (#464)
    • Parameter figsize to all the plotters (#465)

    Changed

    • Change method TSDataset.describe (#409)
    • Group Transforms according to their impact (#420)
    • Change the way LagTransform, DateFlagsTransform and TimeFlagsTransform generate column names (#421)
    • Clarify the behaviour of TimeSeriesImputerTransform in case of all NaN values (#427)
    • Fixed bug in title in sample_acf_plot method (#432)
    • Pytorch-forecasting and sklearn version update + some pytorch transform API changes (#445)

    Fixed

    • Add relevance_params in GaleShapleyFeatureSelectionTransform (#410)
    • Docs for statistics transforms (#441)
    • Handling NaNs in trend transforms (#456)
    • Logger fails with StackingEnsemble (#460)
    • SARIMAX parameters fix (#459)
    • [BUG] Check pytorch-forecasting models with freq > "1D" (#463)
  • 1.5.0(Dec 24, 2021)

    Highlights:

    • We extend our family of loggers by adding S3FileLogger and LocalFileLogger. They partially duplicate the behaviour of WandbLogger: you can run multiple experiments (for example via Optuna, HyperOpt or a custom loop) with different hyperparameters and transformers, save the results locally or on S3, and analyze them afterwards.
    • HolidayTransform based on the holidays library.
    • Bug fixes for prediction intervals: they now change after inverse_transform in the same way as the target.
    • We changed the behaviour of fit_transform:
      • previously an error was raised if some time series ended with NaN values
      • now the check is made only before the forecasting phase, so you can fill NaNs with TimeSeriesImputerTransform and make predictions without errors.
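
The order of operations described above (fill trailing NaNs first, check the series ending only before forecasting) can be sketched in plain Python. This is an illustration of the idea, not ETNA code: `ends_with_nan` and `forward_fill` are hypothetical helpers, and forward fill is only one possible imputation strategy.

```python
import math
from typing import List


def ends_with_nan(values: List[float]) -> bool:
    """Return True if the series is non-empty and its last value is NaN."""
    return bool(values) and math.isnan(values[-1])


def forward_fill(values: List[float]) -> List[float]:
    """Replace each NaN with the most recent non-NaN value; leading NaNs stay NaN."""
    filled = []
    last_seen = math.nan
    for value in values:
        if not math.isnan(value):
            last_seen = value
        filled.append(last_seen)
    return filled


raw = [1.0, 2.0, math.nan, 4.0, math.nan, math.nan]

# The raw series ends with NaN, so a check at this point would fail.
raw_ends_with_nan = ends_with_nan(raw)

# After imputation the ending is valid, so a check made right before
# forecasting passes.
filled = forward_fill(raw)
filled_ends_with_nan = ends_with_nan(filled)
```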

    N.B.

    Special thanks to @Gewissta for his videos about time series analysis with the ETNA library

    Full changelog:

    Added

    • Holiday Transform (#359)
    • S3FileLogger and LocalFileLogger (#372)
    • Parameter changepoint_prior_scale to ProphetModel (#408)

    Changed

    • Set strict_optional = True for mypy (#381)
    • Move checking the series endings to make_future step (#413)

    Fixed

    • Sarimax bug in future prediction with quantiles (#391)
    • Catboost version too high (#394)
    • Add sorting of classes in left bar in docs (#397)
    • nn notebook in docs (#396)
    • SklearnTransform column name generation (#398)
    • Inverse transform doesn't affect quantiles (#395)
  • 1.4.2(Dec 9, 2021)

  • 1.4.1(Dec 9, 2021)

    • Made Model, PerSegmentModel, PerSegmentWrapper imports more convenient
    • Docs now have all neural networks models
    • Speed up _check_regressors and _merge_exog
  • 1.4.0(Dec 3, 2021)

    Hi! In this release we have focused on speed and bug fixes.

    Added

    • ACF plot

    Changed

    • Add ts.inverse_transform as final step at Pipeline.fit method
    • Make test_ts optional in plot_forecast
    • Speed up inference for multisegment regression models
    • Speed up Pipeline._get_backtest_forecasts
    • Speed up SegmentEncoderTransform
    • Wandb Logger does not work unless pytorch is installed

    Fixed

    • Get rid of lambda in DensityOutliersTransform and get_anomalies_density
    • Fixed import in transforms
    • Pickle DTWClustering

    Removed

    • Remove TimeSeriesCrossValidation
  • 1.3.3(Nov 24, 2021)

    Added:

    • RelevanceTable can return rank
    • GaleShapleyFeatureSelectionTransform based on the Gale-Shapley algorithm
    • FilterFeaturesTransform for selecting features from TSDataset while feature engineering
    • ResampleWithDistributionTransform helps to resample a feature according to the distribution of another feature
    • Spell checks in CI

    Changed:

    • Rename confidence interval to prediction interval, start working with quantiles instead of interval_width
    • Changed format of forecast and test dataframes in WandbLogger
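
As a rough illustration of how an interval width relates to a pair of quantiles, the sketch below converts one into the other. It assumes a symmetric central interval, which the release note does not state explicitly, and `interval_width_to_quantiles` is a hypothetical helper, not part of ETNA.

```python
from typing import Tuple


def interval_width_to_quantiles(interval_width: float) -> Tuple[float, float]:
    """Map a central interval width (e.g. 0.95) to its lower and upper quantiles."""
    if not 0.0 < interval_width < 1.0:
        raise ValueError("interval_width must be strictly between 0 and 1")
    # A central interval leaves equal probability mass in each tail.
    lower = (1.0 - interval_width) / 2.0
    upper = 1.0 - lower
    return lower, upper


lower_95, upper_95 = interval_width_to_quantiles(0.95)
lower_80, upper_80 = interval_width_to_quantiles(0.8)
```

Working with quantiles directly is more general than a single width: the requested quantiles do not have to be symmetric, and more than two can be requested at once.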
  • 1.3.2(Nov 18, 2021)

  • 1.3.1(Nov 12, 2021)

  • 1.3.0(Nov 12, 2021)

    We are happy to announce version 1.3.0 of the etna library!

    We focused on making etna even more user-friendly and added new features.

    We have added:

    • CLI for backtesting
    • MeanSegmentEncoderTransform
    • Several feature relevance algorithms
    • TreeFeatureSelectionTransform

    We have fixed:

    • Bugs in loggers when aggregate_metrics=True
    • Bug when TSDataset did not create future if exogenous data has empty future
    • Links in CLI documentation
  • 1.3.0-alpha.0(Oct 28, 2021)

    In progress...

    In this prerelease we are testing optional dependencies. Be careful!

    Docs available at https://unstable--etna-docs.netlify.app

  • 1.2.0(Oct 27, 2021)

    Boom! Huge update!

    Added

    • Even more documentation
    • Even more Jupyter Notebooks with examples
    • Pipeline class, helps unite models and transforms
    • Ensemble classes, helps unite models
    • AutoRegressivePipeline
    • Add confidence intervals to pipelines, models and transforms
    • Add new Transforms
    • Add clustering methods

    Changed

    • backtest moved to Pipeline class

    Fixed

    • pandas bugs
    • TSDataset.to_dataset bug

    More in our Changelog

  • 1.2.0-alpha.1(Oct 18, 2021)

  • 1.2.0-alpha.0(Oct 14, 2021)

    Added

    • BinsegTrendTransform, ChangePointsTrendTransform (#87)
    • Interactive plot for anomalies (#95)
    • Examples to TSDataset methods with doctest (#92)
    • WandbLogger (#71)
    • Pipeline (#78)
    • Sequence anomalies (#96), Histogram anomalies (#79)
    • 'is_weekend' feature in DateFlagsTransform (#101)
    • Documentation example for models and note about inplace nature of forecast (#112)
    • Property regressors to TSDataset (#82)
    • Clustering (#110)
    • Outliers notebook (#123)
    • Method inverse_transform in TimeSeriesImputerTransform (#135)
    • VotingEnsemble (#150)
    • Forecast command for cli (#133)
    • MyPy checks in CI/CD and lint commands (#39)
    • TrendTransform (#139)
    • Running notebooks in ci (#134)
    • Cluster plotter to EDA (#169)
    • Pipeline.backtest method (#161, #192)
    • STLTransform class (#158)
    • NN_examples notebook (#159)
    • Example for ProphetModel (#178)
    • Instruction notebook for custom model and transform creation (#180)
    • Add inverse_transform in *OutliersTransform (#160)
    • Examples for CatBoostModelMultiSegment and CatBoostModelPerSegment (#181)

    Changed

    • Delete offset from WindowStatisticsTransform (#111)
    • Add Pipeline example in Get started notebook (#115)
    • Internal implementation of BinsegTrendTransform (#141)
    • Colorbar scaling in Correlation heatmap plotter (#143)
    • Add Correlation heatmap in EDA notebook (#144)
    • Add __repr__ for Pipeline (#151)
    • Define random state for every test case (#155)
    • Add confidence intervals to Prophet (#153)
    • Add confidence intervals to SARIMA (#172)

    Fixed

    • Set default value of TSDataset.head method (#170)
    • Categorical and fillna issues with pandas >=1.2 (#190)
  • 1.1.3(Oct 8, 2021)

  • 1.1.2(Oct 8, 2021)

    Just some bug fixes:

    Changed

    • SklearnTransform out column names (#99)
    • Update EDA notebook (#96)
    • Add 'regressor_' prefix to output columns of LagTransform, DateFlagsTransform, SpecialDaysTransform, SegmentEncoderTransform

    Fixed

    • Add more obvious Exception Error for forecasting with unfitted model (#102)
    • Fix bug with hardcoded frequency in PytorchForecastingTransform (#107)
    • Bug with inverse_transform method of TimeSeriesImputerTransform (#148)
  • 1.1.2-alpha.0(Oct 7, 2021)

    In progress... Fixing bugs

    Changed

    • SklearnTransform out column names (#99)
    • Update EDA notebook (#96)
    • Add 'regressor_' prefix to output columns of LagTransform, DateFlagsTransform, SpecialDaysTransform, SegmentEncoderTransform

    Fixed

    • Add more obvious Exception Error for forecasting with unfitted model (#102)
    • Fix bug with hardcoded frequency in PytorchForecastingTransform (#107)
    • Bug with inverse_transform method of TimeSeriesImputerTransform (#148)
  • 1.1.0(Sep 22, 2021)

    In this release we focused on adding even more features to our library. Please meet new models and transforms:

    Added

    • MedianOutliersTransform, DensityOutliersTransform (#30)
    • Issues and Pull Request templates
    • TSDataset checks (#24, #20)
    • Pytorch-Forecasting models (#29)
    • SARIMAX model (#10)
    • Logging, including ConsoleLogger (#46)
    • Correlation heatmap plotter (#77)

    Changed

    • Backtest is fully parallel
    • New default hyperparameters for CatBoost
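
Backtest folds can be evaluated in parallel because each fold depends only on its own train and test ranges. The sketch below shows that idea with expanding-window folds and a thread pool from the standard library. It is not how ETNA implements backtesting: `backtest_folds`, `evaluate_fold` and the naive last-value forecast are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def backtest_folds(n_points: int, horizon: int, n_folds: int) -> List[Tuple[range, range]]:
    """Build expanding-window folds as (train indices, test indices) pairs."""
    if n_points < horizon * n_folds + 1:
        raise ValueError("not enough points for the requested folds")
    folds = []
    for fold in range(n_folds):
        # The last fold ends at the end of the series; earlier folds end earlier.
        test_end = n_points - horizon * (n_folds - 1 - fold)
        train_end = test_end - horizon
        folds.append((range(0, train_end), range(train_end, test_end)))
    return folds


def evaluate_fold(series: List[float], train_idx: range, test_idx: range) -> float:
    """Score a naive last-value forecast on one fold with mean absolute error."""
    last_train_value = series[train_idx[-1]]
    errors = [abs(series[i] - last_train_value) for i in test_idx]
    return sum(errors) / len(errors)


series = [float(i) for i in range(10)]
folds = backtest_folds(len(series), horizon=2, n_folds=3)

# Folds share no state, so they can be scored concurrently.
with ThreadPoolExecutor() as pool:
    maes = list(pool.map(lambda fold: evaluate_fold(series, *fold), folds))
```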

    Fixed

    • Documentation fixes (#55, #53, #52)
    • Solved warning in LogTransform and AddConstantTransform (#26)
    • Bug where regressors do not have enough history (#35)
    • make_future(1) and make_future(2) bug
    • Fix working with 'cap' and 'floor' features in Prophet model (#62)
    • Fix saving init params for SARIMAXModel (#81)
    • Imports of nn models, PytorchForecastingTransform and Transform (#80)
Owner
Tinkoff.AI
Tinkoff AI Center