A python library for easy manipulation and forecasting of time series.

Overview

Time Series Made Easy in Python

darts



darts is a Python library for easy manipulation and forecasting of time series. It contains a variety of models, from classics such as ARIMA to neural networks. The models can all be used in the same way, using fit() and predict() functions, similar to scikit-learn. The library also makes it easy to backtest models, and to combine the predictions of several models and external regressors. Darts supports both univariate and multivariate time series and models, and the neural networks can be trained on multiple time series.

Documentation

High Level Introductions

Install

We recommend first setting up a clean Python environment for your project, with at least Python 3.6, using your favorite tool (conda, venv, virtualenv with or without virtualenvwrapper).

Once your environment is set up, you can install darts using pip:

pip install 'u8darts[all]'

For more detailed install instructions you can refer to our installation guide at the end of this page.

Example Usage

Create a TimeSeries object from a Pandas DataFrame, and split it into train/validation series:

import pandas as pd
from darts import TimeSeries

df = pd.read_csv('AirPassengers.csv', delimiter=",")
series = TimeSeries.from_dataframe(df, 'Month', '#Passengers')
train, val = series.split_after(pd.Timestamp('19580101'))

The dataset used in this example can be downloaded from this link.

Fit an exponential smoothing model, and make a prediction over the validation series' duration:

from darts.models import ExponentialSmoothing

model = ExponentialSmoothing()
model.fit(train)
prediction = model.predict(len(val))

Plot:

import matplotlib.pyplot as plt

series.plot(label='actual')
prediction.plot(label='forecast', lw=2)
plt.legend()
plt.xlabel('Year')

[Figure: darts forecast example]

We invite you to go over the example and tutorial notebooks in the examples directory.

Features

Currently, the library contains the following features:

Forecasting Models:

  • Exponential smoothing,
  • ARIMA & auto-ARIMA,
  • Facebook Prophet,
  • Theta method,
  • FFT (Fast Fourier Transform),
  • Recurrent neural networks (vanilla RNNs, GRU, and LSTM variants),
  • Temporal convolutional network,
  • Transformer,
  • N-BEATS.

Data processing: Tools to easily apply (and revert) common transformations on time series data (scaling, Box-Cox, …).

Metrics: A variety of metrics for evaluating time series' goodness of fit; from R2-scores to Mean Absolute Scaled Error.
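To make the Mean Absolute Scaled Error concrete, here is a minimal standard-library sketch of its usual definition: the forecast's mean absolute error divided by the in-sample mean absolute error of a seasonal naive forecast. The function name `mase` and its parameters are illustrative assumptions, not the darts API.

```python
from statistics import fmean


def mase(actual, forecast, insample, m=1):
    """Forecast MAE scaled by the in-sample MAE of a seasonal naive forecast.

    `m` is the seasonal period (1 means "repeat the previous value").
    Illustrative helper only; this is not the darts implementation.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(insample) <= m:
        raise ValueError("insample must be longer than the seasonal period m")
    forecast_mae = fmean(abs(a - f) for a, f in zip(actual, forecast))
    naive_mae = fmean(
        abs(insample[i] - insample[i - m]) for i in range(m, len(insample))
    )
    if naive_mae == 0:
        raise ValueError("in-sample naive error is zero; MASE is undefined")
    return forecast_mae / naive_mae


# Toy data (made up for illustration).
train_values = [10.0, 12.0, 11.0, 13.0, 12.0]
val_values = [14.0, 13.0]
predicted = [13.0, 13.0]
score = mase(val_values, predicted, train_values)
```

A score below 1 means the forecast beats the naive baseline on average; here the forecast MAE is 0.5 and the naive in-sample MAE is 1.5.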

Backtesting: Utilities for simulating historical forecasts, using moving time windows.
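The idea of simulating historical forecasts can be sketched in plain Python as follows. This is a simplified expanding-window variant with a stand-in "repeat the last value" model; the names `expanding_window_backtest` and `naive_last_value` are hypothetical and do not correspond to darts functions.

```python
def expanding_window_backtest(values, start, horizon, forecast_fn):
    """For every cutoff t >= start, forecast from values[:t] only and
    compare with the value observed `horizon` steps after the cutoff."""
    results = []
    for t in range(start, len(values) - horizon + 1):
        history = values[:t]            # only data available at time t
        target_index = t + horizon - 1  # point being forecast
        prediction = forecast_fn(history, horizon)
        results.append((target_index, prediction, values[target_index]))
    return results


def naive_last_value(history, horizon):
    # Stand-in model (assumption): repeat the last observed value.
    return history[-1]


series = [3.0, 4.0, 6.0, 5.0, 7.0, 8.0]
backtest = expanding_window_backtest(
    series, start=3, horizon=1, forecast_fn=naive_last_value
)
mean_abs_error = sum(abs(p - a) for _, p, a in backtest) / len(backtest)
```

Each tuple holds the index being forecast, the prediction, and the actual value, so any error metric can be computed afterwards.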

Regressive Models: Possibility to predict a time series from several other time series (e.g., external regressors), using arbitrary regressive models.

Multivariate Support: Tools to create, manipulate and forecast multivariate time series.

Contribute

The development is ongoing, and there are many new features that we want to add. We welcome pull requests and issues on GitHub.

Before working on a contribution (a new feature or a fix), make sure you can't find anything related in the issues. If there is no ongoing effort on what you plan to do, then we recommend the following steps:

  1. Create an issue, describe how you would attempt to solve it, and if possible wait for a discussion.
  2. Fork the repository.
  3. Clone the forked repository locally.
  4. Create a clean python env and install requirements with pip: pip install -r requirements/dev-all.txt
  5. Create a new branch:
    • Branch off from the develop branch.
    • Prefix the branch with the type of update you are making:
      • feature/
      • fix/
      • refactor/
    • Work on your update
  6. Check that your code passes all the tests and design new unit tests if needed: ./gradlew unitTest_all.
  7. Verify your tests coverage by running ./gradlew coverageTest
    • Additionally you can generate an xml report and use VSCode Coverage gutter to identify untested lines with ./coverage.sh xml
  8. If your contribution introduces a significant change, add it to CHANGELOG.md under the "Unreleased" section.
  9. Create a pull request from your new branch to the develop branch.

Contact Us

If what you want to tell us is not suitable for a GitHub issue, feel free to send us an email at [email protected] for darts-related matters or [email protected] for any other inquiries.

Installation Guide

Preconditions

Some of the models depend on fbprophet and torch, which have non-Python dependencies. A Conda environment is thus recommended because it will handle all of those in one go.

The following steps assume running inside a conda environment. If that's not possible, first follow the official instructions to install fbprophet and torch, then skip to the Install darts section.

To create a conda environment for Python 3.7 (after installing conda):

conda create --name <env-name> python=3.7

Don't forget to activate your virtual environment:

conda activate <env-name>

macOS

conda install -c conda-forge -c pytorch pip fbprophet pytorch

Linux and Windows

conda install -c conda-forge -c pytorch pip fbprophet pytorch cpuonly

Install darts

Install Darts with all available models: pip install 'u8darts[all]'.

As some models have relatively heavy (or non-Python) dependencies, we also provide the following alternate lighter install options:

  • Install core only (without neural networks, Prophet or AutoARIMA): pip install u8darts
  • Install core + neural networks (PyTorch): pip install 'u8darts[torch]'
  • Install core + Facebook Prophet: pip install 'u8darts[fbprophet]'
  • Install core + AutoARIMA: pip install 'u8darts[pmdarima]'

Running the examples only, without installing:

If the conda setup is causing too many problems, we also provide a Docker image with everything set up for you and ready-to-use python notebooks with demo examples. To run the example notebooks without installing our libraries natively on your machine, you can use our Docker image:

./gradlew docker && ./gradlew dockerRun

Then copy and paste the URL provided by the docker container into your browser to access Jupyter notebook.

For this setup to work, you need to have a Docker service installed. You can get it from the Docker website.

Tests

The Gradle setup works best when used in a Python environment, but the only requirement is to have pip installed for Python 3+.

To run all tests at once, run:

./gradlew test_all

Alternatively, you can run:

./gradlew unitTest_all # to run only unittests
./gradlew coverageTest # to run coverage
./gradlew lint         # to run linter

To run the tests for specific flavours of the library, replace _all with _core, _fbprophet, _pmdarima or _torch.

Documentation

To build the documentation locally, run:

./gradlew buildDocs

After that, the docs will be available in the ./docs/build/html directory. You can open ./docs/build/html/index.html using your favourite browser.

Comments
  • [BUG]: 'numpy.ndarray' object has no attribute 'get_color' while plotting NaiveSeasonal() forecast

    Describe the bug

    While plotting a naive forecast, an AttributeError occurred: 'numpy.ndarray' object has no attribute 'get_color'

    To Reproduce

    I have a subset 'train' of length 6188, on which I trained the darts NaiveSeasonal model.

    naive_model = models.NaiveSeasonal(K=1)
    naive_model.fit(train)
    naive_forecast = naive_model.predict(1)
    naive_forecast.plot(label='Forecast')


    AttributeError                            Traceback (most recent call last)
    c:\Users\Itcomplex\Documents\scripts\Predictive Analytics\Work\crypto_value_prediction_hourly.ipynb Cell 25' in <cell line: 4>()
          2 naive_model.fit(train)
          3 naive_forecast = naive_model.predict(1)
    ----> 4 naive_forecast.plot(label='Forecast')

    File c:\Users\Itcomplex\anaconda3\envs\env\lib\site-packages\darts\timeseries.py:2450, in TimeSeries.plot(self, new_plot, central_quantile, low_quantile, high_quantile, *args, **kwargs)
       2447 kwargs["label"] = label_to_use
       2449 p = central_series.plot(*args, **kwargs)
    -> 2450 color_used = p[0].get_color()
       2451 kwargs["alpha"] = alpha if alpha is not None else alpha_confidence_intvls
       2453 # Optionally show confidence intervals

    AttributeError: 'numpy.ndarray' object has no attribute 'get_color'

    Expected behavior

    The predicted time series should have been plotted.

    bug 
    opened by iamMaverick 16
  • [BUG] Training never starts on TFT

    Describe the bug

    I use a dataset composed of 20 features and a single target. All of the features are future covariates. I use the target's past as well as the features' history as past covariates. To the covariates, I add datetime attributes for year, month, day of week, hour, and holidays. The dataset has several years of hourly data; however, I tried cutting down the samples to check whether it made a difference. I am successfully using the same dataset on other models (not from Darts) and getting good results.

    To Reproduce

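    # Imports needed by the snippet (not part of the original report).
    # file_path and country are left undefined in the report and must be set by the reader.
    import pandas as pd
    import torch
    from darts import TimeSeries
    from darts.dataprocessing.transformers import Scaler
    from darts.models import TFTModel
    from darts.utils.likelihood_models import QuantileRegression
    from darts.utils.timeseries_generation import datetime_attribute_timeseries

    train_ratio = 0.90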
    look_back = 192
    horizon = 192
    n_outputs = 1
    
    df = pd.read_csv(file_path, index_col = 0)
    
    training_cutoff = pd.Timestamp(df['Time'].iloc[round(len(df)*train_ratio)])
    series = TimeSeries.from_dataframe(df, 'Time', value_cols = df.columns[1:])
    train, val = series.split_after(training_cutoff)
    
    scaler = Scaler()
    train_transformed = scaler.fit_transform(train)
    val_transformed = scaler.transform(val)
    series_transformed = scaler.transform(series)
    
    trgt_scaler = Scaler()
    trgt_transformed = trgt_scaler.fit_transform(series['target'])
    
    covariates = datetime_attribute_timeseries(series, attribute='year', one_hot=False)
    covariates = covariates.stack(datetime_attribute_timeseries(series, attribute='month', one_hot=False))
    covariates = covariates.stack(datetime_attribute_timeseries(series, attribute='day_of_week', one_hot=False))
    covariates = covariates.stack(datetime_attribute_timeseries(series, attribute='hour', one_hot=False))
    covariates = covariates.add_holidays(country)
    f_covariates = covariates.stack(TimeSeries.from_times_and_values(times=series.time_index, 
                                                                   values=df.iloc[:, 1+n_outputs:].to_numpy(), 
                                                                   columns=series.columns[n_outputs:]))
    p_covariates = covariates.stack(TimeSeries.from_times_and_values(times=series.time_index, 
                                                                   values=df.iloc[:, 1:].to_numpy(), 
                                                                   columns=series.columns))
    
    scaler_f_covs = Scaler()
    f_cov_train, f_cov_val = f_covariates.split_after(training_cutoff)
    scaler_f_covs.fit(f_cov_train)
    f_covariates_transformed = scaler_f_covs.transform(f_covariates)
    
    scaler_p_covs = Scaler()
    p_cov_train, p_cov_val = p_covariates.split_after(training_cutoff)
    scaler_p_covs.fit(p_cov_train)
    p_covariates_transformed = scaler_p_covs.transform(p_covariates)
    
    quantiles = [
         0.1, 0.25, 0.5, 0.75, 0.9
    ]
    model = TFTModel(input_chunk_length=look_back,
                        output_chunk_length=horizon,
                        hidden_size=32,
                        lstm_layers=1,
                        full_attention = True,
                        dropout = 0.1,
                        num_attention_heads=4,
                        batch_size=32,
                        n_epochs=250,
                        add_relative_index=False,
                        add_encoders=None,
                        #likelihood=None,
                        #loss_fn=MSELoss(),
                        likelihood=QuantileRegression(quantiles=quantiles),  # QuantileRegression is set per default
                        force_reset=True,
                        pl_trainer_kwargs = {"accelerator": "gpu", "gpus": [0], 
                                             "enable_progress_bar" : True, "enable_model_summary" : True},
                        optimizer_cls = torch.optim.SGD,
                        optimizer_kwargs = {'lr':0.01})
    
    model.fit(train_transformed['target'],
             future_covariates=f_covariates_transformed,
             past_covariates=p_covariates_transformed)
    
    
    

    Expected behavior

    Training starts but then gets stuck: it never completes a single epoch.

    System:

    • Python version: [ 3.9]
    • darts version [ 0.17.0]
    bug triage 
    opened by strakehyr 16
  • Windows installation

    Hello, when installing the darts library using conda as per the instructions with Python 3.7, I get the following error:

    
    PS C:\Users\XXXX> conda install -c conda-forge u8darts-all
    Collecting package metadata (current_repodata.json): done
    Solving environment: failed with initial frozen solve. Retrying with flexible solve.
    Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
    Collecting package metadata (repodata.json): done
    Solving environment: failed with initial frozen solve. Retrying with flexible solve.
    Solving environment: -
    Found conflicts! Looking for incompatible packages.
    This can take several minutes.  Press CTRL-C to abort.
    failed
    
    UnsatisfiableError: The following specifications were found to be incompatible with each other:
    
    Output in format: Requested package -> Available versions
    

    Do you have any ideas why this is happening?

    I can install prophet on its own but not the darts library.

    Thanks,

    Adam

    opened by adschem 16
  • TypeError: <class 'numpy.typing._dtype_like._SupportsDType'> is not a generic class]

    bug 
    opened by Mosw5871 15
  • Feat/format isort, linting, and pyupgrade

    Fixes #739

    Summary

    Extending the pre-commit linting.

    • isort
    • end-of-file-fixer
    • trailing-whitespace
    • check-added-large-files
    • requirements-txt-fixer
    • mixed-line-ending
    • pyupgrade

    Each addition has its own commit, to make it easy to see what changed.

    TODO:

    • [x] Add .git-blame-ignore-revs file for commit #765
    opened by gdevos010 15
  • Natural gaps in timeseries observations

    I have hourly energy observations taken during business days only. So 120 hrs per week, not 168. With Sat and Sun missing always and holidays as well. Seasonality is daily, weekly, yearly.

    I was trying to follow the samples and use TimeSeries.from_dataframe with default settings. I got a lot of NaNs inserted into the DatetimeIndex, which matches the behaviour of pandas DataFrame.asfreq('H'). So with the train/test split train, val = series.split_before(pd.Timestamp('20200101')) I receive:

    len(data[:'20200101']), len(train), len(data['20200101':]), len(val)
    (11856, 17328, 5904, 9480)
    
    data['20200101':]['load']
    dt_iso
    2020-01-09 00:00:00     801.0410
    2020-01-09 01:00:00     790.4990
    2020-01-09 02:00:00     770.1160
    2020-01-09 03:00:00     770.8910
    2020-01-09 04:00:00     774.4680
                             ...    
    2021-01-29 19:00:00   1,026.1950
    2021-01-29 20:00:00   1,007.2650
    2021-01-29 21:00:00     990.8280
    2021-01-29 22:00:00     953.9190
    2021-01-29 23:00:00     904.5980
    Name: load, Length: 5904, dtype: float64
    
    val['load']
                              load
    date                          
    2020-01-01 00:00:00        nan
    2020-01-01 01:00:00        nan
    2020-01-01 02:00:00        nan
    2020-01-01 03:00:00        nan
    2020-01-01 04:00:00        nan
    ...                        ...
    2021-01-29 19:00:00 1,026.1950
    2021-01-29 20:00:00 1,007.2650
    2021-01-29 21:00:00   990.8280
    2021-01-29 22:00:00   953.9190
    2021-01-29 23:00:00   904.5980
    
    [9480 rows x 1 columns]
    Freq: H
    

    So one can see that data has exploded with NaNs.
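The effect described above can be reproduced without darts or pandas. The sketch below uses made-up timestamps (two weeks of hourly observations on business days only) and shows how forcing a strict hourly grid between the first and last observation introduces slots with no data, which is where the NaNs come from.

```python
from datetime import datetime, timedelta

ONE_HOUR = timedelta(hours=1)

# Hypothetical data: hourly timestamps, business days only, two weeks.
first_monday = datetime(2021, 1, 4)
observed = [
    first_monday + h * ONE_HOUR
    for h in range(14 * 24)
    if (first_monday + h * ONE_HOUR).weekday() < 5  # Mon=0 ... Fri=4
]

# A strict hourly frequency covers every hour from first to last observation.
span_hours = int((observed[-1] - observed[0]) / ONE_HOUR) + 1
full_grid = [observed[0] + h * ONE_HOUR for h in range(span_hours)]

# Slots on the grid without an observation would be filled with NaN.
observed_set = set(observed)
missing = [ts for ts in full_grid if ts not in observed_set]

print(len(observed), len(full_grid), len(missing))
```

With these inputs there are 240 observations on a 288-slot grid, so the single weekend inside the range contributes 48 empty slots.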

    Naturally, darts.utils.statistics.plot_acf() and darts.utils.statistics.check_seasonality() do not work with NaNs. plot_acf() gives me a straight line at zero, where there should be AR lags up to 192. check_seasonality() reports [2021-03-05 17:39:16,120] INFO | darts.utils.statistics | The ACF has no local maximum for m < max_lag = 24.

    If I supply fill_missing_dates=False to TimeSeries.from_dataframe(): series = TimeSeries.from_dataframe(data, time_col='date', value_cols=['load'], fill_missing_dates=False) I get: Could not infer frequency. Are some dates missing? Try specifying 'fill_missing_dates=True'

    Passing the freq='H' parameter to the function above does not help either. It is the same limitation as with a pandas DatetimeIndex with freq='H'.

    In statsmodels SARIMAX models, I was able to overcome the datetime frequency warning by converting the index to a PeriodIndex, which supports gaps: data.index = pd.DatetimeIndex(data.index).to_period('H')

    What are my options with Darts for working with time series data that has natural gaps?

    Thanks a lot in advance for the attention.

    P.S. Some pictures to illustrate the time series and gaps: [Figures: Capture-1, Capture]

    feature request triage 
    opened by rmk17 15
  • Proposal for preprocessing pipeline

    Follow up on our discussion on preprocessing.

    BaseTransformer stays the same as was shown, but of course please raise any concerns or improvements again. The same goes for transform in TimeSeries.

    A new class, Pipeline, manages transforms and validations. A Pipeline is intended for only one TimeSeries object; this way, when we call inverse, we are sure that the state of the transformer belongs to the correct series. A Pipeline can be traversed forward and maybe backward (a transform may sometimes have no inverse, so we may have to be extra careful about it).

    Interface of pipeline:

    add_transformer - function that adds a transformer to the pipeline; the order of the transformers is the order in which they were added,
    add_validator - function that adds a validator for each step,
    resolve - function that resolves the pipeline (traverses the path of transformations to the end, either forward or backward),
    resolve_n_steps - same as resolve, but traverses n steps,
    set_data/get_data - setter/getter for the data; the getter returns the data as it is at the current step of the transformations.
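One possible reading of the interface above, as a rough sketch only. It operates on plain lists of floats rather than TimeSeries, represents each transformer as a (forward, inverse) pair of callables, and runs every validator after each step; all of these are assumptions made for illustration, since the proposal leaves implementation details open.

```python
class Pipeline:
    """Toy version of the proposed interface (not a darts class)."""

    def __init__(self):
        self._steps = []       # (forward, inverse or None), in insertion order
        self._validators = []  # callables run on the data after each step
        self._data = None
        self._position = 0     # number of transforms currently applied

    def add_transformer(self, forward, inverse=None):
        self._steps.append((forward, inverse))

    def add_validator(self, validator):
        self._validators.append(validator)

    def set_data(self, data):
        self._data = list(data)
        self._position = 0

    def get_data(self):
        return self._data

    def resolve_n_steps(self, n, backward=False):
        for _ in range(n):
            if backward:
                if self._position == 0:
                    break  # already at the original data
                _, inverse = self._steps[self._position - 1]
                if inverse is None:
                    raise ValueError("this transform has no inverse")
                self._data = inverse(self._data)
                self._position -= 1
            else:
                if self._position == len(self._steps):
                    break  # already fully transformed
                forward, _ = self._steps[self._position]
                self._data = forward(self._data)
                self._position += 1
            for validate in self._validators:
                validate(self._data)

    def resolve(self, backward=False):
        self.resolve_n_steps(len(self._steps), backward=backward)


def reject_nan(values):
    # NaN is the only float value that is not equal to itself.
    if any(v != v for v in values):
        raise ValueError("NaN found after a pipeline step")


pipeline = Pipeline()
pipeline.add_transformer(lambda xs: [x * 2 for x in xs], lambda xs: [x / 2 for x in xs])
pipeline.add_transformer(lambda xs: [x + 1 for x in xs], lambda xs: [x - 1 for x in xs])
pipeline.add_validator(reject_nan)

pipeline.set_data([1.0, 2.0])
pipeline.resolve()
forward_result = pipeline.get_data()
pipeline.resolve_n_steps(1, backward=True)
one_step_back = pipeline.get_data()
pipeline.resolve(backward=True)
restored = pipeline.get_data()
```

Keeping a position counter inside the pipeline is what ties the transformer state to a single series, as the proposal intends.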
    

    I left out implementation details, as I feel they are better worked out once the proposal has been accepted in some form.

    Ping @TheMP @hrzn @endrjuskr @pennfranc @guillaumeraille @radujica @Droxef. I hope I included all interested.

    opened by Kostiiii 15
  • [BUG] Timeseries.from_dataframe() does not handle localized datetime64 index correctly anymore

    Thank you for keeping up the great work on darts!

    Describe the bug In 0.9, with the xarray backend of TimeSeries, a localized DatetimeIndex is not handled correctly (anymore). Using the toy example from another issue, you can see that a plain DatetimeIndex is converted to datetime64 in TimeSeries. Once localized to UTC, the index is converted to object in TimeSeries, leading to downstream problems in plot(), etc.

    To reproduce

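    # Imports needed by the snippet (not part of the original report).
    import pandas as pd
    from darts import TimeSeries

    df = pd.DataFrame(data={'First': [0,1,2,3,4,4.5,4,3,2,1]})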
    inds = [f'2021-01-{day}' for day in range(1, 11)]
    df.index = pd.to_datetime(inds)
    
    df_localized = df.copy()
    df_localized.index = df.index.tz_localize('UTC')
    
    ts = TimeSeries.from_dataframe(df)
    ts_localized = TimeSeries.from_dataframe(df_localized)
    

    Expected behaviour Localized datetimeindex is converted to datetime64 in Timeseries.

    System Python version: Python 3.9.5 darts install: pip install u8darts

    opened by schweima 14
  • window transformer

    In addition to "lag features", which we already have, it'd be nice to add "window features": specifying window characteristics and corresponding function(s) to apply, in order to create features dynamically in regression models. For instance, it is often helpful to use the trailing mean and variance of the last N points as features. We could also imagine a way to have fairly generic windows (e.g., "last month", "last week", "the N points starting N-k time steps ago", etc.).
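As an illustration of the trailing mean and variance example, here is a minimal standard-library sketch. The function name `trailing_window_features` is hypothetical, and the choice of population variance is an assumption; a sample variance would work just as well.

```python
from statistics import fmean, pvariance


def trailing_window_features(values, window):
    """For each target index t >= window, compute (mean, variance) over the
    `window` points strictly before t, so no future value leaks in."""
    if window < 1:
        raise ValueError("window must be at least 1")
    features = []
    for t in range(window, len(values)):
        past = values[t - window:t]
        features.append((fmean(past), pvariance(past)))
    return features


values = [1.0, 2.0, 3.0, 4.0, 5.0]
rows = trailing_window_features(values, window=3)
# rows[0] describes values[0:3] and would be used to predict values[3].
```

Each row lines up with one target value, so the rows can be stacked next to lag features when building the regression matrix.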

    improvement 
    opened by hrzn 12
  • [Feature request] More transformer-based model (i.e. Informer)

    I noticed that a TransformerModel is now in darts.

    I am wondering if I can help add more transformer-based models (e.g., Informer) to darts. The new transformer-based time series forecasting models often perform better than traditional models, and I believe adding Informer to darts would help.

    BTW, I am also wondering whether TransformerModel should be given an alternative name (for example, NativeTransformerModel) to reduce misunderstanding, because nowadays the name transformer more often refers to a structure or component of a neural network than to a model.

    triage 
    opened by IncubatorShokuhou 12
  • Training with common covariate for multiple timeseries

    Hello,

    I want to train a global forecasting model for energy production from several generators. The generators are all in roughly the same geographical area (loc1), so I will assume that they share a climate feature (common_climate_covariate_loc1). So I want to fit 25 generator TimeSeries to the model, and include a common covariate TimeSeries.

    Will this be the right way to approach this problem?

    model_cov.fit(series=[gen1_loc1, gen2_loc1, ... , genX_loc1], past_covariates=[common_climate_covariate_loc1], verbose=True)

    where: gen1_loc1 is generator1 from location 1 and so on...

    and later on, if I want to train the same model with 25 other generators that are from a different area (loc2) and have different climate data, do I just repeat the process and fit the new data to the same model?

    model_cov.fit(series=[gen1_loc2, gen2_loc2, ... , genX_loc2], past_covariates=[common_climate_covariate_loc2], verbose=True)

    where: gen1_loc2 is generator1 from location 2 and so on...

    Will this be the correct procedure? I hope I made the problem clear, thanks in advance for response.

    opened by Sigvesor 12
  • Loess Filter that is Vectorised Over `samples` and `components` Axes

    Is your feature request related to a current problem? Please describe.

    From what I understand, seasonal decomposition is currently achieved in darts by calling extract_trend_and_seasonality. The limitation of this, however, is that extract_trend_and_seasonality only works for univariate and single sample series, since the statsmodels function(s) called by extract_trend_and_seasonality only work for one-dimensional arrays; more specifically, an error is thrown by darts when series.pd_series() is called inside extract_trend_and_seasonality if series has multiple samples and/or components.

    In order to seasonally decompose multiple samples/components of a series, therefore, one must manually loop over each component/sample and pass it to extract_trend_and_seasonality; this potentially adds quite a bit of overhead if the series contains lots of components and/or samples.

    It would be cool if there was an STL decomposer in darts that's vectorised over the samples and components axes of the series input. A 'first step' towards this goal would be to implement a vectorized Loess filter, since this forms the 'backbone' of the STL algorithm.

    Describe proposed solution

    I'd suggest adding a loess_filter function somewhere inside of darts; it should probably have a similar call signature to the statsmodels implementation, so something like:

    loess_filter(series, frac, it, delta)
    

    although I'd probably err on the side of changing the names of frac, it, and delta, since these aren't very descriptive. The is_sorted and return_sorted arguments of the statsmodels function can obviously be dropped for the darts implementation, as well as the missing argument (since TimeSeries are assumed to 'contain numeric types only').

    Some potential locations for this function could be darts.utils.statistics, darts.models.filtering, or darts.dataprocessing.transformers. Personally, I'm pretty indifferent as to where the function is placed, so any suggestions around this would be welcome.

    The critical point to note about this loess_filter function, however, is that it should be vectorized over the samples and components axes of series.

    Describe potential alternatives

    Simply looping over each sample and component in a series, passing each to extract_trend_and_seasonality. For series with many samples and components, I imagine this may be somewhat prohibitive.

    Additional context

    I've been playing around with implementing a vectorised loess filter in my spare time, so I'll post a PR with what I've done thus far soon, but any comments and/or suggestions in the meantime would be appreciated.

    Cheers, Matt.

    triage 
    opened by mabilton 0
  • [Question] Can we train and forecast given incomplete `input_chunk_length` history ?

    Hi guys! First of all, thank you for your fantastic work on unifying time series forecasting!

    I got stuck: after browsing the examples and documentation, I wonder whether there exists an explicit (or implicit) way to forecast output_chunk_length given incomplete historical data, to deal with cold-start problems. It might be beneficial for both training and inference in many industries to have such functionality covered.

    Consider the following example, let us have:

    • input_chunk_length = 20 (maximum historical window to train on)
    • output_chunk_length = 10 (desired forecast horizon)
    • we have static_covariates, past_covariates (of maximum length 10), and future covariates (past and future)
    • 2 time series of length 15 (so shorter than input_chunk_length + output_chunk_length = 30, but strictly longer than output_chunk_length)

    1. How can we train the model by sampling chunks of [19 masked rows, 1st historical row] to predict timestamps [2, 3, ..., 11] of this time series, ..., up to [15 masked rows, 1st historical row, ..., 5th historical row] to predict rows [6, 7, ..., 15] of this time series?

    2. After the trained model has learned how to predict each time step of the time series from the 2nd onward (given the 1st and some masked rows as input, so we can deal with cold start), how can we predict from such incomplete historical data?

    Right now, I can't imagine any way to do it with the darts framework other than adding a fake start to the time series, consisting of (input_chunk_length - 1) fake target rows, with the same fake start added to the past_covariates data. The bad thing is that I can't explicitly use NaNs here for neural-network-based models, so that's not masking; it will affect the loss and, as a result, bias the model.
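The "fake start" workaround can be sketched in plain Python as left-padding a short history up to input_chunk_length while recording which positions are real. The helper `left_pad_with_mask` and the idea of carrying a 0/1 mask alongside the values are illustrative assumptions; nothing here implies that a darts model would consume such a mask or exclude the padded rows from the loss.

```python
def left_pad_with_mask(history, input_chunk_length, pad_value=0.0):
    """Return (padded, mask): `padded` has exactly input_chunk_length values,
    `mask` is 1 for real observations and 0 for padding."""
    if input_chunk_length < 1:
        raise ValueError("input_chunk_length must be at least 1")
    recent = list(history)[-input_chunk_length:]  # keep the newest points
    n_pad = input_chunk_length - len(recent)
    padded = [pad_value] * n_pad + recent
    mask = [0] * n_pad + [1] * len(recent)
    return padded, mask


# Cold start: only 3 observations for a model expecting 5 input steps.
padded, mask = left_pad_with_mask([5.0, 6.0, 7.0], input_chunk_length=5)
```

The mask is what distinguishes this from plain padding: a model or loss that is aware of it could ignore the fake rows instead of learning from them.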

    Any suggestions / future development plans that include such workaround will be greatly appreciated!

    opened by fred-navruzov 0
  • Timeseries classification

    Is your feature request related to a current problem? Please describe. Time series classification is not feasible in Darts. IoT has excellent data quality and interesting business cases. We've used Darts many times for regression, achieving great results in a short time. Classification should be a feature on the roadmap, since it's becoming more important each day.

    Describe proposed solution Being the head of the neural network, I propose creating a subset of models that can tackle classification. Random forests and regressions with scikit-learn could offer a first opportunity to have time series classification.

    Describe potential alternatives As a first approach, Darts could treat this as a regression problem with a probability distribution between 0 and 1, using a function like softmax to map it to a class.
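The alternative described above, turning per-class scores into probabilities with softmax and picking the most likely class, might look like the following sketch. The scores and the class labels are made up for illustration, and `predict_class` is a hypothetical helper rather than anything offered by Darts.

```python
import math


def softmax(scores):
    """Map arbitrary real scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(scores, labels):
    """Return the most probable label and its probability."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best], probabilities[best]


# Hypothetical regression outputs, one score per class.
class_scores = [0.2, 1.5, -0.3]
class_labels = ["idle", "running", "fault"]
label, probability = predict_class(class_scores, class_labels)
```

In this framing the forecasting model only has to produce one score per class; the mapping to a label stays outside the model.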

    Additional context

    triage 
    opened by bgonzalezfractal 0
  • Trainer / optimizer / logger reinit feature for Torch models

    Is your feature request related to a current problem? Please describe. I am trying to load a previously trained NHiTS model and "finetune" it on a new dataset. This entails changing the TensorBoard logger, resetting the optimizer and changing its LR, removing the LR scheduler, but KEEPING THE WEIGHTS, and continuing for some more epochs on new data with a small learning rate.

    It was a considerable struggle to get it to work, and I ended up with super hacky and ugly code.

    https://gist.github.com/solalatus/e6329879c5c9479ce60af9f6b0e22bb4

    Describe proposed solution Provide a reinit_training() function, or something similar, that allows modification of the trainer / hyperparameters of Torch-based models while keeping the weights intact, allowing new training runs, e.g. for finetuning on new data with different optimizers and learning rates.

    Additional context The general context is transfer learning WITH FINETUNING, so not the widely documented zero-shot scenario: I have a big model trained on a large dataset, and I want to finetune it on my small new dataset with a very small learning rate so as not to suffer catastrophic forgetting.

    triage 
    opened by solalatus 0
  • Bug fix - correct for multiple lengths of series in input sequence when adding static covariates to regression models

    Fixes #1460

    Summary

    In the original version of the code adding static covariates to regression models, it was assumed that the series are of the same length. This PR corrects this assumption and supports cases where the series in the input sequence have different lengths.

    Other Information

    None

    opened by eliane-maalouf 3
Releases(0.23.0)
Owner
Unit8
Solving your most impactful problems via Big Data & AI
Unit8
Implementation of different ML Algorithms from scratch, written in Python 3.x

Implementation of different ML Algorithms from scratch, written in Python 3.x

Gautam J 393 Nov 29, 2022
A collection of video resources for machine learning

Machine Learning Videos This is a collection of recorded talks at machine learning conferences, workshops, seminars, summer schools, and miscellaneous

Dustin Tran 1.5k Dec 29, 2022
An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models

Seldon Core: Blazing Fast, Industry-Ready ML An open source platform to deploy your machine learning models on Kubernetes at massive scale. Overview S

Seldon 3.5k Jan 01, 2023
YouTube Spam Detection with python

YouTube Spam Detection This code deletes spam comment on youtube videos based on two characteristics (currently) If the author of the comment has a se

MohamadReza Taalebi 5 Sep 27, 2022
A Python toolkit for rule-based/unsupervised anomaly detection in time series

Anomaly Detection Toolkit (ADTK) Anomaly Detection Toolkit (ADTK) is a Python package for unsupervised / rule-based time series anomaly detection. As

Arundo Analytics 888 Dec 30, 2022
Machine Learning approach for quantifying detector distortion fields

DistortionML Machine Learning approach for quantifying detector distortion fields. This project is a feasibility study for training a surrogate model

Joel Bernier 1 Nov 05, 2021
pure-predict: Machine learning prediction in pure Python

pure-predict speeds up and slims down machine learning prediction applications. It is a foundational tool for serverless inference or small batch prediction with popular machine learning frameworks l

Ibotta 84 Dec 29, 2022
The project's goal is to show a real world application of image segmentation using k means algorithm

2 Jan 22, 2022
A Python library for choreographing your machine learning research.

AI2 270 Jan 06, 2023
Provide an input CSV and a target field to predict, generate a model + code to run it.

automl-gs Give an input CSV file and a target field you want to predict to automl-gs, and get a trained high-performing machine learning or deep learn

Max Woolf 1.8k Jan 04, 2023
A simple example of ML classification, cross validation, and visualization of feature importances

Simple-Classifier This is a basic example of how to use several different libraries for classification and ensembling, mostly with sklearn. Example as

Rob 2 Aug 25, 2022
A Python package to preprocess time series

Disclaimer: This package is WIP. Do not take any APIs for granted. tspreprocess Time series can contain noise, may be sampled under a non fitting rate

Maximilian Christ 57 Dec 17, 2022
About Solve CTF offline disconnection problem - based on python3's small crawler

About Solve CTF offline disconnection problem - based on python3's small crawler, support keyword search and local map bed establishment, currently support Jianshu, xianzhi,anquanke,freebuf,seebug

天河 32 Oct 25, 2022
Breast-Cancer-Classification - Using SKLearn breast cancer dataset which contains 569 examples and 32 features classifying has been made with 6 different algorithms

Mert Sezer Ardal 1 Jan 31, 2022
The code from the Machine Learning Bookcamp book and a free course based on the book

Alexey Grigorev 5.5k Jan 09, 2023
Adversarial Framework for (non-) Parametric Image Stylisation Mosaics

Fully Adversarial Mosaics (FAMOS) Pytorch implementation of the paper "Copy the Old or Paint Anew? An Adversarial Framework for (non-) Parametric Imag

Zalando Research 120 Dec 24, 2022
Implemented four supervised learning Machine Learning algorithms

Implemented four supervised learning Machine Learning algorithms from an algorithmic family called Classification and Regression Trees (CARTs), details see README_Report.

Teng (Elijah) Xue 0 Jan 31, 2022
We have a dataset of user performances. The project is to develop a machine learning model that will predict the salaries of baseball players.

Salary-Prediction-with-Machine-Learning 1. Business Problem Can a machine learning project be implemented to estimate the salaries of baseball players

Ayşe Nur Türkaslan 9 Oct 14, 2022
Formulae is a Python library that implements Wilkinson's formulas for mixed-effects models.

formulae formulae is a Python library that implements Wilkinson's formulas for mixed-effects models. The main difference with other implementations li

34 Dec 21, 2022
#30DaysOfStreamlit is a 30-day social challenge for you to build and deploy Streamlit apps.

30 Days Of Streamlit 🎈 This is the official repo of #30DaysOfStreamlit — a 30-day social challenge for you to learn, build and deploy Streamlit apps.

Streamlit 53 Jan 02, 2023