Lightning ⚡️ fast forecasting with statistical and econometric models.

Overview



Statistical ⚡️ Forecast


License: MIT

StatsForecast offers a collection of widely used univariate time series forecasting models, including exponential smoothing and automatic ARIMA modeling, optimized for high performance using numba.

🔥 Highlights

  • Fastest and most accurate auto_arima in Python and R.
  • New!: Replace FB-Prophet in two lines of code and gain speed and accuracy. Check the experiments here.
  • New!: Distributed computation in clusters with ray. (Forecast 1M series in 30min)
  • New!: Good Ol' sklearn syntax with AutoARIMA().fit(y).predict(h=7).

🎊 Features

  • Inclusion of exogenous variables and prediction intervals.
  • Out-of-the-box implementations of exponential smoothing, croston, seasonal naive, random walk with drift and tsb.
  • 20x faster than pmdarima.
  • 1.5x faster than R.
  • 500x faster than Prophet.
  • Compiled to high performance machine code through numba.
  • 1,000,000 series in 30 min with ray.
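
As a rough illustration of what two of the simpler baselines listed above compute, here is a minimal pure-Python sketch. The function names `seasonal_naive_forecast` and `rw_drift_forecast` are hypothetical and are not the library's API; the library's own implementations are compiled with numba and may differ in details.

```python
def seasonal_naive_forecast(y, h, season_length):
    """Repeat the last observed season to produce h forecasts."""
    if len(y) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = y[-season_length:]
    return [last_season[i % season_length] for i in range(h)]


def rw_drift_forecast(y, h):
    """Random walk with drift: extend the line through the first and last points."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    drift = (y[-1] - y[0]) / (len(y) - 1)  # average change per step
    return [y[-1] + drift * step for step in range(1, h + 1)]


history = [10, 20, 30, 40, 12, 22, 32, 42]
print(seasonal_naive_forecast(history, h=6, season_length=4))
print(rw_drift_forecast([1.0, 2.0, 4.0, 7.0], h=3))
```

Baselines like these are cheap to compute, which is why they are commonly used as benchmarks for more elaborate models.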

Missing something? Please open an issue or write to us on Slack.

📖 Why?

Current Python alternatives for statistical models are slow and inaccurate. So we created a library that can be used to forecast in production environments or as benchmarks. StatsForecast includes an extensive battery of models that can efficiently fit thousands of time series.

🔬 Accuracy

We compared accuracy and speed against: pmdarima, Rob Hyndman's forecast package and Facebook's Prophet. We used the Daily, Hourly and Weekly data from the M4 competition.

The following table summarizes the results. Our auto_arima is the most accurate model (measured by the MASE loss) on all three datasets and the fastest on Daily and Hourly, even compared with the original implementation in R; on Weekly, the R implementation is faster (0.22 vs 0.42).

dataset    metric  nixtla  pmdarima [1]  auto_arima_r  prophet
M4-Daily   MASE      3.26          3.35          4.46    14.26
M4-Daily   time      1.41         27.61          1.81   514.33
M4-Hourly  MASE      0.92           ---          1.02     1.78
M4-Hourly  time     12.92           ---         23.95    17.27
M4-Weekly  MASE      2.34          2.47          2.58     7.29
M4-Weekly  time      0.42          2.92          0.22    19.82

[1] The model auto_arima from pmdarima had problems with Hourly data. An issue was opened in their repo.
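
For readers unfamiliar with the metric, the following is a minimal sketch of how the mean absolute scaled error (MASE) is commonly defined: the out-of-sample mean absolute error divided by the in-sample mean absolute error of a (seasonal) naive forecast. The function name and the example numbers are illustrative assumptions; this is not the evaluation code used for the table above.

```python
def mase(y_train, y_test, y_pred, season_length=1):
    """Mean absolute scaled error relative to an in-sample seasonal naive forecast."""
    if not y_test or len(y_test) != len(y_pred):
        raise ValueError("y_test and y_pred must be non-empty and of equal length")
    if len(y_train) <= season_length:
        raise ValueError("y_train must be longer than season_length")
    # In-sample errors of the naive forecast y[t] ~ y[t - season_length].
    naive_errors = [
        abs(y_train[t] - y_train[t - season_length])
        for t in range(season_length, len(y_train))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("in-sample naive error is zero; MASE is undefined")
    mae = sum(abs(a - p) for a, p in zip(y_test, y_pred)) / len(y_test)
    return mae / scale


train = [1, 3, 2, 4, 3, 5]
print(mase(train, y_test=[4, 6], y_pred=[5, 5]))
print(mase(train, y_test=[4, 6], y_pred=[5, 5], season_length=2))
```

A value below 1 means the forecast beats the naive benchmark on average; a value above 1 means it does worse.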

The following table summarizes the data details.

group   n_series  mean_length  std_length  min_length  max_length
Daily      4,227        2,371       1,756         107       9,933
Hourly       414          901         127         748       1,008
Weekly       359        1,035         707          93       2,610

Computational efficiency

We measured the computational time against the number of time series. The following graph shows the results. As we can see, the fastest model is our auto_arima.

Nixtla vs Prophet

You can reproduce the results here.

External regressors

Results with external regressors are qualitatively similar to those reported above. You can find the complete experiments here.

👾 Less code

pmd to stats

📖 Documentation

Here is a link to the documentation.

🧬 Getting Started

Example Jupyter Notebook

💻 Installation

PyPI

You can install the released version of StatsForecast from the Python package index with:

pip install statsforecast

(Installing inside a Python virtual environment or a conda environment is recommended.)

Conda

You can also install the released version of StatsForecast from conda with:

conda install -c conda-forge statsforecast

(Installing inside a Python virtual environment or a conda environment is recommended.)

Dev Mode

If you want to make some modifications to the code and see the effects in real time (without reinstalling), follow the steps below:

git clone https://github.com/Nixtla/statsforecast.git
cd statsforecast
pip install -e .

🧬 How to use

The following example needs ipython and matplotlib as additional packages. If they are not installed, install them via your preferred method, e.g. pip install ipython matplotlib.

import numpy as np
import pandas as pd
from IPython.display import display, Markdown

import matplotlib.pyplot as plt
from statsforecast import StatsForecast
from statsforecast.models import seasonal_naive, auto_arima
from statsforecast.utils import AirPassengers
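# Hold out the last 12 months as a test set.
horizon = 12
ap_train = AirPassengers[:-horizon]
ap_test = AirPassengers[-horizon:]

# Training frame: one series (unique_id 0) with timestamps in 'ds' and values in 'y'.
series_train = pd.DataFrame(
    {
        'ds': pd.date_range(start='1949-01-01', periods=ap_train.size, freq='M'),
        'y': ap_train
    },
    index=pd.Index([0] * ap_train.size, name='unique_id')
)

# Each model is paired with its season length (12 for monthly data).
fcst = StatsForecast(
    series_train, 
    models=[(auto_arima, 12), (seasonal_naive, 12)], 
    freq='M', 
    n_jobs=1
)

# Forecast 12 steps ahead with 80% and 95% prediction intervals.
forecasts = fcst.forecast(12, level=(80, 95))
forecasts['y_test'] = ap_test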
fig, ax = plt.subplots(1, 1, figsize = (20, 7))
df_plot = pd.concat([series_train, forecasts]).set_index('ds')
df_plot[['y', 'y_test', 'auto_arima_season_length-12_mean', 'seasonal_naive_season_length-12']].plot(ax=ax, linewidth=2)
ax.fill_between(df_plot.index, 
                df_plot['auto_arima_season_length-12_lo-80'], 
                df_plot['auto_arima_season_length-12_hi-80'],
                alpha=.35,
                color='green',
                label='auto_arima_level_80')
ax.fill_between(df_plot.index, 
                df_plot['auto_arima_season_length-12_lo-95'], 
                df_plot['auto_arima_season_length-12_hi-95'],
                alpha=.2,
                color='green',
                label='auto_arima_level_95')
ax.set_title('AirPassengers Forecast', fontsize=22)
ax.set_ylabel('Monthly Passengers', fontsize=20)
ax.set_xlabel('Timestamp [t]', fontsize=20)
ax.legend(prop={'size': 15})
ax.grid()
for label in (ax.get_xticklabels() + ax.get_yticklabels()):
    label.set_fontsize(20)


Adding external regressors

series_train['trend'] = np.arange(1, ap_train.size + 1)
series_train['intercept'] = np.ones(ap_train.size)
series_train['month'] = series_train['ds'].dt.month
series_train = pd.get_dummies(series_train, columns=['month'], drop_first=True)
display(series_train.head())

unique_id  ds                     y  trend  intercept  month_2  month_3  month_4  month_5  month_6  month_7  month_8  month_9  month_10  month_11  month_12
        0  1949-01-31 00:00:00  112      1          1        0        0        0        0        0        0        0        0         0         0         0
        0  1949-02-28 00:00:00  118      2          1        1        0        0        0        0        0        0        0         0         0         0
        0  1949-03-31 00:00:00  132      3          1        0        1        0        0        0        0        0        0         0         0         0
        0  1949-04-30 00:00:00  129      4          1        0        0        1        0        0        0        0        0         0         0         0
        0  1949-05-31 00:00:00  121      5          1        0        0        0        1        0        0        0        0         0         0         0

xreg_test = pd.DataFrame(
    {
        'ds': pd.date_range(start='1960-01-01', periods=ap_test.size, freq='M')
    },
    index=pd.Index([0] * ap_test.size, name='unique_id')
)
xreg_test['trend'] = np.arange(133, ap_test.size + 133)
xreg_test['intercept'] = np.ones(ap_test.size)
xreg_test['month'] = xreg_test['ds'].dt.month
xreg_test = pd.get_dummies(xreg_test, columns=['month'], drop_first=True)
fcst = StatsForecast(
    series_train, 
    models=[(auto_arima, 12), (seasonal_naive, 12)], 
    freq='M', 
    n_jobs=1
)
forecasts = fcst.forecast(12, xreg=xreg_test, level=(80, 95))
forecasts['y_test'] = ap_test
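
One thing to watch with the approach above: `pd.get_dummies` only creates columns for the categories that are present in the data, so a future frame covering fewer than 12 months would end up with fewer `month_*` columns than the training frame. The sketch below shows one way to always emit a fixed set of columns. It is plain Python, the helper name `month_dummies` is hypothetical, and it is not part of StatsForecast.

```python
def month_dummies(months, drop_first=True):
    """One-hot encode month numbers (1-12) using a fixed set of columns."""
    first_level = 2 if drop_first else 1
    levels = range(first_level, 13)
    rows = []
    for month in months:
        if not 1 <= month <= 12:
            raise ValueError(f"month must be in 1..12, got {month}")
        # Every row carries every level, whether or not it occurs in `months`.
        rows.append({f"month_{level}": int(month == level) for level in levels})
    return rows


train_rows = month_dummies([1, 2, 3, 4, 5])
future_rows = month_dummies([1, 2, 3])  # a short 3-month horizon

# Both frames expose the same columns in the same order.
print(list(train_rows[0]) == list(future_rows[0]))
```

With a 12-month horizon, as in the example above, all twelve months appear in the future frame, so the issue does not arise there.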

🔨 How to contribute

See CONTRIBUTING.md.

📃 References

  • The auto_arima model is based on (translated from) the R implementation included in the forecast package developed by Rob Hyndman.

Contributors

Thanks goes to these wonderful people (emoji key):


  • fede 💻
  • José Morales 💻 🚧
  • Sugato Ray 💻
  • Jeff Tackes 🐛
  • darinkist 🤔
  • Alec Helyar 💬
  • Dave Hirschfeld 💬
  • mergenthaler 💻
  • Kin 💻
  • Yasslight90 🤔
  • asinig 🤔
  • Philip Gillißen 💻

This project follows the all-contributors specification. Contributions of any kind welcome!

Comments
  • Adding `settings.ini` to PyPI source


    The setup.py file reads its config values from settings.ini, so the absence of settings.ini from the source distribution (*.tar.gz file) causes the installation of statsforecast to fail.

    • [x] Currently (v0.3.0) the PyPI source does not include the settings.ini file. This PR fixes that.
    • [ ] ~~Changes to README.md file~~:
      • [ ] ~~Fixed some of the formatting errors~~.
      • [ ] ~~Fixed some broken URLs~~.

        The relative URLs do not render properly on PyPI. Converted them from relative to absolute URLs.


    Closes #24

    ready to merge 
    opened by sugatoray 13
  • Comparing ETS with Statsmodels ExponentialSmoothing


    Hi guys,

    I was playing a little with ETS to see whether we could include it in Darts. For the time being I'm having a hard time getting it to outperform statsmodels in terms of runtime (I haven't looked at accuracy). Is there any special trick (or a special regime to be in) for the statsforecast version to run faster?

    Here is the small benchmark I ran:

    import numpy as np
    import time
    
    from statsforecast.models import ETS as SF_ETS
    import statsmodels.tsa.holtwinters as hw
    
    values = np.random.rand(100,)
    
    # First, we make a dry run for jit:
    model_sf = SF_ETS()
    model_sf.fit(values)
    _ = model_sf.predict(10)
    
    model_sm = hw.ExponentialSmoothing(values)
    sm_res = model_sm.fit()
    _ = sm_res.forecast(10)
    
    # Now, a small benchmark:
    tic = time.time()
    for _ in range(100):
        model_sf = SF_ETS()
        model_sf.fit(values)
        _ = model_sf.predict(10)
        
    print("Time taken by SF ETS : {:.2f} s.".format(time.time() - tic))
    
    tic = time.time()
    for _ in range(100):
        model_sm = hw.ExponentialSmoothing(values)
        sm_res = model_sm.fit()
        _ = sm_res.forecast(10)
        
    print("Time taken by SM ETS : {:.2f} s.".format(time.time() - tic))
    

    And I get results like

    Time taken by SF ETS : 1.54 s.
    Time taken by SM ETS : 0.38 s.
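
The warm-up-then-measure pattern used in the benchmark above can be factored into a small helper. This is a minimal sketch using only the standard library; the helper name `time_calls` and the toy workload are assumptions for illustration. It uses `time.perf_counter`, which is intended for measuring short durations, instead of `time.time`.

```python
import time


def time_calls(fn, n_runs=100, n_warmup=1):
    """Return (total_seconds, seconds_per_call) for n_runs calls of fn."""
    for _ in range(n_warmup):
        fn()  # warm-up: absorbs one-off costs such as JIT compilation
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    total = time.perf_counter() - start
    return total, total / n_runs


def toy_workload():
    # Stand-in for "fit a model and predict"; replace with the real call.
    return sum(i * i for i in range(1000))


total, per_call = time_calls(toy_workload, n_runs=50)
print(f"total: {total:.4f} s, per call: {per_call * 1e3:.3f} ms")
```

Passing each library's fit-and-predict step as `fn` keeps the warm-up and the timing loop identical for both sides of the comparison.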
    
    opened by hrzn 12
  • :sparkles: add plotly-resampler as plotting engine


    This PR adds "plotly-resampler" as option for the plotting engine of StatsForecast.plot method, see #342

    Would love to hear your feedback on this!


    Example use of plotly-resampler (in the ElectricityLoadForecasting.ipynb notebook):

    Note that plotly-resampler properties can be passed in the resampler_kwargs argument. The call below illustrates how, for example, the number of shown samples can be changed:

    StatsForecast.plot(df, engine="plotly-resampler", resampler_kwargs={"default_n_shown_samples": 3000, "show_dash": {"mode": "inline"}})
    

    PS: also fixed a minor typo in the CONTRIBUTING.md file

    opened by jvdd 9
  • fix: make logging config local to package


    The core notebook created a logging config when the notebook was run, but also when the generated core.py file was imported. This prevented anyone who imports statsforecast from setting their own logging config in the most commonly used way.

    More details in the linked issue. Fixes Nixtla/statsforecast#275
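
For context, a common convention for libraries is to log through a named logger with only a `NullHandler` attached and to leave all configuration to the application. The sketch below illustrates that convention with the standard `logging` module; the package name `mypackage` is a placeholder, and this is not necessarily how the fix in this PR is implemented.

```python
import logging

# Library side: a named logger with only a NullHandler attached.
# There is no call to logging.basicConfig() here, so importing the
# package leaves the root logger's configuration untouched.
logger = logging.getLogger("mypackage")  # placeholder package name
logger.addHandler(logging.NullHandler())


def do_work():
    logger.info("work done")
    return True


# Application side: the user decides how (and whether) logs are shown.
if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(name)s | %(message)s")
    do_work()
```

Because the library never touches the root logger, a later `logging.basicConfig(...)` call in user code still takes effect.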

    opened by JeroenPeterBos 7
  • "Frequency too high" for anything finer than monthly, but I don't have enough data to sample monthly

    Describe the bug

    I'm trying to follow the "Getting started" example with my own data, which happens to be sampled hourly. So I set season_length = 365 * 24, but got the following error:

    ...
    
    File /opt/conda/lib/python3.9/site-packages/statsforecast/models.py:245, in ETS.forecast(self, y, h, X, X_future, fitted)
        237 def forecast(
        238         self, 
        239         y: np.ndarray, # time series
       (...)
        243         fitted: bool = False, # return fitted values?
        244     ):
    --> 245     mod = ets_f(y, m=self.season_length, model=self.model)
        246     fcst = forecast_ets(mod, h)
        247     keys = ['mean']
    
    File /opt/conda/lib/python3.9/site-packages/statsforecast/ets.py:937, in ets_f(y, m, model, damped, alpha, beta, gamma, phi, additive_only, blambda, biasadj, lower, upper, opt_crit, nmse, bounds, ic, restrict, allow_multiplicative_trend, use_initial_values, maxit)
        935 if m > 24:
        936     if seasontype in ['A', 'M']:
    --> 937         raise ValueError('Frequency too high')
        938     elif seasontype == 'Z':
        939         warnings.warn(
        940             "I can't handle data with frequency greater than 24. " 
        941             "Seasonality will be ignored."
        942         )
    
    ValueError: Frequency too high
    

    If I resample daily and set season_length = 365, I too get a ValueError: Frequency too high. Same goes with a weekly resample and season_length = 52.

    I only have 2 incomplete years of data, so I can't resample monthly: I get various versions of

    ValueError: order must be 3 non-negative integers, got (0, 0, 0)
    

    To Reproduce

    (I can work on a reproducer if it helps; it will take me some time.)

    Expected behavior

    The models work with higher frequencies.

    Desktop (please complete the following information):

    • OS: Ubuntu 22.04
    • Browser: Firefox
    • Version: 1.0.0
    opened by astrojuanlu 6
  • Stability of the API


    Hi, great work on statsforecast! This package looks very nice. I'm considering integrating a couple of the models in Darts (https://github.com/unit8co/darts). I'm wondering about your future plans: do you intend to maintain this package in the long term? How likely are API changes in future releases?

    Also, as a side note, I took a quick look at the Croston method, and it looks like the method accepts h and future_xreg, which I'm not sure is intended, as those are not used.

    In general I think slightly more documentation on the different models could be helpful for users :)

    question 
    opened by hrzn 6
  • Compute residuals


    I'm currently trying to forecast a set of daily time series, and I was wondering whether there is a way to get the predictions on the training data that are used to compute the residuals (the difference between actuals and predictions on the training set). The StatsForecast class currently offers no way to do that. I'm mainly interested in obtaining them for the auto_arima approach, but this could be extended to the remaining approaches as well.

    Is it possible to add a method or attribute to get them?

    Thank you
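
To make the request concrete, here is a minimal sketch of in-sample fitted values and residuals for the simplest seasonal model, seasonal naive, where the fitted value at time t is the observation one season earlier. The function names are hypothetical and this is plain Python, not a StatsForecast API; for ARIMA the fitted values would come from the model's one-step-ahead predictions instead.

```python
def seasonal_naive_fitted(y, season_length):
    """In-sample fitted values; None where no earlier season exists."""
    return [
        None if t < season_length else y[t - season_length]
        for t in range(len(y))
    ]


def residuals(y, fitted):
    """Actual minus fitted, skipping positions without a fitted value."""
    return [None if f is None else a - f for a, f in zip(y, fitted)]


y = [10, 20, 13, 24, 15, 27]
fitted = seasonal_naive_fitted(y, season_length=2)
print(fitted)
print(residuals(y, fitted))
```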

    question arima 
    opened by asinig 6
  • [question] Model summary table for ARIMA model


    Hi! I was wondering if you have implemented (or are planning to implement) a model summary table for the ARIMA model that contains the coefficients, their p-values, etc.?

    Like https://www.statsmodels.org/dev/generated/statsmodels.tsa.arima.model.ARIMAResults.summary.html

    Many thanks!

    enhancement arima 
    opened by darinkist 6
  • [docs] HTML in index notebook messes up online docs


    This is how the docs render locally for me with the latest changes to the readme. I believe this is due to the embedded html, maybe we can try to achieve that formatting without using html.


    bug documentation 
    opened by jmoralez 6
  • ValueError: math domain error


    I get ValueError: math domain error, caused by tmp['bic'] = tmp['aic'] + npar*(math.log(nstar) - 2) in statsforecast/arima.py, line 1225.

    I guess nstar is not > 0
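
The guess is consistent with how `math.log` behaves: it raises `ValueError` for zero or negative input. The sketch below reproduces that and shows one possible guard around the expression quoted above. The helper name `bic_from_aic` and the choice to return NaN are assumptions for illustration, not the library's actual fix.

```python
import math


def bic_from_aic(aic, npar, nstar):
    """BIC from AIC via the quoted expression; NaN when log(nstar) is undefined."""
    if nstar <= 0:
        return math.nan  # assumption: signal "undefined" instead of raising
    return aic + npar * (math.log(nstar) - 2)


try:
    math.log(0)
except ValueError as err:
    print(f"math.log(0) raised ValueError: {err}")

print(bic_from_aic(aic=100.0, npar=3, nstar=50))
print(bic_from_aic(aic=100.0, npar=3, nstar=0))
```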

    opened by dalpozz 6
  • Exception: no model able to be fitted Error on AutoARIMA


    I am trying to solve a time series problem with intermittent zero demand in the time frame (monthly data). I am getting this warning/error:

    /opt/conda/lib/python3.7/site-packages/statsforecast/arima.py:866: RuntimeWarning: divide by zero encountered in log
      return 0.5 * np.log(res)
    /opt/conda/lib/python3.7/site-packages/statsforecast/ets.py:443: RuntimeWarning: divide by zero encountered in double_scalars
      l0 = l0 / b0
    /opt/conda/lib/python3.7/site-packages/statsforecast/ets.py:443: RuntimeWarning: divide by zero encountered in double_scalars
      l0 = l0 / b0
    /opt/conda/lib/python3.7/site-packages/statsforecast/ets.py:448: RuntimeWarning: invalid value encountered in float_scalars
      b0 = max(y_sa[1] / y_sa[0], 1e-3)
    /opt/conda/lib/python3.7/site-packages/statsforecast/ets.py:448: RuntimeWarning: invalid value encountered in float_scalars
      b0 = max(y_sa[1] / y_sa[0], 1e-3)
    /opt/conda/lib/python3.7/site-packages/statsforecast/ets.py:443: RuntimeWarning: divide by zero encountered in double_scalars
      l0 = l0 / b0
    /opt/conda/lib/python3.7/site-packages/statsforecast/ets.py:448: RuntimeWarning: invalid value encountered in float_scalars
      b0 = max(y_sa[1] / y_sa[0], 1e-3)
    /opt/conda/lib/python3.7/site-packages/statsforecast/ets.py:443: RuntimeWarning: divide by zero encountered in double_scalars
      l0 = l0 / b0
    /opt/conda/lib/python3.7/site-packages/statsforecast/ets.py:448: RuntimeWarning: invalid value encountered in float_scalars
      b0 = max(y_sa[1] / y_sa[0], 1e-3)
    /opt/conda/lib/python3.7/site-packages/statsforecast/ets.py:448: RuntimeWarning: invalid value encountered in double_scalars
      b0 = max(y_sa[1] / y_sa[0], 1e-3)

    and throws an error showing

    Exception: no model able to be fitted

    Any thoughts on how this can be resolved? How can I use your package for this?

    Regards Shravan
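
For intermittent demand, the README lists croston among the available models. As background, here is a minimal pure-Python sketch of the classic Croston idea: smooth the non-zero demand sizes and the intervals between them separately, then forecast their ratio. The function name, the default `alpha=0.1`, and the initialization from the first demand are assumptions (conventions vary), and this is not the library's implementation nor a confirmed fix for the exception above.

```python
def croston_classic(y, alpha=0.1):
    """Flat Croston forecast (expected demand per period) for series y."""
    demand = None      # smoothed size of non-zero demands
    interval = None    # smoothed number of periods between demands
    periods_since = 1  # periods elapsed since the last non-zero demand
    for value in y:
        if value > 0:
            if demand is None:
                # Initialize from the first observed demand.
                demand, interval = float(value), float(periods_since)
            else:
                demand += alpha * (value - demand)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1
    if demand is None:
        return 0.0  # no demand was ever observed
    return demand / interval


print(croston_classic([0, 0, 5, 0, 0, 5, 0, 0, 5]))
print(croston_classic([2, 0, 4], alpha=0.5))
```

Zeros never enter a logarithm or a division by an observed value here, which is one reason methods of this family are often preferred over ARIMA or multiplicative ETS for series with many zeros.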

    bug 
    opened by shravankoninti 5
  • [FugueBackend: Dask Distributed] Result is empty when using remote dask cluster


    What happened + What you expected to happen

    I wanted to try distributed computation using dask. I followed the docs https://nixtla.github.io/statsforecast/distributed.fugue.html and it's working fine using dask local client.

    However, when I attempted to use a remote dask cluster, I got an empty result:

    from dask.distributed import Client
    from fugue_dask import DaskExecutionEngine
    from statsforecast import StatsForecast
    from statsforecast.models import Naive
    from statsforecast.utils import generate_series
    from statsforecast.distributed.fugue import FugueBackend
    import pandas as pd
    
    # Instantiate FugueBackend with DaskExecutionEngine
    dask_client = Client('tcp://***:***')
    engine = DaskExecutionEngine(dask_client=dask_client)
    remote_backend = FugueBackend(engine=engine, as_local=True)
    Y_df = pd.read_parquet('https://datasets-nixtla.s3.amazonaws.com/m4-hourly.parquet')
    
    
    from statsforecast.models import (
        AutoARIMA,
        HoltWinters,
        CrostonClassic as Croston, 
        HistoricAverage,
        DynamicOptimizedTheta as DOT,
        SeasonalNaive
    )
    
    
    # Create a list of models and instantiation parameters
    models = [
        AutoARIMA(season_length=24),
        HoltWinters(),
        Croston(),
        SeasonalNaive(season_length=24),
        HistoricAverage(),
        DOT(season_length=24)
    ]
    
    # Instantiate StatsForecast class as sf
    sf = StatsForecast(
        df=Y_df, 
        models=models,
        freq='H', 
        n_jobs=-1,
        fallback_model = SeasonalNaive(season_length=7),
        backend=remote_backend
    )
    
    forecasts_df = sf.forecast(h=48, level=[90])
    print(forecasts_df.size) # returns 0
    

    I didn't get any errors, but the result is empty

    Versions / Dependencies

    Local setup:

    OS: mac
    Python: 3.10
    Python packages:

    • dask==2022.12.1
    • dask-cloudprovider==2022.10.0
    • datasetsforecast==0.0.7
    • distributed==2022.12.1
    • fugue==0.7.3
    • fugue-sql-antlr==0.1.1
    • s3fs==2022.11.0
    • statsforecast==1.4.0
    • pandas==1.5.2

    Dask cluster:

    • scheduler & worker: dask, version 2022.12.1
    • Python: 3.8

    Reproduction script

    • Create a remote dask cluster
    • Use the code above to connect to dask scheduler, and run the sample

    Issue Severity

    High: It blocks me from completing my task.

    bug 
    opened by ibyter 1
  •  [Core] Make StatsForecast.forecast_fitted_values() possible when using a Fugue backend


    Description

    When using StatsForecast.forecast_fitted_values() while the backend parameter in the StatsForecast object is set to my Fugue backend, I get the following error: NotImplementedError: Execution with a distributed backend only supports forecast and cross_validation methods. Try setting the backend parameter to None.

    Use case

    I would like to use StatsForecast.forecast_fitted_values() with my Fugue backend so I can use the fitted values in reconciliation approaches provided by the hierarchicalforecast package, while training the models for the hierarchy quickly using Spark.

    enhancement 
    opened by wregter 0
  • [AutoETS] add support for exogenous variables


    Description

    Currently AutoARIMA supports exogenous variables, which is great because it lets you include external information in your model without getting lost in complexity or giving up the effectiveness of 'simple' statistical techniques.

    It might prove worthwhile to add this feature to AutoETS as well.

    There's this paper in which they implement it: https://www.monash.edu/business/econometrics-and-business-statistics/research/publications/ebs/wp02-15.pdf

    Additionally, in this post Stephen Kolassa suggests a couple of ways of implementing it, among which doing it similarly to AutoARIMA (regression with ARMA errors): https://stats.stackexchange.com/questions/220830/holt-winters-with-exogenous-regressors-in-r

    Hyndman also shares some of his thoughts on the problem (old post though): https://robjhyndman.com/hyndsight/ets-regressors/

    Maybe we can get Osman to share his code with us.

    There may be other people implementing it too, but I haven't come across them yet.

    Use case

    No response

    enhancement 
    opened by Beerstabr 0
  • [Core] Make train_ds accessible to models


    Description

    Models in statsforecast, mlforecast, and neuralforecast currently can only receive train_y but not train_ds.

    Other models such as uber/orbit can process arbitrary sparse training data (e.g. train_ds=[2010, 2011, 2014, 2016]).

    Use case

    I am trying to build an adapter for the orbit API to be integrated into the nixtla model ecosystem (implementing a custom adapter class with fit and predict methods).

    enhancement 
    opened by Elijas 0
  • [FugueBackend] Forecast with Exogenous variables fails using a Spark backend


    What happened + What you expected to happen

    When trying to use the FugueBackend class to distribute my exogenous forecasts, I encounter the following error:

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <command-2016733390302825> in <cell line: 13>()
         11 spark = SparkSession.builder.getOrCreate()
         12 backend = FugueBackend(spark, {"fugue.spark.use_pandas_udf":True})
    ---> 13 forecasts = forecast(
         14     spark.createDataFrame(df),
         15     models=[ETS()],
    
    /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/statsforecast/distributed/utils.py in forecast(df, models, freq, h, X_df, level, parallel)
         20 ):
         21     backend = parallel if parallel is not None else ParallelBackend()
    ---> 22     return backend.forecast(df, models, freq, h=h, X_df=X_df, level=level)
         23 
         24 # %% ../../nbs/distributed.utils.ipynb 6
    
    /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/statsforecast/distributed/fugue.py in forecast(self, df, models, freq, **kwargs)
         81         schema = schema + ",AutoARIMA_lo_99:float, AutoARIMA_hi_99:float"
         82         print(schema)
    ---> 83         return transform(
         84             df,
         85             self._forecast_series,
    
    /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/fugue/interfaceless.py in transform(df, using, schema, params, partition, callback, ignore_errors, engine, engine_conf, force_output_fugue_dataframe, persist, as_local, save_path, checkpoint)
        134     else:
        135         src = dag.df(df)
    --> 136     tdf = src.transform(
        137         using=using,
        138         schema=schema,
    
    /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/fugue/workflow/workflow.py in transform(self, using, schema, params, pre_partition, ignore_errors, callback)
        539         if pre_partition is None:
        540             pre_partition = self.partition_spec
    --> 541         df = self.workflow.transform(
        542             self,
        543             using=using,
    
    /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/fugue/workflow/workflow.py in transform(self, using, schema, params, pre_partition, ignore_errors, callback, *dfs)
       1954         tf._has_rpc_client = not isinstance(callback, EmptyRPCHandler)  # type: ignore
       1955         tf.validate_on_compile()
    -> 1956         return self.process(
       1957             *dfs,
       1958             using=RunTransformer,
    
    /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/fugue/workflow/workflow.py in process(self, using, schema, params, pre_partition, *dfs)
       1615             using = _PROCESSOR_REGISTRY.get(using)
       1616         _dfs = self._to_dfs(*dfs)
    -> 1617         task = Process(
       1618             len(_dfs),
       1619             processor=using,
    
    /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/fugue/workflow/_tasks.py in __init__(self, input_n, processor, schema, params, pre_partition, deterministic, lazy, input_names)
        314     ):
        315         self._processor = _to_processor(processor, schema)
    --> 316         self._processor._params = ParamDict(params)
        317         self._processor._partition_spec = PartitionSpec(pre_partition)
        318         self._processor.validate_on_compile()
    
    /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/triad/collections/dict.py in __init__(self, data, deep)
        175     def __init__(self, data: Any = None, deep: bool = True):
        176         super().__init__()
    --> 177         self.update(data, deep=deep)
        178 
        179     def __setitem__(  # type: ignore
    
    /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/triad/collections/dict.py in update(self, other, on_dup, deep)
        262         for k, v in to_kv_iterable(other):
        263             if on_dup == ParamDict.OVERWRITE or k not in self:
    --> 264                 self[k] = copy.deepcopy(v) if deep else v
        265             elif on_dup == ParamDict.THROW:
        266                 raise KeyError(f"{k} exists in dict")
    
    /usr/lib/python3.9/copy.py in deepcopy(x, memo, _nil)
        144     copier = _deepcopy_dispatch.get(cls)
        145     if copier is not None:
    --> 146         y = copier(x, memo)
        147     else:
        148         if issubclass(cls, type):
    
    /usr/lib/python3.9/copy.py in _deepcopy_dict(x, memo, deepcopy)
        228     memo[id(x)] = y
        229     for key, value in x.items():
    --> 230         y[deepcopy(key, memo)] = deepcopy(value, memo)
        231     return y
        232 d[dict] = _deepcopy_dict
    
    /usr/lib/python3.9/copy.py in deepcopy(x, memo, _nil)
        144     copier = _deepcopy_dispatch.get(cls)
        145     if copier is not None:
    --> 146         y = copier(x, memo)
        147     else:
        148         if issubclass(cls, type):
    
    /usr/lib/python3.9/copy.py in _deepcopy_dict(x, memo, deepcopy)
        228     memo[id(x)] = y
        229     for key, value in x.items():
    --> 230         y[deepcopy(key, memo)] = deepcopy(value, memo)
        231     return y
        232 d[dict] = _deepcopy_dict
    
    /usr/lib/python3.9/copy.py in deepcopy(x, memo, _nil)
        170                     y = x
        171                 else:
    --> 172                     y = _reconstruct(x, memo, *rv)
        173 
        174     # If is its own copy, don't memoize.
    
    /usr/lib/python3.9/copy.py in _reconstruct(x, memo, func, args, state, listiter, dictiter, deepcopy)
        268     if state is not None:
        269         if deep:
    --> 270             state = deepcopy(state, memo)
        271         if hasattr(y, '__setstate__'):
        272             y.__setstate__(state)
    
    /usr/lib/python3.9/copy.py in deepcopy(x, memo, _nil)
        144     copier = _deepcopy_dispatch.get(cls)
        145     if copier is not None:
    --> 146         y = copier(x, memo)
        147     else:
        148         if issubclass(cls, type):
    
    /usr/lib/python3.9/copy.py in _deepcopy_dict(x, memo, deepcopy)
        228     memo[id(x)] = y
        229     for key, value in x.items():
    --> 230         y[deepcopy(key, memo)] = deepcopy(value, memo)
        231     return y
        232 d[dict] = _deepcopy_dict
    
    /usr/lib/python3.9/copy.py in deepcopy(x, memo, _nil)
        170                     y = x
        171                 else:
    --> 172                     y = _reconstruct(x, memo, *rv)
        173 
        174     # If is its own copy, don't memoize.
    
    /usr/lib/python3.9/copy.py in _reconstruct(x, memo, func, args, state, listiter, dictiter, deepcopy)
        268     if state is not None:
        269         if deep:
    --> 270             state = deepcopy(state, memo)
        271         if hasattr(y, '__setstate__'):
        272             y.__setstate__(state)
    
    /usr/lib/python3.9/copy.py in deepcopy(x, memo, _nil)
        144     copier = _deepcopy_dispatch.get(cls)
        145     if copier is not None:
    --> 146         y = copier(x, memo)
        147     else:
        148         if issubclass(cls, type):
    
    /usr/lib/python3.9/copy.py in _deepcopy_dict(x, memo, deepcopy)
        228     memo[id(x)] = y
        229     for key, value in x.items():
    --> 230         y[deepcopy(key, memo)] = deepcopy(value, memo)
        231     return y
        232 d[dict] = _deepcopy_dict
    
    /usr/lib/python3.9/copy.py in deepcopy(x, memo, _nil)
        159                     reductor = getattr(x, "__reduce_ex__", None)
        160                     if reductor is not None:
    --> 161                         rv = reductor(4)
        162                     else:
        163                         reductor = getattr(x, "__reduce__", None)
    
    /databricks/spark/python/pyspark/context.py in __getnewargs__(self)
        493     def __getnewargs__(self) -> NoReturn:
        494         # This method is called when attempting to pickle SparkContext, which is always an error:
    --> 495         raise RuntimeError(
        496             "It appears that you are attempting to reference SparkContext from a broadcast "
        497             "variable, action, or transformation. SparkContext can only be used on the driver, "
    
    RuntimeError: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
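
The traceback shows `copy.deepcopy` walking a parameter dictionary until it reaches an object whose `__getnewargs__` raises. The sketch below reproduces that mechanism in isolation with the standard library only. `DriverOnlyHandle` and `Backend` are hypothetical stand-ins, not Spark or Fugue classes, and the snippet illustrates the failure mode without claiming to be a fix.

```python
import copy


class DriverOnlyHandle:
    """Hypothetical stand-in for an object that refuses to be pickled or copied."""

    def __getnewargs__(self):
        raise RuntimeError("this handle can only be used on the driver")


class Backend:
    """Hypothetical object that keeps a reference to the handle."""

    def __init__(self, handle):
        self.handle = handle


params = {"backend": Backend(DriverOnlyHandle())}

try:
    copy.deepcopy(params)  # recurses into Backend.__dict__ and hits the handle
except RuntimeError as err:
    print(f"deepcopy failed: {err}")

# A shallow copy shares the reference and never touches the handle.
shallow = dict(params)
print(shallow["backend"] is params["backend"])
```

Any object graph that holds such a handle, directly or nested several levels deep, fails in the same way when it is deep-copied.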
    

    Versions / Dependencies

    • statsforecast: 1.0.0
    • fugue: 0.7.3
    • python: 3.9
    • OS: Ubuntu

    Reproduction script

    import pandas as pd
    from pyspark.sql import SparkSession

    from statsforecast.distributed.utils import forecast
    from statsforecast.distributed.fugue import FugueBackend
    from statsforecast.models import ETS
    from statsforecast.core import StatsForecast

    # Training data: a single series with one exogenous column "x".
    df = pd.DataFrame({
        "ds": [1, 2, 3, 4, 5, 6, 7, 8, 9],
        "y": [1, 2, 3, 4, 5, 6, 7, 8, 9],
        "x": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    })
    df["unique_id"] = 1

    # Exogenous values passed alongside the forecast request.
    X_df = pd.DataFrame({"ds": [4], "x": [4], "unique_id": 1})

    spark = SparkSession.builder.getOrCreate()
    backend = FugueBackend(spark, {"fugue.spark.use_pandas_udf": True})

    # This call raises the RuntimeError in the traceback.
    forecasts = forecast(
        spark.createDataFrame(df),
        models=[ETS()],
        X_df=spark.createDataFrame(X_df),
        h=1,
        freq="D",
        parallel=backend)
    

    Issue Severity

    High: It blocks me from completing my task.

    bug 
    opened by jstammers 0
  • AutoARIMA, AutoETS' documentation missing `ic` options (`aic`, `bic`, `aicc`).

    Description

    Hi! I'm new to this library. I cannot find the possible values for each parameter. For example, in https://nixtla.github.io/statsforecast/models.html#autoarima I can see the defaults, but if I wanted to change the `ic` parameter from 'aicc' to some other value, I don't know what that other value would be. Is it possible to expose all the possible values for each parameter? Thanks, Victor

    Link

    No response

    documentation 
    opened by victor-guerrero 1
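For readers unfamiliar with the three criteria named in the issue title (`aic`, `bic`, `aicc`), the sketch below computes each from a model's log-likelihood using the standard textbook definitions. The function `information_criteria` and its argument names are hypothetical and not part of statsforecast. How the library computes these values internally is not shown here.

```python
import math


def information_criteria(loglik, k, n):
    """Return AIC, BIC and AICc for a fitted model.

    loglik: maximized log-likelihood
    k: number of estimated parameters
    n: number of observations (must exceed k + 1 for AICc)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    aic = 2 * k - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    # AICc is AIC plus a small-sample correction term.
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    return {"aic": aic, "bic": bic, "aicc": aicc}


scores = information_criteria(loglik=-100.0, k=3, n=50)
print(scores)
```

For all three criteria a lower value indicates a better trade-off between fit and complexity. BIC penalizes extra parameters more heavily than AIC once `n` is 8 or larger, and AICc converges to AIC as `n` grows.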
Releases(v1.4.0)
  • v1.4.0(Dec 1, 2022)

    What's Changed

    • feat: Added prediction intervals for insample and ETS models in https://github.com/Nixtla/statsforecast/pull/328
    • [FEAT] Add plot anomalies option in https://github.com/Nixtla/statsforecast/pull/341
    • [DOCS] Improve README and docs page index in https://github.com/Nixtla/statsforecast/pull/344

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v1.3.2...v1.4.0

  • v1.3.2(Nov 28, 2022)

    What's Changed

    • [FIX] Improvements to StatsForecast's plot method in https://github.com/Nixtla/statsforecast/pull/312
    • [FEAT] Add plotly as engine to StatsForecast's plot method in https://github.com/Nixtla/statsforecast/pull/313
    • [FEAT] Add autowidth to plotly engine in https://github.com/Nixtla/statsforecast/pull/314
    • feat: add new documentation in https://github.com/Nixtla/statsforecast/pull/317
    • [FIX] ETS for intermittent series in https://github.com/Nixtla/statsforecast/pull/315
    • [FIX] Theta for intermittent series in https://github.com/Nixtla/statsforecast/pull/316
    • [FEAT] Rename ETS to AutoETS in https://github.com/Nixtla/statsforecast/pull/318
    • [FEAT] Change library to newest black formatting in https://github.com/Nixtla/statsforecast/pull/320
    • [FIX] Add new plot method to mstl example in https://github.com/Nixtla/statsforecast/pull/324
    • [FIX] Build docs for Theta model in https://github.com/Nixtla/statsforecast/pull/322
    • [FIX] Isolate elements for all subplots plotly in https://github.com/Nixtla/statsforecast/pull/323
    • Fix/multiple seas docs in https://github.com/Nixtla/statsforecast/pull/325
    • [FEAT] Add mstl experiment in https://github.com/Nixtla/statsforecast/pull/326
    • [FIX] Prevent futurewarning series indexing in https://github.com/Nixtla/statsforecast/pull/327
    • Fix sidebar in https://github.com/Nixtla/statsforecast/pull/331
    • feat: Improved tutorial on Cross-Validation in https://github.com/Nixtla/statsforecast/pull/333
    • Feat/improve prediction intervals in https://github.com/Nixtla/statsforecast/pull/336
    • fix: Improved AutoARIMA plot in https://github.com/Nixtla/statsforecast/pull/334
    • docs: ERCOT electricity demand peak forecasting in https://github.com/Nixtla/statsforecast/pull/335
    • docs: fix peak demand plot in https://github.com/Nixtla/statsforecast/pull/339

    New Contributors

    • @cchallu made their first contribution in https://github.com/Nixtla/statsforecast/pull/335

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v1.3.1...v1.3.2

  • v1.3.1(Nov 17, 2022)

    What's Changed

    • [FEAT] Add plot method to StatsForecast class in https://github.com/Nixtla/statsforecast/pull/305
    • [FEAT] New Issues Templates in https://github.com/Nixtla/statsforecast/pull/307
    • [FIX] make logging config local to package in https://github.com/Nixtla/statsforecast/pull/275
    • [FIX] Error when ds column is object in https://github.com/Nixtla/statsforecast/pull/309

    New Contributors

    • @JeroenPeterBos made their first contribution in https://github.com/Nixtla/statsforecast/pull/275

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v1.3.0...v1.3.1

  • v1.3.0(Nov 15, 2022)

    What's Changed

    • [FIX] Use conda env for ray tests in https://github.com/Nixtla/statsforecast/pull/297
    • [FIX] Source code broken links in https://github.com/Nixtla/statsforecast/pull/293
    • [FIX] Sparse models with zero-valued time series in https://github.com/Nixtla/statsforecast/pull/294
    • [FIX] Add explicit optional argument (PEP-484) in https://github.com/Nixtla/statsforecast/pull/301
    • [FIX] SeasonalNaive in https://github.com/Nixtla/statsforecast/pull/302
    • [FEAT] Add exogenous variables to fugue's backend in https://github.com/Nixtla/statsforecast/pull/300
    • [FEAT] Add Theta methods in https://github.com/Nixtla/statsforecast/pull/299
    • [FEAT] Add MSTL example and comparison in https://github.com/Nixtla/statsforecast/pull/295
    • [FEAT] Add backend argument to StatsForecast class in https://github.com/Nixtla/statsforecast/pull/303

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v1.2.1...v1.3.0

  • v1.2.1(Nov 2, 2022)

    What's Changed

    • [FEAT]: Add fallback model to cross validation in https://github.com/Nixtla/statsforecast/pull/289

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v1.2.0...v1.2.1

  • v1.2.0(Oct 31, 2022)

    What's Changed

    • [FEAT] MSTL model in https://github.com/Nixtla/statsforecast/pull/284

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v1.1.3...v1.2.0

  • v1.1.3(Oct 25, 2022)

    What's Changed

    • [FEAT] Add progress bar for sequential tasks in https://github.com/Nixtla/statsforecast/pull/280

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v1.1.2...v1.1.3

  • v1.1.2(Oct 24, 2022)

    What's Changed

    • [FEAT] Improve navbar docs in https://github.com/Nixtla/statsforecast/pull/262
    • [FEAT] Add ETS to spark results in https://github.com/Nixtla/statsforecast/pull/264
    • [FEAT] Improve CES results in https://github.com/Nixtla/statsforecast/pull/265
    • [FEAT] Add fallback model to distributed backends in https://github.com/Nixtla/statsforecast/pull/277
    • [FIX] Backend docs in https://github.com/Nixtla/statsforecast/pull/278

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v1.1.1...v1.1.2

  • v1.1.1(Oct 5, 2022)

    What's Changed

    • [FEAT] Add Distributed post in https://github.com/Nixtla/statsforecast/pull/257
    • [FEAT] Fallback Model in https://github.com/Nixtla/statsforecast/pull/259

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v1.1.0...v1.1.1

  • v1.1.0(Sep 28, 2022)

    What's Changed

    • [FIX] License in https://github.com/Nixtla/statsforecast/pull/191
    • [FIX] Add hide statement for ets cells in https://github.com/Nixtla/statsforecast/pull/192
    • [FEAT] New experiments neuralprophet in https://github.com/Nixtla/statsforecast/pull/195
    • [FIX] use ubuntu to deploy docs in https://github.com/Nixtla/statsforecast/pull/197
    • [FIX] Broken links in https://github.com/Nixtla/statsforecast/pull/203
    • [FEAT] Add linters and update contributing instructions in https://github.com/Nixtla/statsforecast/pull/205
    • [FIX] nbdev latest changes in https://github.com/Nixtla/statsforecast/pull/208
    • [FIX] python3.7 ci error in https://github.com/Nixtla/statsforecast/pull/214
    • fixing the argument name for external regressors in the example notebook in https://github.com/Nixtla/statsforecast/pull/200
    • [FIX] #210 in https://github.com/Nixtla/statsforecast/pull/213
    • Docstring based documentation in https://github.com/Nixtla/statsforecast/pull/209
    • [FIX] nbdev version until next release in https://github.com/Nixtla/statsforecast/pull/225
    • [FEAT] Prediction intervals for fitted values in https://github.com/Nixtla/statsforecast/pull/228
    • [FEAT] Add anomaly detection example in https://github.com/Nixtla/statsforecast/pull/229
    • [FEAT] Add single anomaly plot in https://github.com/Nixtla/statsforecast/pull/230
    • [FEAT] Add exogenous var use case and install instructions in https://github.com/Nixtla/statsforecast/pull/231
    • [FEAT] M5 scalability comparison in https://github.com/Nixtla/statsforecast/pull/232
    • Intervals for some simple methods in https://github.com/Nixtla/statsforecast/pull/201
    • [FEAT] Add prediction intervals example in https://github.com/Nixtla/statsforecast/pull/239
    • [FEAT] Auto CES model in https://github.com/Nixtla/statsforecast/pull/238
    • [FIX] nbdev releases in https://github.com/Nixtla/statsforecast/pull/251
    • [FEAT] Add CES + ETS ensemble results in https://github.com/Nixtla/statsforecast/pull/252
    • [FIX] nbdev deploy to github pages in https://github.com/Nixtla/statsforecast/pull/253

    New Contributors

    • @jattenberg made their first contribution in https://github.com/Nixtla/statsforecast/pull/200

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v1.0.0...v1.1.0

  • v1.0.0(Aug 15, 2022)

    What's Changed

    • Add FugueBackend in https://github.com/Nixtla/statsforecast/pull/157
    • [FEAT] Add neuralprophet experiment in https://github.com/Nixtla/statsforecast/pull/181
    • [FEAT] nbdev2 integration in https://github.com/Nixtla/statsforecast/pull/186
    • [BREAKING CHANGE] SKLearn syntax in https://github.com/Nixtla/statsforecast/pull/184

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v0.7.1...v1.0.0

  • v0.7.1(Jul 23, 2022)

    What's Changed

    • [FEAT] Fitted df returns in-sample values in https://github.com/Nixtla/statsforecast/pull/158

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v0.7.0...v0.7.1

  • v0.7.0(Jul 21, 2022)

    What's Changed

    • [Fix]: prevent arima RuntimeWarnings in https://github.com/Nixtla/statsforecast/pull/136
    • [BREAKING CHANGE] Fitted Values Computation in https://github.com/Nixtla/statsforecast/pull/137
    • Models now return a dict with the mean and fitted values instead of a numpy array.
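As an illustration of the dict-style return described in this release, here is a toy naive forecaster. The function name `naive_forecast` and the key names `"mean"` and `"fitted"` are assumptions made for this sketch, and plain lists stand in for arrays to keep it dependency-free. It is not the library's implementation.

```python
import math


def naive_forecast(y, h):
    """Toy naive model: every forecast equals the last observed value.

    Returns a dict holding both the h-step forecasts and the in-sample
    one-step-ahead fitted values.
    """
    if not y:
        raise ValueError("y must contain at least one observation")
    # No earlier value exists to predict the first point, so it is NaN.
    fitted = [math.nan] + list(y[:-1])
    mean = [y[-1]] * h
    return {"mean": mean, "fitted": fitted}


result = naive_forecast([1.0, 2.0, 3.0], h=2)
print(result["mean"])
```

Returning a dict lets a caller pick the forecasts or the fitted values by name, rather than slicing a single array by position.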

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v0.6.0...v0.7.0

  • v0.6.0(Jul 19, 2022)

    What's Changed

    • [FEAT] Add ETS model and experiments in https://github.com/Nixtla/statsforecast/pull/142
    • [BREAKING CHANGE] Deprecate python3.6 in https://github.com/Nixtla/statsforecast/pull/146
    • [FEAT] Ray experiment ets in https://github.com/Nixtla/statsforecast/pull/145
    • [FEAT] Readme updates to include ets in https://github.com/Nixtla/statsforecast/pull/148

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v0.5.6...v0.6.0

  • v0.5.6(Jun 27, 2022)

    What's Changed

    • [DOCS] Typo fixes by @ryanrussell in https://github.com/Nixtla/statsforecast/pull/117
    • [FEAT] Add fugue example by @goodwanghan in https://github.com/Nixtla/statsforecast/pull/111
    • [FEAT] add cross validation functionality by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/120
    • [FIX] #121 fitting autoarima on constant time series causes typeerror by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/122
    • [DOCS] add shagn as a contributor for bug by @allcontributors in https://github.com/Nixtla/statsforecast/pull/124
    • [FEAT] add integer ds compatibility for cross validation by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/123
    • [FEAT] Add n_windows argument for cross_validation method by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/131
    • [EXP] Add benchmarks at scale experiment by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/134

    New Contributors

    • @ryanrussell made their first contribution in https://github.com/Nixtla/statsforecast/pull/117
    • @goodwanghan made their first contribution in https://github.com/Nixtla/statsforecast/pull/111

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v0.5.5...v0.5.6

  • v0.5.5(May 9, 2022)

    What's Changed

    • ARIMA level/quantile compatibility, missing nbdev_flow, protected gif by @kdgutier in https://github.com/Nixtla/statsforecast/pull/102
    • Add dependency hint for quick intro by @guerda in https://github.com/Nixtla/statsforecast/pull/106
    • [FEAT] Add AutoARIMA adapter for Prophet by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/114

    New Contributors

    • @kdgutier made their first contribution in https://github.com/Nixtla/statsforecast/pull/102
    • @guerda made their first contribution in https://github.com/Nixtla/statsforecast/pull/106

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v0.5.4...v0.5.5

  • v0.5.4(May 2, 2022)

    What's Changed

    • feat: add issues template by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/93
    • refactor: use Pool instead of ProcessPoolExecutor by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/96
    • Feat: add ray integration by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/98
    • fix: add automatic n_jobs behavior by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/99
    • Creation of forecast dates improvement by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/101
    • Ray experiment by @FedericoGarza in https://github.com/Nixtla/statsforecast/pull/103
    • Update README.md by @mergenthaler in https://github.com/Nixtla/statsforecast/pull/104

    Full Changelog: https://github.com/Nixtla/statsforecast/compare/v0.5.3...v0.5.4

  • v0.5.3(Apr 12, 2022)

    What's Changed

    New features

    • summary method for the AutoARIMA class requested in #31.
    • representational string for the AutoARIMA fitted model, requested in #83.

    Bug Fixes

    • [BUG] croston_sba #88 fixed in #89.
  • v0.5.2(Mar 19, 2022)

  • v0.5.1(Mar 11, 2022)

  • v0.5.0(Mar 7, 2022)

    Notable changes

    • Inclusion of prediction intervals for auto_arima.
    • statsforecast is now installable from conda-forge (conda install -c conda-forge statsforecast, thanks to @sugatoray).
  • v0.4.0(Mar 1, 2022)

    Notable changes

    • Inclusion of exogenous variables for auto_arima.
    • The StatsForecast class now handles exogenous variables.
    • This release allows developers to include more models that use exogenous variables.
    • Bug fixes.
Owner
Nixtla
Open Source Time Series Forecasting