ForecastGA is a Python tool to forecast Google Analytics data using several popular time series models.

Overview

ForecastGA

A Python tool to forecast GA data using several popular time series models.

Open In Colab

Logo for ForecastGA

About

Welcome to ForecastGA

ForecastGA is a tool that combines a couple of popular libraries, Atspy and googleanalytics, with a few enhancements.

  • The models are made more intuitive to upgrade and add by having the tool logic separate from the model training and prediction.
  • When calling am.forecast_insample(), any kwargs included (e.g. learning_rate) are passed to the train method of the model.
  • Google Analytics profiles are specified by simply passing the URL (e.g. https://analytics.google.com/analytics/web/?authuser=2#/report-home/aXXXXXwXXXXXpXXXXXX).
  • You can provide a data dict with GA config options or a Pandas Series as the input data.
  • Multiple log levels.
  • Auto GPU detection (via Torch).
  • List all available models, with descriptions, by calling forecastga.print_model_info().
  • Google API info can be passed in the data dict or uploaded as a JSON file named identity.json.
  • Created a companion Google Colab notebook to easily run on GPU.
  • A handy plot function for Colab, forecastga.plot_colab(forecast_in, title="Insample Forecast", dark_mode=True) that formats nicely and also handles Dark Mode!

Models Available

  • ARIMA : Automated ARIMA Modelling
  • Prophet : Modeling Multiple Seasonality With Linear or Non-linear Growth
  • ProphetBC : Prophet Model with Box-Cox transform of the data
  • HWAAS : Exponential Smoothing With Additive Trend and Additive Seasonality
  • HWAMS : Exponential Smoothing with Additive Trend and Multiplicative Seasonality
  • NBEATS : Neural basis expansion analysis (now fixed at 20 Epochs)
  • Gluonts : RNN-based Model (now fixed at 20 Epochs)
  • TATS : Seasonal and Trend no Box Cox
  • TBAT : Trend and Box Cox
  • TBATS1 : Trend, Seasonal (one), and Box Cox
  • TBATP1 : TBATS1 but Seasonal Inference is Hardcoded by Periodicity
  • TBATS2 : TBATS1 With Two Seasonal Periods

How To Use

Find Model Info:

forecastga.print_model_info()

Initialize Model:

Google Analytics:
data = { 'client_id': '',
         'client_secret': '',
         'identity': '',
         'ga_start_date': '2018-01-01',
         'ga_end_date': '2019-12-31',
         'ga_metric': 'sessions',
         'ga_segment': 'organic traffic',
         'ga_url': 'https://analytics.google.com/analytics/web/?authuser=2#/report-home/aXXXXXwXXXXXpXXXXXX',
         'omit_values_over': 2000000
        }

model_list = ["TATS", "TBATS1", "TBATP1", "TBATS2", "ARIMA"]
am = forecastga.AutomatedModel(data , model_list=model_list, forecast_len=30 )
Pandas DataFrame:
# CSV with columns: Date and Sessions
df = pd.read_csv('ga_sessions.csv')
df.Date = pd.to_datetime(df.Date)
df = df.set_index("Date")
data = df.Sessions

model_list = ["TATS", "TBATS1", "TBATP1", "TBATS2", "ARIMA"]
am = forecastga.AutomatedModel(data , model_list=model_list, forecast_len=30 )

Forecast Insample:

forecast_in, performance = am.forecast_insample()

Forecast Outsample:

forecast_out = am.forecast_outsample()

Ensemble Performance:

all_ensemble_in, all_ensemble_out, all_performance = am.ensemble(forecast_in, forecast_out)

Pretty Plot in Google Colab

forecastga.plot_colab(forecast_in, title="Insample Forecast", dark_mode=True)

Installation

Windows users may need to manually install the two items below via conda :

  1. conda install pystan
  2. conda install pytorch -c pytorch
  3. !pip install --upgrade git+https://github.com/jroakes/ForecastGA.git

otherwise, pip install --upgrade forecastga

This repo support GPU training. Below are a few libraries that may have to be manually installed to support.

pip install --upgrade mxnet-cu101
pip install --upgrade torch 1.7.0+cu101

Acknowledgements

  1. Majority of forecasting code taken from https://github.com/firmai/atspy and refactored heavily.
  2. Google Analytics based off of: https://github.com/debrouwere/google-analytics
  3. Thanks to richardfergie for the addition of the Prophet Box-Cox model to control negative predictions.

Contribute

The goal of this repo is to grow the list of available models to test. If you would like to contribute one please read on. Feel free to have fun naming your models.

  1. Fork the repo.
  2. In the /src/forecastga/models folder there is a model called template.py. You can use this as a template for creating your new model. All available variables are there. Forecastga ensures each model has the right data and calls only the train and forecast methods for each model. Feel free to add additional methods that your model requires.
  3. Edit the /src/forecastga/models/__init__.py file to add your model's information. Follow the format of the other entries. Forecastga relies on loc to find the model and class to find the class to use.
  4. Edit requirments.txt with any additional libraries needed to run your model. Keep in mind that this repo should support GPU training if available and some libraries have separate GPU-enabled versions.
  5. Issue a pull request.

If you enjoyed this tool consider buying me some beer at: Paypalme

Owner
JR Oakes
Hacker, SEO, NC State fan, co-organizer of Raleigh and RTP Meetups, as well as @sengineland author. Tweets are my own.
JR Oakes
AptaMat is a simple script which aims to measure differences between DNA or RNA secondary structures.

AptaMAT Purpose AptaMat is a simple script which aims to measure differences between DNA or RNA secondary structures. The method is based on the compa

GEC UTC 3 Nov 03, 2022
Performance analysis of predictive (alpha) stock factors

Alphalens Alphalens is a Python Library for performance analysis of predictive (alpha) stock factors. Alphalens works great with the Zipline open sour

Quantopian, Inc. 2.5k Jan 09, 2023
Data cleaning tools for Business analysis

Datacleaning datacleaning tools for Business analysis This program is made for Vicky's work. You can use it, too. 数据清洗 该数据清洗工具是为了商业分析 这个程序是为了Vicky的工作而

Lin Jian 3 Nov 16, 2021
Import, connect and transform data into Excel

xlwings_query Import, connect and transform data into Excel. Description The concept is to apply data transformations to a main query object. When the

George Karakostas 1 Jan 19, 2022
Important dataframe statistics with a single command

quick_eda Receiving dataframe statistics with one command Project description A python package for Data Scientists, Students, ML Engineers and anyone

Sven Eschlbeck 2 Dec 19, 2021
We're Team Arson and we're using the power of predictive modeling to combat wildfires.

We're Team Arson and we're using the power of predictive modeling to combat wildfires. Arson Map Inspiration There’s been a lot of wildfires in Califo

Jerry Lee 3 Oct 17, 2021
Additional tools for particle accelerator data analysis and machine information

PyLHC Tools This package is a collection of useful scripts and tools for the Optics Measurements and Corrections group (OMC) at CERN. Documentation Au

PyLHC 3 Apr 13, 2022
CINECA molecular dynamics tutorial set

High Performance Molecular Dynamics Logging into CINECA's computer systems To logon to the M100 system use the following command from an SSH client ss

J. W. Dell 0 Mar 13, 2022
Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data

Statistical_Modelling Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data Statistical Methods for Decision Ma

Avnika Mehta 1 Jan 27, 2022
TE-dependent analysis (tedana) is a Python library for denoising multi-echo functional magnetic resonance imaging (fMRI) data

tedana: TE Dependent ANAlysis TE-dependent analysis (tedana) is a Python library for denoising multi-echo functional magnetic resonance imaging (fMRI)

136 Dec 22, 2022
Automatic earthquake catalog building workflow: EQTransformer + Siamese EQTransformer + PickNet + REAL + HypoInverse

Automatic regional-scale earthquake catalog building workflow: EQTransformer + Siamese EQTransforme

Xiao Zhuowei 9 Nov 27, 2022
Find exposed data in Azure with this public blob scanner

BlobHunter A tool for scanning Azure blob storage accounts for publicly opened blobs. BlobHunter is a part of "Hunting Azure Blobs Exposes Millions of

CyberArk 250 Jan 03, 2023
Single-Cell Analysis in Python. Scales to >1M cells.

Scanpy – Single-Cell Analysis in Python Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It inc

Theis Lab 1.4k Jan 05, 2023
Open source platform for Data Science Management automation

Hydrosphere examples This repo contains demo scenarios and pre-trained models to show Hydrosphere capabilities. Data and artifacts management Some mod

hydrosphere.io 6 Aug 10, 2021
BinTuner is a cost-efficient auto-tuning framework, which can deliver a near-optimal binary code that reveals much more differences than -Ox settings.

BinTuner is a cost-efficient auto-tuning framework, which can deliver a near-optimal binary code that reveals much more differences than -Ox settings. it also can assist the binary code analysis rese

BinTuner 42 Dec 16, 2022
Pizza Orders Data Pipeline Usecase Solved by SQL, Sqoop, HDFS, Hive, Airflow.

PizzaOrders_DataPipeline There is a Tony who is owning a New Pizza shop. He knew that pizza alone was not going to help him get seed funding to expand

Melwin Varghese P 4 Jun 05, 2022
The Spark Challenge Student Check-In/Out Tracking Script

The Spark Challenge Student Check-In/Out Tracking Script This Python Script uses the Student ID Database to match the entries with the ID Card Swipe a

1 Dec 09, 2021
Lale is a Python library for semi-automated data science.

Lale is a Python library for semi-automated data science. Lale makes it easy to automatically select algorithms and tune hyperparameters of pipelines that are compatible with scikit-learn, in a type-

International Business Machines 293 Dec 29, 2022
WAL enables programmable waveform analysis.

This repro introcudes the Waveform Analysis Language (WAL). The initial paper on WAL will appear at ASPDAC'22 and can be downloaded here: https://www.

Institute for Complex Systems (ICS), Johannes Kepler University Linz 40 Dec 13, 2022