A small timeseries transformation API built on Flask and Pandas

Last update: Mar 25, 2022

Related tags

Data Visualization mcflyin

Overview

#Mcflyin

###A timeseries transformation API built on Pandas and Flask

This is a small demo of an API to do timeseries transformations built on Flask and Pandas.

Concept

The idea is that you can make a POST request to the API with a simple list/array of timestamps, from any language, and get back some interesting transformations of that data.

Why?

Partly to show how straightforward it is to build such a thing. Python is great because it has very powerful, intuitive, quick-to-learn tools for both building web applications and doing data analysis/statistics.

That puts Python in kind of a unique position: powerful web tools, powerful scientific/numerical/statistical data tools. This API is a very simple example of how you can take advantage of both. Go read the source code- it's short and easy to grok. Bug fixes and pull requests welcome.

Getting Started

First we need to find some data. We're going to use some data that Wes McKinney provided in a recent blog post, with some statistics on Python posts on Stack Overflow. This is something of a contrived example: I'm manipulating the data in Python, sending to a Python backend, and then getting a response to manipulate in Python. Just know that all you need is an array of timestamp strings, no matter your language.

import pandas as pd

data = pd.read_csv('AllPandas.csv')
data = data['CreationDate'].tolist()

A simple array of timestamps:

>>>data[:10]
['2011-04-01 14:50:44',
 '2012-01-18 19:41:27',
 '2012-01-23 03:21:00',
 '2012-01-24 17:59:53',
 '2012-03-04 16:58:45',
 '2012-03-09 22:36:52',
 '2012-03-10 15:35:26',
 '2012-03-18 12:53:06',
 '2012-03-30 13:58:29',
 '2012-04-04 23:17:23']

With the McFlyin application running on localhost, lets make a request to resample the data on an daily basis, to get the number of posts per day:

import requests
import json

freq = {'D': 'Daily'}
sends = {'freq': json.dumps(freq), 'data': json.dumps(data)}
r = requests.post('http://127.0.0.1:5000/resample', data=sends)
response = r.json

The response is simple JSON:

{'Monthly': {'data': [1.0, 2.0, 1.0, 1.0,...
             'time': ['2011-03-31T00:00:00', '2011-04-30T00:00:00', '2011-05-31T00:00:00', '2011-06-30T00:00:00', '2011-07-31T00:00:00',...

Here's the distribution of daily questions on Stack Overflow for Pandas (monthly probably would have been a little more informative):

Let's call Mcflyin for a rolling sum on a seven-day window. It will resample to the given freq, then apply the window to the result:

freq = {'D': 'Weekly Rolling'}
sends = {'freq': json.dumps(freq), 'data': json.dumps(data), 'window': 7}
r = requests.post('http://127.0.0.1:5000/rolling_sum', data=sends)
response = r.json

Let's look at the total questions asked by day:

sends = {'data': json.dumps(data), 'how': json.dumps('sum')}
r = requests.post('http://127.0.0.1:5000/daily', data=sends)
response = r.json

and daily means:

sends = {'data': json.dumps(data), 'how': json.dumps('mean')}
r = requests.post('http://127.0.0.1:5000/daily', data=sends)
response = r.json

The same for hourly:

sends = {'data': json.dumps(data), 'how': json.dumps('sum')}
r = requests.post('http://127.0.0.1:5000/hourly', data=sends)
response = r.json

Finally, we can look at hourly by day-of-week:

sends = {'data': json.dumps(data), 'how': json.dumps('sum')}
r = requests.post('http://127.0.0.1:5000/daily_hours', data=sends)
response = r.json

Live demo here

Dependencies

Pandas, Numpy, Requests, Flask

How did you make those colorful graphs?

Vincent and Bearcart

Status

Lots of stuff that could be better- error handling on the requests, probably better handling of weird timestamps, etc. This is just a small demo of how powerful Python can be for building a statistics backend with relatively few lines of code.

If I want to write a front-end in a different language, can I put it in the examples folder?

Yes! PR's welcome.

A small timeseries transformation API built on Flask and Pandas

Related tags

Overview

Concept

Why?

Getting Started

Dependencies

How did you make those colorful graphs?

Status

If I want to write a front-end in a different language, can I put it in the examples folder?

Owner

Rob Story

Plotting data from the landroid and a raspberry pi zero to a influx-db

This is a learning tool and exploration app made using the Dash interactive Python framework developed by Plotly

Visualization of the World Religion Data dataset by Correlates of War Project.

Collection of scripts for making high quality beautiful math-related posters.

Here are my graphs for hw_02

eoplatform is a Python package that aims to simplify Remote Sensing Earth Observation by providing actionable information on a wide swath of RS platforms and provide a simple API for downloading and visualizing RS imagery

Tidy data structures, summaries, and visualisations for missing data

A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations

Generate SVG (dark/light) images visualizing (private/public) GitHub repo statistics for profile/website.

A simple python script using Numpy and Matplotlib library to plot a Mohr's Circle when given a two-dimensional state of stress.

Custom ROI in Computer Vision Applications

A Python Binder that merge 2 files with any extension by creating a new python file and compiling it to exe which runs both payloads.

High performance, editable, stylable datagrids in jupyter and jupyterlab

Jupyter Notebook extension leveraging pandas DataFrames by integrating DataTables and ChartJS.

Python package for the analysis and visualisation of finite-difference fields.

Data Visualizer for Super Mario Kart (SNES)

Create 3d loss surface visualizations, with optimizer path. Issues welcome!

阴阳师后台全平台（使用网易 MuMu 模拟器）辅助。支持御魂，觉醒，御灵，结界突破，秘闻副本，地域鬼王。

Create SVG drawings from vector geodata files (SHP, geojson, etc).

Python scripts for plotting audiograms and related data from Interacoustics Equinox audiometer and Otoaccess software.