Material for my PyConDE & PyData Berlin 2022 Talk "5 Steps to Speed Up Your Data-Analysis on a Single Core"

Last update: Dec 12, 2022

Related tags

Deep Learning data-analysis-speedup

Overview

5 Steps to Speed Up Your Data-Analysis on a Single Core

Material for my talk at PyConDE & PyData Berlin 2022

Description

Your data analysis pipeline works. Nice.
Could it be faster? Probably.
Do you need to parallelize? Not yet.

We'll go through optimization steps that boost the performance of your data analysis pipeline on a single core, reducing time & costs. This walkthrough shows tools and strategies to identify and mitigate bottlenecks, and demonstrates them on an example. The 5 steps cover:

Identifying bottlenecks: Profiling
Efficient IO
Vectorization
Memory & Precision Tradeoffs
JIT-compiling with numba
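
As an illustration of the first step, here is a minimal profiling sketch using only the standard library's cProfile and pstats modules. It is not taken from the talk material: the functions `slow_sum_of_squares`, `pipeline` and `profile_call` are hypothetical stand-ins for a real analysis pipeline, chosen so that one stage clearly dominates the runtime.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately slow pure-Python loop (hypothetical pipeline stage)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def pipeline(n):
    """Hypothetical two-stage pipeline: one expensive stage, one cheap one."""
    expensive = slow_sum_of_squares(n)
    cheap = sum(range(10))
    return expensive + cheap


def profile_call(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the dominant call chain is listed first.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_call(pipeline, 200_000)
print(report)
```

In the printed report, the row for `slow_sum_of_squares` should account for nearly all of the cumulative time, which marks it as the stage worth optimizing before touching anything else.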

This talk is suited for data scientists at a beginner or intermediate level, typically working with a numpy/scipy/… stack or similar. It gives strategies & concrete suggestions on how to speed up an existing analysis pipeline, demonstrated on a practical example that shows the speed improvement gained with each step.
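
To give a flavour of how the gain from a single step can be measured, the following sketch compares a Python loop with a vectorized numpy version of the same computation, and shows the memory and precision effect of switching from float64 to float32. It assumes numpy is installed; the root-mean-square computation and the helper names `rms_loop`, `rms_vectorized` and `best_time` are illustrative choices, not the example used in the talk.

```python
import time

import numpy as np


def rms_loop(values):
    """Root mean square with an explicit Python loop (baseline)."""
    total = 0.0
    for v in values:
        total += v * v
    return (total / len(values)) ** 0.5


def rms_vectorized(values):
    """Same computation expressed as whole-array numpy operations."""
    return float(np.sqrt(np.mean(values * values)))


def best_time(func, *args, repeats=3):
    """Smallest wall-clock time over several runs, in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)


rng = np.random.default_rng(seed=0)
data64 = rng.random(200_000)        # float64 by default
data32 = data64.astype(np.float32)  # half the memory, fewer significant digits

print(f"loop:       {best_time(rms_loop, data64):.4f} s")
print(f"vectorized: {best_time(rms_vectorized, data64):.4f} s")
print(f"float64: {data64.nbytes} bytes, float32: {data32.nbytes} bytes")
precision_gap = abs(rms_vectorized(data64) - rms_vectorized(data32))
print(f"difference due to precision: {precision_gap:.2e}")
```

The actual timings depend on the machine, so none are quoted here. Whether the float32 deviation is acceptable depends on the analysis at hand.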

Installation & Usage

python3 -m pip install poetry
poetry install
poetry run python -m jupyterlab

Dev

./format.sh

Owner

Jonathan Striebel
