This tool uses Deep Learning to help you draw and write with your hand and webcam.

Last update: Dec 10, 2022

Overview

air-drawing 👆

This tool uses Deep Learning to help you draw and write in the air with your hand and a webcam. A Deep Learning model predicts, at each moment, whether you intend the pencil to be up ('pencil up') or down ('pencil down').

Try it online: loicmagne.github.io/air-drawing

Technical Details

This pipeline is made up of two steps: detecting the hand, and predicting the drawing. Both steps are done using Deep Learning.
Hand pose detection is performed with the MediaPipe toolbox.
The drawing prediction part uses only the finger position, not the image. The input is a sequence of 2D points (in practice I use the speed and acceleration of the finger instead of its position, to make the prediction translation-invariant), and the output is a binary classification: 'pencil up' or 'pencil down'. I used a simple bidirectional LSTM architecture. I made a small dataset myself (~50 samples), which I annotated using the tools provided in python-stuff/data-wrangling/. At first I wanted to make the 'pencil up'/'pencil down' prediction in real time, i.e. while the user is drawing. However, this task was too difficult and gave poor results, which is why I now use a bidirectional LSTM: it reads the sequence in both directions, so predictions are made once the whole sequence has been recorded. You can find details of the deep learning pipeline in the Jupyter notebook in python-stuff/deep-learning/.
The application is entirely client-side. I deployed the deep learning model by converting the PyTorch model to .onnx and then running it with ONNX Runtime, which is very convenient and supports a wide range of layer types.
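The speed and acceleration features mentioned above can be illustrated with plain finite differences. This is only a minimal sketch, not the code from the notebook: the function names (`finite_difference`, `motion_features`), the per-frame units, and the way frames are aligned are all assumptions made for illustration. It shows why such features do not change when the whole trajectory is shifted in the image.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def finite_difference(seq: List[Point]) -> List[Point]:
    """Difference between consecutive 2D samples (one element shorter than seq)."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(seq, seq[1:])]


def motion_features(positions: List[Point]) -> List[Tuple[float, float, float, float]]:
    """Hypothetical helper: per-frame (vx, vy, ax, ay) from fingertip positions.

    Speed and acceleration are expressed per frame (no division by a time step),
    which is an assumption of this sketch.
    """
    speed = finite_difference(positions)
    accel = finite_difference(speed)
    # accel[i] is computed from speed[i] and speed[i + 1], so pair it with speed[i + 1].
    return [(vx, vy, ax, ay) for (vx, vy), (ax, ay) in zip(speed[1:], accel)]


# A short made-up fingertip track and the same track shifted in the image.
track = [(0.0, 0.0), (1.0, 0.0), (3.0, 1.0), (6.0, 3.0)]
shifted = [(x + 10.0, y - 5.0) for x, y in track]

# Differences cancel the constant offset, so both tracks give the same features.
print(motion_features(track) == motion_features(shifted))
```

Because only differences between frames are kept, the model never sees where on the screen the finger is, only how it moves.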

Going Forward

Overall the pipeline still struggles and needs some improvement. Ideas for improvement include:

Building a bigger dataset, with more diverse user data.
Processing and smoothing the finger signal, to depend less on camera quality and to improve model generalization.
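As one possible reading of the smoothing idea above, the fingertip track could be passed through an exponential moving average before features are computed. This is a sketch under assumptions, not something the project currently does: the function name `ema_smooth` and the `alpha` parameter are hypothetical, and another filter may well suit the data better.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def ema_smooth(points: List[Point], alpha: float = 0.5) -> List[Point]:
    """Hypothetical helper: exponential moving average over a 2D track.

    alpha close to 1 follows the raw signal; alpha close to 0 smooths heavily.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[Point] = []
    for x, y in points:
        if not smoothed:
            # The first sample has no history, so it is kept unchanged.
            smoothed.append((x, y))
        else:
            px, py = smoothed[-1]
            smoothed.append((alpha * x + (1 - alpha) * px,
                             alpha * y + (1 - alpha) * py))
    return smoothed


# A made-up jittery track: x jumps between 0 and 4 from frame to frame.
noisy = [(0.0, 0.0), (4.0, 0.0), (0.0, 0.0), (4.0, 0.0)]
smooth = ema_smooth(noisy, alpha=0.5)
print(smooth)
```

The trade-off is latency: stronger smoothing hides camera jitter but also blurs fast, intentional strokes, so `alpha` would need to be tuned on real recordings.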

Owner

lmagne
