Repository for playing the computer vision apps: People analytics on Raspberry Pi.

Last update: Sep 23, 2021

Overview

play-with-torch

Tools

Tested Hardware

Raspberry Pi 4 Model B, RAM: 4 GB, Processor: 4-core @ 1.5 GHz
microSD Card 64 GB
USB webcam (5M USB Retractable Clip 120 Degrees Wide-angle WebCam, U7 Mini) or Raspberry Pi Camera

Tested Software

Ubuntu Desktop 20.10 aarch64 (64-bit), installed on Raspberry Pi 4
PyTorch: torch 1.6.0 aarch64 and torchvision 0.7.0 aarch64
Python: minimum version 3.6 (3.8 recommended)
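
To compare a board against the tested setup listed above, a small check script can help. This is a minimal sketch and not part of the repository: the function name `check_environment` is hypothetical, and it only looks at the Python version and CPU architecture named in this section.

```python
import platform
import sys

MIN_PYTHON = (3, 6)          # minimum version listed above
RECOMMENDED_PYTHON = (3, 8)  # recommended version listed above
TESTED_MACHINE = "aarch64"   # architecture of the tested Ubuntu image


def check_environment(python_version=None, machine=None):
    """Return a list of notes; an empty list means the setup matches."""
    # Fall back to the running interpreter and host when nothing is passed in.
    version = tuple(python_version or sys.version_info)[:2]
    machine = machine or platform.machine()

    notes = []
    if version < MIN_PYTHON:
        notes.append("Python %d.%d is older than the minimum 3.6" % version)
    elif version < RECOMMENDED_PYTHON:
        notes.append("Python %d.%d works, but 3.8 is recommended" % version)
    if machine != TESTED_MACHINE:
        notes.append("architecture %s differs from the tested aarch64" % machine)
    return notes


if __name__ == "__main__":
    for note in check_environment() or ["environment matches the tested setup"]:
        print(note)
```

Passing explicit values, for example `check_environment((3, 6), "aarch64")`, makes it possible to try the check on a machine other than the Pi.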

Install the prerequisites

Install packages

$ sudo apt install build-essential make cmake git python3-pip libatlas-base-dev
$ sudo apt install libssl-dev
$ sudo apt install libopenblas-dev libblas-dev m4 python3-yaml
$ sudo apt install libomp-dev

Increase swap space to 2048 MB

$ free -h
$ sudo swapoff -a
$ sudo dd if=/dev/zero of=/swapfile bs=1M count=2048
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ free -h
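
The final `free -h` call is a visual check that the new swap is active. The same check can be sketched in Python by reading `/proc/meminfo`. The helper names `swap_total_mb` and `has_enough_swap` are hypothetical and not part of this repository; the 2048 MB target comes from the step above.

```python
import os


def swap_total_mb(meminfo_text):
    """Return SwapTotal from /proc/meminfo-style text in MB, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Lines look like "SwapTotal:  2097148 kB". The kernel reports a
            # little less than the file size because the swap header uses one
            # page, so round to the nearest MB instead of truncating.
            kib = int(line.split()[1])
            return round(kib / 1024)
    return None


def has_enough_swap(meminfo_text, required_mb=2048):
    """True when the reported swap total reaches the required size."""
    total = swap_total_mb(meminfo_text)
    return total is not None and total >= required_mb


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as fh:
        text = fh.read()
    print("swap total (MB):", swap_total_mb(text))
    print("meets 2048 MB target:", has_enough_swap(text))
```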

Install torch 1.6.0

$ pip3 install torch-1.6.0a0+b31f58d-cp38-cp38-linux_aarch64.whl
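
The wheel filename above encodes which interpreter and platform it was built for (`cp38` and `linux_aarch64`), which is why Python 3.8 on aarch64 is the safe choice. The sketch below reads those tags from the filename and compares them with the running interpreter. It is a simplified illustration with hypothetical helper names, it only handles CPython tags, and it is not a substitute for pip's own compatibility checks.

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the python, abi and platform tags.
    head, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    name, version = head.split("-")[:2]
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_interpreter(tags, version=None, machine=None):
    """Rough check: CPython major.minor and CPU architecture must both match."""
    version = tuple(version or sys.version_info)[:2]
    machine = machine or platform.machine()
    expected_python = "cp%d%d" % version
    return tags["python"] == expected_python and tags["platform"].endswith("_" + machine)


if __name__ == "__main__":
    tags = parse_wheel_tags("torch-1.6.0a0+b31f58d-cp38-cp38-linux_aarch64.whl")
    print(tags)
    print("matches this interpreter:", matches_interpreter(tags))
```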

Folder Structure

play-with-torch/
├── config/
│   ├── config.json - holds configuration for training
│   └── parse_config.py - class to handle config file and CLI options
│
├── docker/
│   ├── Dockerfile
│   └── requirements.txt
│
├── data/ - default directory for storing input data
│
├── docs/ - for documentation
│   └── play-with-torch.tex
│
├── models/ - models, losses, and metrics
│   ├── model.py
│   ├── metric.py
│   └── loss.py
│
├── samples/
│
├── saved/
│   ├── checkpoints/
│   ├── traced_model/
│   ├── models/ - trained models are saved here
│   └── logs/ - default logdir for tensorboard and logging output
│
├── site
├── templates/ - for serving model on Flask
│   └── index.html
├── tests/
├── utils/ - small utility functions
│   ├── data/
│   └── ...
│
├── inference.py - main script to run inference
├── README.md
├── trace_model.py - main script to convert model
└── train.py - main script to start training
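
The tree describes `config/config.json` as holding the training configuration and `config/parse_config.py` as handling the config file together with CLI options. One common way to combine the two is to load the JSON first and let any CLI value that was actually given override it. The sketch below illustrates that pattern only; the keys `batch_size` and `lr`, and the function names, are hypothetical and are not taken from the repository's `parse_config.py`.

```python
import argparse
import json


def load_config(text, overrides=None):
    """Parse JSON config text and apply overrides whose value is not None."""
    config = json.loads(text)
    for key, value in (overrides or {}).items():
        if value is not None:
            config[key] = value
    return config


def build_parser():
    """CLI with a config path plus optional overrides (hypothetical options)."""
    parser = argparse.ArgumentParser(description="config override sketch")
    parser.add_argument("--config", default="config/config.json")
    parser.add_argument("--batch_size", type=int, default=None)  # hypothetical key
    parser.add_argument("--lr", type=float, default=None)        # hypothetical key
    return parser


def config_from_args(argv, config_text):
    """Merge CLI overrides from argv into the given JSON config text."""
    args = build_parser().parse_args(argv)
    overrides = {"batch_size": args.batch_size, "lr": args.lr}
    return load_config(config_text, overrides)
```

Leaving the override defaults at `None` keeps the file's values in place unless a flag is passed explicitly.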

Usage

Run inference

$ git clone https://github.com/mheriyanto/play-with-torch.git
$ cd play-with-torch/
$ python3 inference.py video --config config/nanodet-m.yml --model saved/models/nanodet_m.ckpt --path video.mp4

Convert model

$ python3 trace_model.py --cfg_path config/nanodet-m.yml --model_path saved/models/nanodet_m.ckpt --input_shape 320,320
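
For context, converting a PyTorch model to TorchScript is usually done by running it once on a dummy input with `torch.jit.trace`. The sketch below shows that general technique with an `--input_shape` style string such as `320,320`. It is not the repository's `trace_model.py`: the helper names are hypothetical, and the three input channels and the height,width order are assumptions.

```python
def parse_input_shape(text):
    """Turn a string such as "320,320" into a tuple of two positive ints."""
    parts = [int(p) for p in text.split(",")]
    if len(parts) != 2 or min(parts) <= 0:
        raise ValueError("expected two positive integers like '320,320', got %r" % text)
    return tuple(parts)


def trace_to_file(model, input_shape, out_path, channels=3):
    """Trace `model` with a random dummy batch and save the TorchScript module."""
    import torch  # imported lazily so the parsing helper works without PyTorch

    height, width = input_shape  # assumed order: height, width
    model.eval()
    dummy = torch.rand(1, channels, height, width)  # batch of one image
    with torch.no_grad():
        traced = torch.jit.trace(model, dummy)
    traced.save(out_path)
    return traced
```

A call could look like `trace_to_file(model, parse_input_shape("320,320"), "saved/traced_model/model.pt")`, where `model` is an already loaded `torch.nn.Module` and the output filename is only an example.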

Training

$ python3 train.py config/nanodet_custom_xml_dataset.yml

TO DO

Implement unit tests following Test-Driven Development (TDD)

Credit to

Share PyTorch binaries built for Raspberry Pi

Reference

NanoDet: Super fast and lightweight anchor-free object detection model
Yunjey Choi - PyTorch Tutorial for Deep Learning Researchers
Victor Huang - PyTorch Template Project
