Repository for playing the computer vision apps: People analytics on Raspberry Pi.

Overview

Hits contributions welcome GitHub contributors GitHub last commit GitHub top language GitHub language count GitHub repo size GitHub code size in bytes LinkedIn

play-with-torch

Repository for playing the computer vision apps: People analytics on Raspberry Pi.

Tools

Tested Hardware

  • RasberryPi 4 Model B here, RAM: 4 GB and Processor 4-core @ 1.5 GHz
  • microSD Card 64 GB
  • 5M USB Retractable Clip 120 Degrees WebCam Web Wide-angle Camera Laptop U7 Mini or Raspi Camera

Tested Software

  • Ubuntu Desktop 20.10 aarch64 64 bit, install on RasberriPi 4
  • PyTorch: torch 1.6.0 aarch64 and torchvision 0.7.0 aarch64
  • Python min. ver. 3.6 (3.8 recommended)

Install the prerequisites

  • Install packages
$ sudo apt install build-essential make cmake git python3-pip libatlas-base-dev
$ sudo apt install libssl-dev
$ sudo apt install libopenblas-dev libblas-dev m4 python3-yaml
$ sudo apt install libomp-dev
  • make swap space to 2048 MB
$ free -h
$ sudo swapoff -a
$ sudo dd if=/dev/zero of=/swapfile bs=1M count=2048
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ free -h
  • Install torch 1.6.0
$ pip3 install torch-1.6.0a0+b31f58d-cp38-cp38-linux_aarch64.whl

Folder Structure

play-with-torch/
├── config/
│    ├── config.json - holds configuration for training
│    └── parse_config.py - class to handle config file and cli options
│
├── docker/
│   ├── Dockerfile
│   └── requirements.txt
│
├── data/ - default directory for storing input data
│
├── docs/ - for documentation
│   └── play-with-torch.tex
│
├── models/ - models, losses, and metrics
│   ├── model.py
│   ├── metric.py
│   └── loss.py
│
├── samples/
│
├── saved/
│   ├── checkpoints/
│   ├── traced_model/
│   ├── models/ - trained models are saved here
│   └── logs/ - default logdir for tensorboard and logging output
│
├── site
├── templates/ - for serving model on Flask
│   └── index.html
├── tests/
├── utils/ - small utility functions
│   ├── data/
│   └── ...
│
├── inference.py - main script to inference model
├── README.md
├── trace_model.py - main script to convert model
└── train.py - main script to start training  

Usage

Run inference

$ git clone https://github.com/mheriyanto/play-with-torch.git
$ cd play-with-torch/
$ python3 inference.py video --config config/nanodet-m.yml --model saved/models/nanodet_m.ckpt --path video.mp4

Convert model

$ python3 trace_model.py --cfg_path config/nanodet-m.yml --model_path saved/models/nanodet_m.ckpt --input_shape 320,320

Training

$ python3 train.py config/nanodet_custom_xml_dataset.yml

TO DO

  • Implement Unit-Test: Test-Driven Development (TDD)

Credit to

Reference

  • NanoDet: Super fast and lightweight anchor-free object detection model. here
  • Yunjey Choi - PyTorch Tutorial for Deep Learning Researchers here
  • Victor Huang - PyTorch Template Project (here)
Owner
eMHa
Sharing is caring. "What I can't create I don't understand".
eMHa
Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"

Dataset and Code for RealVSR Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme Xi Yang, Wangmeng Xiang,

Xi Yang 91 Nov 22, 2022
Ocular is a state-of-the-art historical OCR system.

Ocular Ocular is a state-of-the-art historical OCR system. Its primary features are: Unsupervised learning of unknown fonts: requires only document im

228 Dec 30, 2022
[BMVC'21] Official PyTorch Implementation of Grounded Situation Recognition with Transformers

Grounded Situation Recognition with Transformers Paper | Model Checkpoint This is the official PyTorch implementation of Grounded Situation Recognitio

Junhyeong Cho 18 Jul 19, 2022
GDB python tool to pretty print and debug c++ xtensor containers

gdb_xt2np GDB python tool to pretty print, examine, and debug c++ Xtensor containers. Xtensor is a c++ library for scientific computing using multidim

Christopher Burke 4 Oct 29, 2021
Library used to deskew a scanned document

Deskew //Note: Skew is measured in degrees. Deskewing is a process whereby skew is removed by rotating an image by the same amount as its skew but in

Stéphane Brunner 273 Jan 06, 2023
Resizing Canny Countour In Python

Resizing_Canny_Countour Install Visual Studio Code , https://code.visualstudio.com/download Select Python and install with terminal( pip install openc

Walter Ng 1 Nov 07, 2021
【Auto】原神⭐钓鱼辅助工具 | 自动收竿、校准游标 | ✨您只需要抛出鱼竿,我们会帮你完成一切✨

原神钓鱼辅助工具 ✨ 作者正在努力重构代码中……会尽快带给大家一个更完美的脚本 ✨ 「您只需抛出鱼竿,然后我们会帮您搞定一切」 如果你觉得这个脚本好用,请点一个 Star ⭐ ,你的 Star 就是作者更新最大的动力 点击这里 查看演示视频 ✨ 欢迎大家在 Issues 中分享自己的配置文件 ✨ ✨

261 Jan 02, 2023
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.

Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.

Microsoft 235 Dec 22, 2022
This is a GUI for scrapping PDFs with the help of optical character recognition making easier than ever to scrape PDFs.

pdf-scraper-with-ocr With this tool I am aiming to facilitate the work of those who need to scrape PDFs either by hand or using tools that doesn't imp

Jacobo José Guijarro Villalba 75 Oct 21, 2022
A bot that extract text from images using the Tesseract OCR.

Text from image (OCR) @ocr_text_bot A simple bot to extract text from images. Usage What do I need? A AWS key configured locally, see here. NodeJS. I

Weverton Marques 4 Aug 06, 2021
A tool to enhance your old/damaged pictures built using python & opencv.

Breathe Life into your Old Pictures Table of Contents About The Project Getting Started Prerequisites Usage Contact Acknowledgments About The Project

Shah Anwaar Khalid 5 Dec 16, 2021
Text language identification using Wikipedia data

Text language identification using Wikipedia data The aim of this project is to provide high-quality language detection over all the web's languages.

Vsevolod Dyomkin 28 Jul 09, 2022
A python scripts that uses 3 different feature extraction methods such as SIFT, SURF and ORB to find a book in a video clip and project trailer of a movie based on that book, on to it.

A python scripts that uses 3 different feature extraction methods such as SIFT, SURF and ORB to find a book in a video clip and project trailer of a movie based on that book, on to it.

tooraj taraz 3 Feb 10, 2022
OCR-D-compliant page segmentation

ocrd_segment This repository aims to provide a number of OCR-D-compliant processors for layout analysis and evaluation. Installation In your virtual e

OCR-D 59 Sep 10, 2022
Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)

DewarpNet This repository contains the codes for DewarpNet training. Recent Updates [May, 2020] Added evaluation images and an important note about Ma

<a href=[email protected]"> 354 Jan 01, 2023
A Vietnamese personal card OCR website built with Django.

Django VietCardOCR Installation Creation of virtual environments is done by executing the command venv: python -m venv venv That will create a new fol

Truong Hoang Thuan 4 Sep 04, 2021
Convolutional Recurrent Neural Networks(CRNN) for Scene Text Recognition

CRNN_Tensorflow This is a TensorFlow implementation of a Deep Neural Network for scene text recognition. It is mainly based on the paper "An End-to-En

MaybeShewill-CV 1000 Dec 27, 2022
The first open-source library that detects the font of a text in a image.

Typefont Typefont is an experimental library that detects the font of a text in a image. Usage Import the main function and invoke it like in the foll

Vasile Pește 1.6k Feb 24, 2022
Detecting Text in Natural Image with Connectionist Text Proposal Network (ECCV'16)

Detecting Text in Natural Image with Connectionist Text Proposal Network The codes are used for implementing CTPN for scene text detection, described

Tian Zhi 1.3k Dec 22, 2022