PyVideoAI: Action Recognition Framework

Overview

This repository contains the official implementation of:

PyVideoAI: Action Recognition Framework

The only framework that completes your computer vision, action recognition research environment.

Key features

  • Supports multi-gpu, multi-node training.
  • SOTA models such as I3D, Non-local, TSN, TRN, TSM, MVFNet, ..., and even ImageNet training!
  • Many datasets such as Kinetics-400, EPIC-Kitchens-55, Something-Something-V1/V2, HMDB-51, UCF-101, Diving48, CATER, ...
  • Supports both video-decoding (straight from .avi/.mp4) and extracted-frame (.jpg/.png) dataloaders, with sparse or dense sampling.
  • Any popular LR scheduling like Cosine Annealing with Warm Restart, Step LR, and Reduce LR on Plateau.
  • Early stopping when training doesn't improve (customise your condition)
  • Easily add custom model, optimiser, scheduler, loss and dataloader!
  • Telegram bot reporting experiment status.
  • TensorBoard reporting stats.
  • Colour logging
  • All of the above come with no extra setup. Trust me and try some examples.
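
To illustrate the sparse-sample and dense-sample options in the list above, here is a minimal sketch of how frame indices are commonly chosen in each mode. It is a generic illustration, not PyVideoAI's own dataloader code; the function names and the deterministic centre-of-segment choice are assumptions made for this example.

```python
def sparse_sample_indices(num_video_frames, num_samples):
    """Split the video into equal segments and take the centre frame of each (TSN-style)."""
    segment_length = num_video_frames / num_samples
    return [int(segment_length * (i + 0.5)) for i in range(num_samples)]


def dense_sample_indices(num_video_frames, num_samples, stride, start=0):
    """Take frames spaced `stride` apart from `start`, clamped to the last frame."""
    last = num_video_frames - 1
    return [min(start + i * stride, last) for i in range(num_samples)]


if __name__ == "__main__":
    # Sparse sampling covers the whole video; dense sampling covers a short clip.
    print(sparse_sample_indices(80, 8))
    print(dense_sample_indices(80, 8, stride=2))
```

Sparse sampling spreads a few frames over the whole video, while dense sampling looks at a short window in more temporal detail.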

Papers implemented

This package is motivated by PySlowFast from Facebook AI. PySlowFast is a cool framework, but it depends heavily on its own config system, which made it difficult to add new models (or other code) or to reuse parts of the framework.
This framework, by Kiyoon, is designed to replace the configuration system with Python files, which makes it easy to add custom models, LR scheduling, dataloaders, etc.
Just modify the function bodies in the config files!

The differences between the two config systems are described in CONFIG_SYSTEM.md.
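
The idea of using Python files as configs can be sketched roughly as follows. This is only an illustration of the general pattern, not PyVideoAI's actual loader: the config contents, the `learning_rate` function and the `load_config` helper are all hypothetical names made up for this example.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical config file contents; the names are illustrative only.
EXAMPLE_CONFIG = '''
base_learning_rate = 0.01

def learning_rate(epoch):
    """Edit this body to change the schedule: here, divide by 10 every 10 epochs."""
    return base_learning_rate * (0.1 ** (epoch // 10))
'''


def load_config(path):
    """Import a Python file as a module so its variables and functions act as the config."""
    spec = importlib.util.spec_from_file_location(Path(path).stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Write the example config to a temporary file and load it like any other module.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "example_config.py"
    config_path.write_text(EXAMPLE_CONFIG)
    cfg = load_config(config_path)
    print(cfg.learning_rate(0), cfg.learning_rate(10))
```

Because the config is ordinary Python, changing behaviour means editing a function body rather than adding a new key to a fixed schema.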

Getting Started

The examples include Jupyter Notebooks for:

  • HMDB-51 data preparation
  • Inference with a pre-trained model from the model zoo, and visualisation of model/dataloader/per-class performance
  • Training I3D using a Kinetics pre-trained model
  • Using an image model and the ImageNet dataset

Structure

All of the executable files are in tools/.
dataset_configs/ directory configures datasets: for example, where the dataset is stored, the number of classes, single-label or multi-label training, and dataset-specific visualisation settings (the confusion matrix has different output sizes).
model_configs/ directory configures model architectures. For example, model definition, input preprocessing mean/std.
exp_configs/ directory configures other training settings like optimiser, scheduling, dataloader, number of frames as input. The config file path has to be in exp_configs/[dataset_name]/[model_name]_[experiment_name].py format.
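
As a small sketch of the naming rule above, the helper below composes an experiment config path from its three parts. The helper itself and the example names are hypothetical; only the path format comes from this README.

```python
from pathlib import Path


def exp_config_path(dataset_name, model_name, experiment_name, root="exp_configs"):
    """Compose exp_configs/[dataset_name]/[model_name]_[experiment_name].py."""
    return Path(root) / dataset_name / f"{model_name}_{experiment_name}.py"


# Hypothetical names, used only to show the resulting layout.
print(exp_config_path("hmdb", "i3d", "baseline").as_posix())
# prints: exp_configs/hmdb/i3d_baseline.py
```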

Usage

Preparing datasets

This package supports many action recognition datasets such as HMDB-51, EPIC-Kitchens-55, Something-Something-V1, CATER, etc.
Refer to DATASET.md.

Training command

CUDA_VISIBLE_DEVICES=0,1,2,3 python tools/run_train.py -D {dataset_config_name} -M {model_config_name} -E {exp_config_name} --local_world_size {num_GPUs} -e {num_epochs}

--local_world_size denotes the number of GPUs per computing node.
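
If you launch many runs from a script, the command above could be assembled along these lines. This is a sketch that uses only the flags shown in this section; the `build_train_command` helper and the config names passed to it are placeholders, not part of PyVideoAI.

```python
import os
import shlex


def build_train_command(dataset, model, experiment, gpu_ids, num_epochs):
    """Return (env, argv) for tools/run_train.py using only the flags shown above."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=",".join(str(g) for g in gpu_ids))
    argv = [
        "python", "tools/run_train.py",
        "-D", dataset,
        "-M", model,
        "-E", experiment,
        "--local_world_size", str(len(gpu_ids)),  # number of GPUs on this node
        "-e", str(num_epochs),
    ]
    return env, argv


# Placeholder config names; substitute ones that exist in your checkout.
env, argv = build_train_command("my_dataset", "my_model", "my_experiment", [0, 1, 2, 3], 50)
print(shlex.join(argv))
# To launch from the repository root: subprocess.run(argv, env=env, check=True)
```

Deriving --local_world_size from the GPU list keeps it in step with CUDA_VISIBLE_DEVICES.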

Telegram Bot

You can preview experiment results using Telegram bots!
Telegram bot stat report example

If your code raises an exception, the bot will report that to you too.
Telegram error report example

You can quickly take a look at example video inputs (as GIF or JPEGs) from the dataloader.
Use tools/visualisations/model_and_dataloader_visualiser.py
Telegram video input report example

The bot is configured with its token and chat ID in an INI section of the following form:

[Telegram0]
token=
chat_id=
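
A section like the one above can be read and sanity-checked with Python's standard configparser. The sketch below is an assumption about how you might validate your own settings, not PyVideoAI's code; the `read_telegram_section` helper and the placeholder values are made up for the example.

```python
import configparser

# Same layout as the section above, filled with obviously fake placeholder values.
EXAMPLE_INI = """
[Telegram0]
token=123456:PLACEHOLDER
chat_id=987654321
"""


def read_telegram_section(ini_text, section="Telegram0"):
    """Return (token, chat_id), raising ValueError if either is missing or empty."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise ValueError(f"missing [{section}] section")
    token = parser.get(section, "token", fallback="").strip()
    chat_id = parser.get(section, "chat_id", fallback="").strip()
    if not token or not chat_id:
        raise ValueError(f"[{section}] needs both token and chat_id")
    return token, chat_id


token, chat_id = read_telegram_section(EXAMPLE_INI)
print(chat_id)
```

Checking for empty values up front catches the common case of leaving the template unfilled.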

Model Zoo and Baselines

Refer to MODEL_ZOO.md.

Installation

Refer to INSTALL.md.

TL;DR,

conda create -n videoai python=3.8
conda activate videoai
conda install pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=10.2 -c pytorch
# For RTX 30xx GPUs:
# conda install pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=11.1 -c pytorch -c nvidia

git clone --recurse-submodules https://github.com/kiyoon/PyVideoAI.git
cd PyVideoAI
git checkout v0.3
git submodule update --recursive
cd submodules/video_datasets_api
pip install -e .
cd ../experiment_utils
pip install -e .
cd ../..
pip install -e .

Experiment outputs

The experiment results (log, training stats, weights, tensorboard, plots, etc.) are saved to data/experiments by default. This can be huge, so it is recommended to make data/experiments a symbolic link to the directory where you actually want to store the results.

Otherwise, you can change the DEFAULT_EXPERIMENT_ROOT value in pyvideoai/config.py, or set the --experiment_root argument manually when executing.
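
The symbolic link mentioned above could be created with a few lines of Python (a plain `ln -s` works just as well). The `link_experiment_root` helper and both paths in the demonstration are hypothetical; in practice the link would be data/experiments inside the repository and the target a directory on a large disk.

```python
import tempfile
from pathlib import Path


def link_experiment_root(link_path, target_dir):
    """Make `link_path` a symbolic link to `target_dir`, creating the target if needed."""
    link_path, target_dir = Path(link_path), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    if link_path.is_symlink() or link_path.exists():
        raise FileExistsError(f"{link_path} already exists; move it away first")
    link_path.parent.mkdir(parents=True, exist_ok=True)
    link_path.symlink_to(target_dir.resolve(), target_is_directory=True)
    return link_path


# Demonstrated inside a temporary directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    link = link_experiment_root(Path(tmp) / "repo" / "data" / "experiments",
                                Path(tmp) / "big_disk" / "experiments")
    print(link.is_symlink())
```

The helper refuses to overwrite an existing data/experiments, so results already written there are not lost by accident. On Windows, creating symbolic links may require extra privileges.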

Owner
Kiyoon Kim
Computer scientist with computer vision, machine learning and signal processing background.