Action Recognition for Self-Driving Cars

This repo contains the code for the 2021 Fall semester project "Action Recognition for Self-Driving Cars" at EPFL VITA lab. For experiment results, please refer to the project report and presentation slides in docs. A demo video is available here.

This project utilizes a simple yet effective architecture (called poseact) to classify multiple actions.

The model has been tested on three datasets, TCG, TITAN and CASR.

Preparation and Installation

This project depends mainly on PyTorch. If you wish to start from extracting poses from images, you will also need OpenPifPaf (along with the posetrack plugin); please also refer to this section for the following steps. If you wish to skip extracting your own poses and start directly from the poses used in this repo, you can download this folder. It contains the poses extracted from the TITAN and CASR datasets as well as a trained model for the TITAN dataset. For the poses in the TCG dataset, please refer to the official repo.

Clone this repo and install it in editable mode with the commands below. If you have downloaded the folder above, put its contents in poseact/out/ afterwards.

git clone https://github.com/vita-epfl/pose-action-recognition.git
cd pose-action-recognition
python -m pip install -e .

Project Structure and Usage

poseact
	|___ data # create this folder to store your datasets, or create a symlink 
	|___ models 
	|___ test # debug tests, may also be helpful for basic usage
	|___ tools # preprocessing and analyzing tools, usage stated in the scripts 
	|___ utils # utility functions, such as datasets, losses and metrics 
	|___ xxxx_train.py # training scripts for TCG, TITAN and CASR
	|___ python_wrapper.sh # script for submitting jobs to EPFL IZAR cluster, same for debug.sh
	|___ predictor.py  # a visualization tool with the model trained on TITAN dataset 

It's advised to cd poseact and conda activate pytorch before running the experiments.

To submit jobs to the EPFL IZAR cluster (or similar clusters managed by Slurm), you can use the script python_wrapper.sh. Just think of it as "the python on the cluster". To submit to the debug node of IZAR, use debug.sh instead.

Here is an example of training a model on the TITAN dataset. --imbalance focal selects the focal loss, and --gamma 0 sets the gamma value of the focal loss to 0 (because I find that 0 works better). --merge_cls selects a suitable set of actions from the original action hierarchy, and --relative_kp uses relative coordinates of the keypoints; see the presentation slides for the intuition. You can specify a name for the task with --task_name, which will be used to name the saved model if you pass --save_model.

sbatch python_wrapper.sh titan_train.py --imbalance focal --gamma 0 --merge_cls --relative_kp --task_name Relative_KP --save_model
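To make the role of --gamma more concrete, here is a minimal sketch of the standard focal loss for a single sample, written in plain Python. It is an illustration only: the function name and the probabilities are hypothetical, and the loss implemented in this repo may differ (for example in class weighting). With gamma set to 0 the modulating factor becomes 1, so the loss reduces to ordinary cross-entropy.

```python
import math


def focal_loss(probs, target, gamma=0.0):
    """Standard focal loss for one sample: -(1 - p_t)**gamma * log(p_t).

    probs  : predicted class probabilities (assumed to sum to 1)
    target : index of the true class
    gamma  : focusing parameter; larger values down-weight easy examples
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


probs = [0.7, 0.2, 0.1]  # hypothetical softmax output over three classes

cross_entropy = -math.log(probs[0])
print("gamma=0:", focal_loss(probs, 0, gamma=0.0), "cross-entropy:", cross_entropy)
print("gamma=2:", focal_loss(probs, 0, gamma=2.0))
```

The second print shows that a positive gamma shrinks the loss of a well-classified sample, which is the mechanism focal loss uses against class imbalance.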

To use the temporal model, pass --model_type sequence; you may also need to adjust the number of epochs, batch size and learning rate. To use pifpaf track IDs instead of ground truth track IDs, pass --track_method pifpaf (the example below uses ground truth track IDs via --track_method gt).

sbatch python_wrapper.sh titan_train.py --model_type sequence --num_epoch 100 --imbalance focal --track_method gt --batch_size 128 --gamma 0 --lr 0.001
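As a rough picture of why the temporal model needs track IDs, the sketch below groups per-frame poses into one time-ordered sequence per person. The tuple layout, the function name and the toy keypoints are assumptions made for illustration and do not reflect the data structures actually used in this repo.

```python
from collections import defaultdict


def build_sequences(detections):
    """Group per-frame poses into one time-ordered sequence per track ID.

    detections: iterable of (frame_index, track_id, keypoints) tuples
                (a hypothetical layout chosen for this example).
    """
    tracks = defaultdict(list)
    for frame_index, track_id, keypoints in detections:
        tracks[track_id].append((frame_index, keypoints))
    # Sort each track by frame index and keep only the keypoints.
    return {
        track_id: [kp for _, kp in sorted(items, key=lambda item: item[0])]
        for track_id, items in tracks.items()
    }


detections = [
    (1, "a", [0.1, 0.2]),  # frames may arrive out of order
    (0, "a", [0.0, 0.1]),
    (0, "b", [0.5, 0.5]),
]
sequences = build_sequences(detections)
print(sequences)
```

Whether the IDs come from the annotations or from the pifpaf tracker only changes the track_id values; the grouping step stays the same.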

For all available training options, please refer to the comments and docstrings in the training scripts.

All the datasets have a "train-validate-test" setup, so after training you should see a summary of the evaluation.

Here is an example:

In general, overall accuracy 0.8614 avg Jaccard 0.6069 avg F1 0.7409

For valid_action actions accuracy 0.8614 Jaccard score 0.6069 f1 score 0.9192 mAP 0.7911
Precision for each class: [0.885 0.697 0.72  0.715 0.87]
Recall for each class: [0.956 0.458 0.831 0.549 0.811]
F1 score for each class: [0.919 0.553 0.771 0.621 0.839]
Average Precision for each class is [0.9687, 0.6455, 0.8122, 0.6459, 0.883]
Confusion matrix (elements in a row share the same true label, those in the same columns share predicted):
The corresponding classes are {'walking': 0, 'standing': 1, 'sitting': 2, 'bending': 3, 'biking': 4, 'motorcycling': 4}
[[31411  1172    19   142   120]
 [ 3556  3092    12    45    41]
 [   12     1   157     0    19]
 [  231   160     3   512    26]
 [  268     9    27    17  1375]]
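To help read the summary, the following sketch recomputes the headline numbers from the confusion matrix alone, using the usual definitions (precision along columns, recall along rows, Jaccard as TP / (TP + FP + FN)). It is not the evaluation code of this repo, and the variable names are chosen for illustration; the values it prints should agree with the summary up to rounding.

```python
# Confusion matrix copied from the summary (rows: true label, columns: prediction).
cm = [
    [31411, 1172, 19, 142, 120],
    [3556, 3092, 12, 45, 41],
    [12, 1, 157, 0, 19],
    [231, 160, 3, 512, 26],
    [268, 9, 27, 17, 1375],
]

n = len(cm)
row_sums = [sum(row) for row in cm]  # number of samples per true class
col_sums = [sum(cm[r][c] for r in range(n)) for c in range(n)]  # per predicted class
diag = [cm[i][i] for i in range(n)]  # correctly classified samples

accuracy = sum(diag) / sum(row_sums)
precision = [diag[i] / col_sums[i] for i in range(n)]
recall = [diag[i] / row_sums[i] for i in range(n)]
f1 = [2 * p * r / (p + r) for p, r in zip(precision, recall)]
jaccard = [diag[i] / (row_sums[i] + col_sums[i] - diag[i]) for i in range(n)]

print(f"overall accuracy {accuracy:.4f}")
print(f"avg Jaccard {sum(jaccard) / n:.4f}")
print(f"avg F1 {sum(f1) / n:.4f}")
print("precision", [round(p, 3) for p in precision])
print("recall   ", [round(r, 3) for r in recall])
```

Note how the low recall of class 1 (standing) comes from the 3556 standing samples predicted as walking, which is visible in the second row of the matrix.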

After training and saving the model (to out/trained/), you can use the predictor to visualize results on TITAN (all sequences). Feel free to change the checkpoint to your own trained model; only the file name is needed, because models are assumed to be in out/trained.

sbatch python_wrapper.sh predictor.py --function titanseqs --save_dir out/recognition --ckpt TITAN_Relative_KP803217.pth

It's also possible to run on a single sequence with --function titan_single --seq_idx <Number>

or run on a single image with --function image --image_path <path/to/your/image.png>

More about the TITAN dataset

For the TITAN dataset, we first extract poses from the images with OpenPifPaf, and then match the poses to the ground truth according to the IoU of their bounding boxes. After that, we store the poses sequence by sequence, frame by frame, person by person; you will find the corresponding classes in titan_dataset.py.
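The matching step can be pictured with the sketch below, which pairs detected pose boxes with ground truth boxes greedily by IoU. The box format (x1, y1, x2, y2), the threshold of 0.3, the greedy strategy and all names are assumptions for illustration; the actual matching logic lives in the repo's tools and may differ.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_poses_to_gt(pose_boxes, gt_boxes, iou_threshold=0.3):
    """Greedily pair each pose box with at most one ground truth box.

    Returns {pose_index: gt_index}; poses without a good match are left out.
    """
    candidates = [
        (iou(p, g), pi, gi)
        for pi, p in enumerate(pose_boxes)
        for gi, g in enumerate(gt_boxes)
    ]
    candidates.sort(reverse=True)  # highest overlap first
    matches, used_gt = {}, set()
    for score, pi, gi in candidates:
        if score < iou_threshold:
            break  # all remaining candidates overlap even less
        if pi in matches or gi in used_gt:
            continue
        matches[pi] = gi
        used_gt.add(gi)
    return matches


# Hypothetical boxes: two detections and three annotated persons.
pose_boxes = [(10, 10, 50, 100), (200, 20, 240, 110)]
gt_boxes = [(198, 18, 242, 112), (12, 8, 52, 98), (400, 400, 420, 440)]
matches = match_poses_to_gt(pose_boxes, gt_boxes)
print(matches)
```

Each matched pose can then inherit the action labels and track ID of its ground truth box, while unmatched poses are dropped.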

Preparing poses for TITAN and CASR

This part may be a bit cumbersome and it's advised to use the prepared poses in this folder. If you want to extract the poses yourself, please also download that folder, because poseact/out/titan_clip/example.png is needed as the input to OpenPifPaf.

First, install OpenPifPaf and the posetrack plugin.

For TITAN, download the dataset to poseact/data/TITAN and then run:

cd poseact
conda activate pytorch # activate the python environment
# run single-frame pose detection, wait for the program to complete
sbatch python_wrapper.sh tools/run_pifpaf_on_titan.py --mode single --n_process 6
# run pose tracking, required for temporal model with pifpaf track ID, wait for the program to complete
sbatch python_wrapper.sh tools/run_pifpaf_on_titan.py --mode track --n_process 6
# make the pickle file for single frame model 
python utils/titan_dataset.py --function pickle --mode single
# make the pickle file from pifpaf posetrack result
python utils/titan_dataset.py --function pickle --mode track 

For CASR, you must first agree to the terms and conditions required by the authors of CASR.

The CASR dataset needs some preprocessing. Please create the folder poseact/scratch (or link it to the scratch space on IZAR) and then run:

cd poseact
conda activate pytorch # activate the python environment
sbatch tools/casr_download.sh # wait for the whole process to complete, takes a long time 
sbatch python_wrapper.sh tools/run_pifpaf_on_casr.py --n_process 6 # wait for this process to complete, again a long time 
python ./utils/casr_dataset.py # now you should have the file out/CASR_pifpaf.pkl

Credits

The poses are extracted with OpenPifPaf.

The model is inspired by MonoLoco, and the heuristics are from this work.

The code for the TCG dataset is adapted from the official repo.

Owner
VITA lab at EPFL
Visual Intelligence for Transportation