Official PyTorch implementation of "IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos", CVPRW 2021

Overview

IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos

Introduction

This repo is official PyTorch implementation of IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos (CVPRW 2021).

Directory

Root

The ${ROOT} is described as below.

${ROOT}  
|-- data  
|-- common  
|-- main  
|-- tool
|-- output  
  • data contains data loading codes and soft links to images and annotations directories.
  • common contains kernel codes for IntegralAction.
  • main contains high-level codes for training or testing the network.
  • tool contains a code to merge models of rgb_only and pose_only stages.
  • output contains log, trained models, visualized outputs, and test result.

Data

You need to follow directory structure of the data as below.

${ROOT}  
|-- data  
|   |-- Kinetics
|   |   |-- data
|   |   |   |-- frames 
|   |   |   |-- kinetics-skeleton
|   |   |   |-- Kinetics50_train.json
|   |   |   |-- Kinetics50_val.json
|   |   |   |-- Kinetics400_train.json
|   |   |   |-- Kinetics400_val.json
|   |-- Mimetics
|   |   |-- data  
|   |   |   |-- frames 
|   |   |   |-- pose_results 
|   |   |   |-- Mimetics50.json
|   |   |   |-- Mimetics400.json
|   |-- NTU
|   |   |-- data  
|   |   |   |-- frames 
|   |   |   |-- nturgb+d_skeletons
|   |   |   |-- NTU_train.json
|   |   |   |-- NTU_test.json

To download multiple files from Google drive without compressing them, try this. If you have a problem with 'Download limit' problem when tried to download dataset from google drive link, please try this trick.

* Go the shared folder, which contains files you want to copy to your drive  
* Select all the files you want to copy  
* In the upper right corner click on three vertical dots and select “make a copy”  
* Then, the file is copied to your personal google drive account. You can download it from your personal account.  

Output

You need to follow the directory structure of the output folder as below.

${ROOT}  
|-- output  
|   |-- log  
|   |-- model_dump  
|   |-- result  
|   |-- vis  
  • Creating output folder as soft link form is recommended instead of folder form because it would take large storage capacity.
  • log folder contains training log file.
  • model_dump folder contains saved checkpoints for each epoch.
  • result folder contains final estimation files generated in the testing stage.
  • vis folder contains visualized results.

Running IntegralAction

Start

  • Install PyTorch and Python >= 3.7.3 and run sh requirements.sh.
  • In the main/config.py, you can change settings of the model including dataset to use, network backbone, and input size and so on.
  • There are three stages. 1) rgb_only , 2) pose_only, and 3) rgb+pose. In the rgb_only stage, only RGB stream is trained, and in the pose_only stage, only pose stream is trained. Finally, rgb+pose stage initializes weights from the previous two stages and continue training by the pose-drive integration.

Train

1. rgb_only stage

In the main folder, run

python train.py --gpu 0-3 --mode rgb_only

to train IntegralAction in the rgb_only stage on the GPU 0,1,2,3. --gpu 0,1,2,3 can be used instead of --gpu 0-3. Then, backup the trained weights by running

mkdir ../output/model_dump/rgb_only
mv ../output/model_dump/snapshot_*.pth.tar ../output/model_dump/rgb_only/.

2. pose_only stage

In the main folder, run

python train.py --gpu 0-3 --mode pose_only

to train IntegralAction in the pose_only stage on the GPU 0,1,2,3. --gpu 0,1,2,3 can be used instead of --gpu 0-3.
Then, backup the trained weights by running

mkdir ../output/model_dump/pose_only
mv ../output/model_dump/snapshot_*.pth.tar ../output/model_dump/pose_only/.

3. rgb+pose stage

In the tool folder, run

cp ../output/model_dump/rgb_only/snapshot_29.pth.tar snapshot_29_rgb_only.pth.tar
cp ../output/model_dump/pose_only/snapshot_29.pth.tar snapshot_29_pose_only.pth.tar
python merge_rgb_only_pose_only.py
mv snapshot_0.pth.tar ../output/model_dump/.

In the main folder, run

python train.py --gpu 0-3 --mode rgb+pose --continue

to train IntegralAction in the rgb+pose stage on the GPU 0,1,2,3. --gpu 0,1,2,3 can be used instead of --gpu 0-3.

Test

Place trained model at the output/model_dump/. Choose the stage you want to test from one of [rgb_only, pose_only, rgb+pose].

In the main folder, run

python test.py --gpu 0-3 --mode $STAGE --test_epoch 29

to test IntegralAction in $STAGE stage (should be one of [rgb_only, pose_only, rgb+pose]) on the GPU 0,1,2,3 with 29th epoch trained model. --gpu 0,1,2,3 can be used instead of --gpu 0-3.

Results

Here I report the performance of the IntegralAction.

Kinetics50

  • Download IntegralAction trained on [Kinetics50].
  • Kinetics50 is a subset of Kinetics400. It mainly contains videos with human motion-related action classes, sampled from Kinetics400.
(base) mks0601:~/workspace/IntegralAction/main$ python test.py --gpu 5-6 --mode rgb+pose --test_epoch 29
>>> Using GPU: 5,6
04-15 11:48:25 Creating dataset...
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
04-15 11:48:25 Load checkpoint from ../output/model_dump/snapshot_29.pth.tar
04-15 11:48:25 Creating graph...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 773/773 [03:09<00:00,  5.11it/s]
Evaluation start...
Top-1 accuracy: 72.2087
Top-5 accuracy: 92.2735
Result is saved at: ../output/result/kinetics_result.json

Mimetics

  • Download IntegralAction trained on [Kinetics50].
  • Kinetics50 is a subset of Kinetics400. It mainly contains videos with human motion-related action classes, sampled from Kinetics400.
  • Note that Mimetics is used only for the testing purpose.
(base) mks0601:~/workspace/IntegralAction/main$ python test.py --gpu 5-6 --mode rgb+pose --test_epoch 29
>>> Using GPU: 5,6
04-15 11:52:20 Creating dataset...
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
04-15 11:52:20 Load checkpoint from ../output/model_dump/snapshot_29.pth.tar
04-15 11:52:20 Creating graph...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 187/187 [02:14<00:00,  4.93it/s]
Evaluation start...
Top-1 accuracy: 26.5101
Top-5 accuracy: 50.5034
Result is saved at: ../output/result/mimetics_result.json

Reference

@InProceedings{moon2021integralaction,
  title={IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos},
  author={Moon, Gyeongsik and Kwon, Heeseung and Lee, Kyoung Mu and Cho, Minsu},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW)}, 
  year={2021}
}
Owner
Gyeongsik Moon
Postdoc in CVLAB, SNU, Korea
Gyeongsik Moon
ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees This repository is the official implementation of the empirica

Kuan-Lin (Jason) Chen 2 Oct 02, 2022
Code for the paper Relation Prediction as an Auxiliary Training Objective for Improving Multi-Relational Graph Representations (AKBC 2021).

Relation Prediction as an Auxiliary Training Objective for Knowledge Base Completion This repo provides the code for the paper Relation Prediction as

Facebook Research 85 Jan 02, 2023
Experiments for Fake News explainability project

fake-news-explainability Experiments for fake news explainability project This repository only contains the notebooks used to train the models and eva

Lorenzo Flores (Lj) 1 Dec 03, 2022
Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement

Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement In this project, we proposed a Domain Disentanglement Faster-RCNN (DDF)

19 Nov 24, 2022
MDMM - Learning multi-domain multi-modality I2I translation

Multi-Domain Multi-Modality I2I translation Pytorch implementation of multi-modality I2I translation for multi-domains. The project is an extension to

Hsin-Ying Lee 107 Nov 04, 2022
[CVPR2021 Oral] FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation.

FFB6D This is the official source code for the CVPR2021 Oral work, FFB6D: A Full Flow Biderectional Fusion Network for 6D Pose Estimation. (Arxiv) Tab

Yisheng (Ethan) He 201 Dec 28, 2022
A concise but complete implementation of CLIP with various experimental improvements from recent papers

x-clip (wip) A concise but complete implementation of CLIP with various experimental improvements from recent papers Install $ pip install x-clip Usag

Phil Wang 515 Dec 26, 2022
Unified tracking framework with a single appearance model

Paper: Do different tracking tasks require different appearance model? [ArXiv] (comming soon) [Project Page] (comming soon) UniTrack is a simple and U

ZhongdaoWang 300 Dec 24, 2022
This example implements the end-to-end MLOps process using Vertex AI platform and Smart Analytics technology capabilities

MLOps with Vertex AI This example implements the end-to-end MLOps process using Vertex AI platform and Smart Analytics technology capabilities. The ex

Google Cloud Platform 238 Dec 21, 2022
Hyperbolic Image Segmentation, CVPR 2022

Hyperbolic Image Segmentation, CVPR 2022 This is the implementation of paper Hyperbolic Image Segmentation (CVPR 2022). Repository structure assets :

Mina Ghadimi Atigh 46 Dec 29, 2022
Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images

Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images In this paper, we present an effective Dynamic Enhancement Anchor

13 Dec 09, 2022
The official PyTorch implementation of Curriculum by Smoothing (NeurIPS 2020, Spotlight).

Curriculum by Smoothing (NeurIPS 2020) The official PyTorch implementation of Curriculum by Smoothing (NeurIPS 2020, Spotlight). For any questions reg

PAIR Lab 36 Nov 23, 2022
An end-to-end image translation model with weight-map for color constancy

CCUnet An end-to-end image translation model with weight-map for color constancy 1. Download the dataset (take Colorchecker_recommended dataset as an

Jianhui Qiu 1 Dec 21, 2021
Semantic Scholar's Author Disambiguation Algorithm & Evaluation Suite

S2AND This repository provides access to the S2AND dataset and S2AND reference model described in the paper S2AND: A Benchmark and Evaluation System f

AI2 54 Nov 28, 2022
App customer segmentation cohort rfm clustering

CUSTOMER SEGMENTATION COHORT RFM CLUSTERING TỔNG QUAN VỀ HỆ THỐNG DỮ LIỆU Nên chuyển qua theme màu dark thì sẽ nhìn đẹp hơn https://customer-segmentat

hieulmsc 3 Dec 18, 2021
Open source hardware and software platform to build a small scale self driving car.

Donkeycar is minimalist and modular self driving library for Python. It is developed for hobbyists and students with a focus on allowing fast experimentation and easy community contributions.

Autorope 2.4k Jan 04, 2023
Official repository of the paper Privacy-friendly Synthetic Data for the Development of Face Morphing Attack Detectors

SMDD-Synthetic-Face-Morphing-Attack-Detection-Development-dataset Official repository of the paper Privacy-friendly Synthetic Data for the Development

10 Dec 12, 2022
基于tensorflow 2.x的图片识别工具集

Classification.tf2 基于tensorflow 2.x的图片识别工具集 功能 粗粒度场景图片分类 细粒度场景图片分类 其他场景图片分类 模型部署 tensorflow serving本地推理和docker部署 tensorRT onnx ... 数据集 https://hyper.a

Wei Qi 1 Nov 03, 2021
source code of “Visual Saliency Transformer” (ICCV2021)

Visual Saliency Transformer (VST) source code for our ICCV 2021 paper “Visual Saliency Transformer” by Nian Liu, Ni Zhang, Kaiyuan Wan, Junwei Han, an

89 Dec 21, 2022
JudeasRx - graphical app for doing personalized causal medicine using the methods invented by Judea Pearl et al.

JudeasRX Instructions Read the references given in the Theory and Notation section below Fire up the Jupyter Notebook judeas-rx.ipynb The notebook dra

Robert R. Tucci 19 Nov 07, 2022