Benchmarking the robustness of Spatial-Temporal Models

Last update: Dec 16, 2022

Overview

This repository contains the code for the paper "Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions".

Python 2.7 and 3.7, PyTorch 1.7+, and FFmpeg are required.

Requirements

pip3 install -r requirements.txt

Mini Kinetics-C

Download the original Kinetics400 dataset from link.

Mini Kinetics-C contains half of the classes in Kinetics400. All the classes are listed in mini-kinetics-200-classes.txt.
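
A minimal sketch of how the class list could be used to restrict Kinetics400 annotations to the Mini subset. It assumes the class file holds one class name per line, which is not confirmed here; `load_class_names` and `keep_mini_classes` are hypothetical helper names, not part of this repository.

```python
from pathlib import Path


def load_class_names(path):
    """Read class names, assuming one name per line (the format is an assumption)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Drop blank lines and surrounding whitespace.
    return [line.strip() for line in lines if line.strip()]


def keep_mini_classes(samples, class_names):
    """Keep only (video_id, label) pairs whose label is in the Mini subset."""
    allowed = set(class_names)
    return [(video_id, label) for video_id, label in samples if label in allowed]
```

For example, `keep_mini_classes(samples, load_class_names("mini-kinetics-200-classes.txt"))` would drop every sample whose label is not in the file.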

Mini Kinetics-C Leaderboard

Corruption robustness of spatial-temporal models trained on clean Mini Kinetics and evaluated on Mini Kinetics-C.

Approach	Reference	Backbone	Input Length	Sampling Method	Clean Accuracy	mPC	rPC
TimeSformer	G. Bertasius et al.	Transformer	32	Uniform	82.2	71.4	86.9
3D ResNet	K. Hara et al.	ResNet-50	32	Uniform	73.0	59.2	81.1
I3D	J. Carreira et al.	InceptionV1	32	Uniform	70.5	57.7	81.8
SlowFast 8x4	C. Feichtenhofer et al.	ResNet-50	32	Uniform	69.2	54.3	78.5
3D ResNet	K. Hara et al.	ResNet-18	32	Uniform	66.2	53.3	80.5
TAM	Q. Fan et al.	ResNet-50	32	Uniform	66.9	50.8	75.9
X3D-M	C. Feichtenhofer	ResNet-50	32	Uniform	62.6	48.6	77.6

For a fair comparison, we recommend submitting results from approaches that follow these settings: ResNet-50 backbone, input length of 32, and uniform sampling at the clip level. Any result on our benchmark can be submitted via pull request.
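
The mPC and rPC columns can be illustrated with a short sketch. It assumes the usual definitions from the corruption-robustness literature: mPC is the accuracy averaged over severity levels and then over corruption types, and rPC is mPC expressed as a percentage of clean accuracy. The rPC values in the Mini Kinetics-C table are consistent with that ratio. The function names and the example accuracies are made up for illustration.

```python
def mean_performance_under_corruption(acc):
    """Average accuracy over severities, then over corruption types.

    `acc` maps a corruption name to a list of accuracies (%), one per severity.
    """
    per_corruption = [sum(levels) / len(levels) for levels in acc.values()]
    return sum(per_corruption) / len(per_corruption)


def relative_performance_under_corruption(clean_acc, mpc):
    """Express mPC as a percentage of the clean accuracy."""
    return 100.0 * mpc / clean_acc


# Made-up accuracies for two hypothetical corruptions, for illustration only.
example = {"corruption_a": [60.0, 50.0], "corruption_b": [70.0, 60.0]}
print(mean_performance_under_corruption(example))  # 60.0

# Leaderboard row for 3D ResNet (ResNet-50): clean accuracy 73.0, mPC 59.2.
print(round(relative_performance_under_corruption(73.0, 59.2), 1))  # 81.1
```

Because the published numbers are rounded, recomputing rPC from the table can differ from the listed value by about 0.1.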

Mini SSV2-C

Download the original Something-Something-V2 dataset from link.

Mini SSV2-C contains half of the classes in Something-Something-V2. All the classes are listed in mini-ssv2-87-classes.txt.

Mini SSV2-C Leaderboard

Corruption robustness of spatial-temporal models trained on clean Mini SSV2 and evaluated on Mini SSV2-C.

Approach	Reference	Backbone	Input Length	Sampling Method	Clean Accuracy	mPC	rPC
TimeSformer	G. Bertasius et al.	Transformer	16	Uniform	60.5	49.7	82.1
I3D	J. Carreira et al.	InceptionV1	32	Uniform	58.5	47.8	81.7
3D ResNet	K. Hara et al.	ResNet-50	32	Uniform	57.4	46.6	81.2
TAM	Q. Fan et al.	ResNet-50	32	Uniform	61.8	45.7	73.9
3D ResNet	K. Hara et al.	ResNet-18	32	Uniform	53.0	42.6	80.3
X3D-M	C. Feichtenhofer	ResNet-50	32	Uniform	49.9	40.7	81.6
SlowFast 8x4	C. Feichtenhofer et al.	ResNet-50	32	Uniform	48.7	38.4	78.8

Training and Evaluation

To help researchers reproduce the benchmark results on our leaderboards, we include a simple framework for training and evaluating spatial-temporal models in the benchmark_framework folder.

Running the code

Assume the data directories are structured as follows:

~/
  datadir/
    mini_kinetics/
      train/
        .../ (directories of class names)
          ...(hdf5 file containing video frames)
    mini_kinetics-c/
      .../ (directories of corruption names)
        .../ (directories of severity level)
          .../ (directories of class names)
            ...(hdf5 file containing video frames)
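
The corrupted-data layout above can be walked with a few lines of Python. This is a sketch under the assumption that every video file sits exactly at `<corruption>/<severity>/<class>/<file>` below the dataset root; `list_corrupted_videos` is a hypothetical helper, not a function from this repository.

```python
from pathlib import Path


def list_corrupted_videos(root):
    """Yield (corruption, severity, class_name, file_path) for each video file.

    Assumes the layout root/<corruption>/<severity>/<class>/<file>.
    """
    root = Path(root)
    # Four path components below the root: corruption, severity, class, file.
    for path in sorted(root.glob("*/*/*/*")):
        if path.is_file():
            corruption, severity, class_name = path.relative_to(root).parts[:3]
            yield corruption, severity, class_name, path
```

Calling it on `Path.home() / "datadir" / "mini_kinetics-c"` gives a quick way to check that every corruption and severity level is present before running an evaluation.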

Train I3D on the Mini Kinetics dataset with 4 GPUs and 16 CPU threads (for data loading). The input length is 32, the batch size is 32, and the learning rate is 0.01.

python3 train.py --threed_data --dataset mini_kinetics400 --frames_per_group 1 --groups 32 --logdir snapshots/ \
--lr 0.01 --backbone_net i3d -b 32 -j 16 --cuda 0,1,2,3

Test I3D on the Mini Kinetics-C dataset (the pretrained model is loaded).

python3 test_corruption.py --threed_data --dataset mini_kinetics400 --frames_per_group 1 --groups 32 --logdir snapshots/ \
--pretrained snapshots/mini_kinetics400-rgb-i3d_v2-ts-max-f32-cosine-bs32-e50-v1/model_best.pth.tar --backbone_net i3d -b 32 -j 16 -e --cuda 0,1,2,3
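
Both commands pass --frames_per_group 1 --groups 32, which gives an input length of 32 frames. The repository's own sampler is not shown here, so the following is only a generic sketch of uniform sampling: split the video into equal groups and take the frame at the center of each group. `uniform_frame_indices` is a hypothetical name.

```python
def uniform_frame_indices(num_frames, groups=32):
    """Return one frame index per group, taken at each group's center."""
    if num_frames <= 0 or groups <= 0:
        raise ValueError("num_frames and groups must be positive")
    segment = num_frames / groups
    # Clamp to the last valid index; videos shorter than `groups` repeat frames.
    return [min(num_frames - 1, int(segment * (i + 0.5))) for i in range(groups)]


print(uniform_frame_indices(64, groups=32)[:4])  # [1, 3, 5, 7]
```

Training pipelines often add a random offset inside each group; the deterministic center is the simpler choice for evaluation.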

Owner

Yi Chenyu Ian
