Benchmarking the robustness of Spatial-Temporal Models

Overview

Benchmarking the robustness of Spatial-Temporal Models

This repositery contains the code for the paper Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions.

Python 2.7 and 3.7, Pytorch 1.7+, FFmpeg are required.

Requirements

pip3 install - requirements.txt

Mini Kinetics-C

image info

Download original Kinetics400 from link.

The Mini Kinetics-C contains half of the classes in Kinetics400. All the classes can be found in mini-kinetics-200-classes.txt.

Mini Kinetics-C Leaderboard

Corruption robustness of spatial-temporal models trained on clean Mini Kinetics and evaluated on Mini Kinetics-C.

Approach Reference Backbone Input Length Sampling Method Clean Accuracy mPC rPC
TimeSformer Gedas et al. Transformer 32 Uniform 82.2 71.4 86.9
3D ResNet K. Hara et al. ResNet-50 32 Uniform 73.0 59.2 81.1
I3D J. Carreira et al. InceptionV1 32 Uniform 70.5 57.7 81.8
SlowFast 8x4 C. Feichtenhofer at al. ResNet-50 32 Uniform 69.2 54.3 78.5
3D ResNet K. Hara et al. ResNet-18 32 Uniform 66.2 53.3 80.5
TAM Q.Fan et al. ResNet-50 32 Uniform 66.9 50.8 75.9
X3D-M C. Feichtenhofer ResNet-50 32 Uniform 62.6 48.6 77.6

For fair comparison, it is recommended to submit the result of approach which follows the following settings: Backbone of ResNet-50, Input Length of 32, Uniform Sampling at Clip Level. Any result on our benchmark can be submitted via pull request.

Mini SSV2-C

image info

Download original Something-Something-V2 datset from link.

The Mini SSV2-C contains half of the classes in Something-Something-V2. All the classes can be found in mini-ssv2-87-classes.txt.

Mini SSV2-C Leaderboard

Corruption robustness of spatial-temporal models trained on clean Mini SSV2 and evaluated on Mini SSV2-C.

Approach Reference Backbone Input Length Sampling Method Clean Accuracy mPC rPC
TimeSformer Gedas et al. Transformer 16 Uniform 60.5 49.7 82.1
I3D J. Carreira et al. InceptionV1 32 Uniform 58.5 47.8 81.7
3D ResNet K. Hara et al. ResNet-50 32 Uniform 57.4 46.6 81.2
TAM Q.Fan et al. ResNet-50 32 Uniform 61.8 45.7 73.9
3D ResNet K. Hara et al. ResNet-18 32 Uniform 53.0 42.6 80.3
X3D-M C. Feichtenhofer ResNet-50 32 Uniform 49.9 40.7 81.6
SlowFast 8x4 C. Feichtenhofer at al. ResNet-50 32 Uniform 48.7 38.4 78.8

For fair comparison, it is recommended to submit the result of approach which follows the following settings: Backbone of ResNet-50, Input Length of 32, Uniform Sampling at Clip Level. Any result on our benchmark can be submitted via pull request.

Training and Evaluation

To help researchers reproduce the benchmark results provided in our leaderboard, we include a simple framework for training and evaluating the spatial-temporal models in the folder: benchmark_framework.

Running the code

Assume the structure of data directories is the following:

~/
  datadir/
    mini_kinetics/
      train/
        .../ (directories of class names)
          ...(hdf5 file containing video frames)
    mini_kinetics-c/
      .../ (directories of corruption names)
        .../ (directories of severity level)
          .../ (directories of class names)
            ...(hdf5 file containing video frames)

Train I3D on the Mini Kinetics dataset with 4 GPUs and 16 CPU threads (for data loading). The input lenght is 32, the batch size is 32 and learning rate is 0.01.

python3 train.py --threed_data --dataset mini_kinetics400 --frames_per_group 1 --groups 32 --logdir snapshots/ \
--lr 0.01 --backbone_net i3d -b 32 -j 16 --cuda 0,1,2,3

Test I3D on the Mini Kinetics-C dataset (pretrained model is loaded)

python3 test_corruption.py --threed_data --dataset mini_kinetics400 --frames_per_group 1 --groups 32 --logdir snapshots/ \
--pretrained snapshots/mini_kinetics400-rgb-i3d_v2-ts-max-f32-cosine-bs32-e50-v1/model_best.pth.tar --backbone_net i3d -b 32 -j 16 -e --cuda 0,1,2,3

Owner
Yi Chenyu Ian
Yi Chenyu Ian
Malmo Collaborative AI Challenge - Team Pig Catcher

The Malmo Collaborative AI Challenge - Team Pig Catcher Approach The challenge involves 2 agents who can either cooperate or defect. The optimal polic

Kai Arulkumaran 66 Jun 29, 2022
PyTorch implementations of the paper: "Learning Independent Instance Maps for Crowd Localization"

IIM - Crowd Localization This repo is the official implementation of paper: Learning Independent Instance Maps for Crowd Localization. The code is dev

tao han 91 Nov 10, 2022
Train neural network for semantic segmentation (deep lab V3) with pytorch in less then 50 lines of code

Train neural network for semantic segmentation (deep lab V3) with pytorch in 50 lines of code Train net semantic segmentation net using Trans10K datas

17 Dec 19, 2022
“英特尔创新大师杯”深度学习挑战赛 赛道3:CCKS2021中文NLP地址相关性任务

ccks2021-track3 CCKS2021中文NLP地址相关性任务-赛道三-冠军方案 团队:我的加菲鱼- wodejiafeiyu 初赛第二/复赛第一/决赛第一 前言 19年开始,陆陆续续参加了一些比赛,拿到过一些top,比较懒一直都没分享过,这次比较幸运又拿了top1,打算分享下 分类的任务

shaochenjie 131 Dec 31, 2022
A LiDAR point cloud cluster for panoptic segmentation

Divide-and-Merge-LiDAR-Panoptic-Cluster A demo video of our method with semantic prior: More information will be coming soon! As a PhD student, I don'

YimingZhao 65 Dec 22, 2022
A python package simulating the quasi-2D pseudospin-1/2 Gross-Pitaevskii equation with NVIDIA GPU acceleration.

A python package simulating the quasi-2D pseudospin-1/2 Gross-Pitaevskii equation with NVIDIA GPU acceleration. Introduction spinor-gpe is high-level,

2 Sep 20, 2022
Demo for Real-time RGBD-based Extended Body Pose Estimation paper

Real-time RGBD-based Extended Body Pose Estimation This repository is a real-time demo for our paper that was published at WACV 2021 conference The ou

Renat Bashirov 118 Dec 26, 2022
Age and Gender prediction using Keras

cnn_age_gender Age and Gender prediction using Keras Dataset example : Description : UTKFace dataset is a large-scale face dataset with long age span

XN3UR0N 58 May 03, 2022
Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving

SalsaNext: Fast, Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving Abstract In this paper, we introduce SalsaNext f

308 Jan 04, 2023
A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection

Confluence: A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection 1. 介绍 用以替代 NMS,在所有 bbox 中挑选出最优的集合。 NMS 仅考虑了 bbox 的得分,然后根据 IOU 来

44 Sep 15, 2022
Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices

Intro Real-time object detection and classification. Paper: version 1, version 2. Read more about YOLO (in darknet) and download weight files here. In

Trieu 6.1k Jan 04, 2023
An Implementation of Fully Convolutional Networks in Tensorflow.

Update An example on how to integrate this code into your own semantic segmentation pipeline can be found in my KittiSeg project repository. tensorflo

Marvin Teichmann 1.1k Dec 12, 2022
PyTorch implementation for the ICLR 2020 paper "Understanding the Limitations of Variational Mutual Information Estimators"

Smoothed Mutual Information ``Lower Bound'' Estimator PyTorch implementation for the ICLR 2020 paper Understanding the Limitations of Variational Mutu

50 Nov 09, 2022
A full-fledged version of Pix2Seq

Stable-Pix2Seq A full-fledged version of Pix2Seq What it is. This is a full-fledged version of Pix2Seq. Compared with unofficial-pix2seq, stable-pix2s

peng gao 205 Dec 27, 2022
Unity Propagation in Bayesian Networks Handling Inconsistency via Unity Smoothing

This repository contains the scripts needed to generate the results from the paper Unity Propagation in Bayesian Networks Handling Inconsistency via U

0 Jan 19, 2022
Learning What and Where to Draw

###Learning What and Where to Draw Scott Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, Honglak Lee This is the code for our NIPS 201

Scott Ellison Reed 337 Nov 18, 2022
TC-GNN with Pytorch integration

TC-GNN (Running Sparse GNN on Dense Tensor Core on Ampere GPU) Cite this project and paper. @inproceedings{TC-GNN, title={TC-GNN: Accelerating Spars

YUKE WANG 19 Dec 01, 2022
Text Extraction Formulation + Feedback Loop for state-of-the-art WSD (EMNLP 2021)

ConSeC is a novel approach to Word Sense Disambiguation (WSD), accepted at EMNLP 2021. It frames WSD as a text extraction task and features a feedback loop strategy that allows the disambiguation of

Sapienza NLP group 36 Dec 13, 2022
masscan + nmap + Finger

说明 个人根据使用习惯修改masnmap而来的一个小工具。调用masscan做全端口扫描,再调用nmap做服务识别,最后调用Finger做Web指纹识别。工具使用场景适合风险探测排查、众测等。 使用方法 安装依赖 pip3 install -r requirements.txt -i https:/

Ryan 3 Mar 25, 2022
Hyper-parameter optimization for sklearn

hyperopt-sklearn Hyperopt-sklearn is Hyperopt-based model selection among machine learning algorithms in scikit-learn. See how to use hyperopt-sklearn

1.4k Jan 01, 2023