[AAAI2021] The source code for our paper 《Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion》.

Overview

DSM

The source code for paper Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion

Project Website;

Datasets list and some visualizations/provided weights are preparing now.

1. Introduction (scene-dominated to motion-dominated)

Video datasets are usually scene-dominated, We propose to decouple the scene and the motion (DSM) with two simple operations, so that the model attention towards the motion information is better paid.

The generated triplet is as below:

What DSM learned?

With DSM pretrain, the model learn to focus on motion region (Not necessarily actor) powerful without one label available.

2. Installation

Dataset

Please refer dataset.md for details.

Requirements

  • Python3
  • pytorch1.1+
  • PIL
  • Intel (on the fly decode)

3. Structure

  • datasets
    • list
      • hmdb51: the train/val lists of HMDB51
      • ucf101: the train/val lists of UCF101
      • kinetics-400: the train/val lists of kinetics-400
      • diving48: the train/val lists of diving48
  • experiments
    • logs: experiments record in detials
    • gradientes: grad check
    • visualization:
  • src
    • data: load data
    • loss: the loss evaluate in this paper
    • model: network architectures
    • scripts: train/eval scripts
    • augment: detail implementation of Spatio-temporal Augmentation
    • utils
    • feature_extract.py: feature extractor given pretrained model
    • main.py: the main function of finetune
    • trainer.py
    • option.py
    • pt.py: self-supervised pretrain
    • ft.py: supervised finetune

DSM(Triplet)/DSM/Random

Self-supervised Pretrain

Kinetics
bash scripts/kinetics/pt.sh
UCF101
bash scripts/ucf101/pt.sh

Supervised Finetune (Clip-level)

HMDB51
bash scripts/hmdb51/ft.sh
UCF101
bash scripts/ucf101/ft.sh
Kinetics
bash scripts/kinetics/ft.sh

Video-level Evaluation

Following common practice TSN and Non-local. The final video-level result is average by 10 temporal window sampling + corner crop, which lead to better result than clip-level. Refer test.py for details.

Pretrain And Eval In one step

bash scripts/hmdb51/pt_and_ft_hmdb51.sh

Notice: More Training Options and ablation study Can be find in scripts

Video Retrieve and other visualization

(1). Feature Extractor

As STCR can be easily extend to other video representation task, we offer the scripts to perform feature extract.

python feature_extractor.py

The feature will be saved as a single numpy file in the format [video_nums,features_dim] for further visualization.

(2). Reterival Evaluation

modify line60-line62 in reterival.py.

python reterival.py

Results

Action Recognition

UCF101 Pretrained (I3D)

Method UCF101 HMDB51
Random Initialization 47.9 29.6
MoCo Baseline 62.3 36.5
DSM(Triplet) 70.7 48.5
DSM 74.8 52.5

Kinetics Pretrained

Video Retrieve (UCF101-C3D)

Method @1 @5 @10 @20 @50
DSM 16.8 33.4 43.4 54.6 70.7

Video Retrieve (HMDB51-C3D)

Method @1 @5 @10 @20 @50
DSM 8.2 25.9 38.1 52.0 75.0

More Visualization

Acknowledgement

This work is partly based on STN, UEL and MoCo.

License

Citation

If you use our code in your research or wish to refer to the baseline results, pleasuse use the followint BibTex entry.

@inproceedings{wang2020enhancing,
  author    = {Lin, Ji and Zhang, Richard and Ganz, Frieder and Han, Song and Zhu, Jun-Yan},
  title     = {Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion},
  booktitle = {AAAI},
  year      = {2021},
}
Owner
Jinpeng Wang
Focus on Biometrics and Video Understanding, Self/Semi Supervised Learning.
Jinpeng Wang
The CLRS Algorithmic Reasoning Benchmark

Learning representations of algorithms is an emerging area of machine learning, seeking to bridge concepts from neural networks with classical algorithms.

DeepMind 251 Jan 05, 2023
tsflex - feature-extraction benchmarking

tsflex - feature-extraction benchmarking This repository withholds the benchmark results and visualization code of the tsflex paper and toolkit. Flow

PreDiCT.IDLab 5 Mar 25, 2022
Deep Learning with PyTorch made easy 🚀 !

Deep Learning with PyTorch made easy 🚀 ! Carefree? carefree-learn aims to provide CAREFREE usages for both users and developers. It also provides a c

381 Dec 22, 2022
A Kitti Road Segmentation model implemented in tensorflow.

KittiSeg KittiSeg performs segmentation of roads by utilizing an FCN based model. The model achieved first place on the Kitti Road Detection Benchmark

Marvin Teichmann 890 Jan 04, 2023
official implemntation for "Contrastive Learning with Stronger Augmentations"

CLSA CLSA is a self-supervised learning methods which focused on the pattern learning from strong augmentations. Copyright (C) 2020 Xiao Wang, Guo-Jun

Lab for MAchine Perception and LEarning (MAPLE) 47 Nov 29, 2022
Learning from graph data using Keras

Steps to run = Download the cora dataset from this link : https://linqs.soe.ucsc.edu/data unzip the files in the folder input/cora cd code python eda

Mansar Youness 64 Nov 16, 2022
Concept drift monitoring for HA model servers.

{Fast, Correct, Simple} - pick three Easily compare training and production ML data & model distributions Goals Boxkite is an instrumentation library

98 Dec 15, 2022
Learning from Synthetic Humans, CVPR 2017

Learning from Synthetic Humans (SURREAL) Gül Varol, Javier Romero, Xavier Martin, Naureen Mahmood, Michael J. Black, Ivan Laptev and Cordelia Schmid,

Gul Varol 538 Dec 18, 2022
SparseInst: Sparse Instance Activation for Real-Time Instance Segmentation, CVPR 2022

SparseInst 🚀 A simple framework for real-time instance segmentation, CVPR 2022 by Tianheng Cheng, Xinggang Wang†, Shaoyu Chen, Wenqiang Zhang, Qian Z

Hust Visual Learning Team 458 Jan 05, 2023
Official repository for "On Generating Transferable Targeted Perturbations" (ICCV 2021)

On Generating Transferable Targeted Perturbations (ICCV'21) Muzammal Naseer, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, and Fatih Porikli Paper:

Muzammal Naseer 46 Nov 17, 2022
Graph Self-Attention Network for Learning Spatial-Temporal Interaction Representation in Autonomous Driving

GSAN Introduction Code for paper GSAN: Graph Self-Attention Network for Learning Spatial-Temporal Interaction Representation in Autonomous Driving, wh

YE Luyao 6 Oct 27, 2022
Technical Analysis library in pandas for backtesting algotrading and quantitative analysis

bta-lib - A pandas based Technical Analysis Library bta-lib is pandas based technical analysis library and part of the backtrader family. Links Main P

DRo 393 Dec 20, 2022
Training vision models with full-batch gradient descent and regularization

Stochastic Training is Not Necessary for Generalization -- Training competitive vision models without stochasticity This repository implements trainin

Jonas Geiping 32 Jan 06, 2023
本项目是一个带有前端界面的垃圾分类项目,加载了训练好的模型参数,模型为efficientnetb4,暂时为40分类问题。

说明 本项目是一个带有前端界面的垃圾分类项目,加载了训练好的模型参数,模型为efficientnetb4,暂时为40分类问题。 python依赖 tf2.3 、cv2、numpy、pyqt5 pyqt5安装 pip install PyQt5 pip install PyQt5-tools 使用 程

4 May 04, 2022
Attentional Focus Modulates Automatic Finger‑tapping Movements

"Attentional Focus Modulates Automatic Finger‑tapping Movements", in Scientific Reports

Xingxun Jiang 1 Dec 02, 2021
PyTorch Code for "Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning"

Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning [Project Page] [Paper] Wenlong Huang1, Igor Mordatch2, Pieter Abbeel1,

Wenlong Huang 40 Nov 22, 2022
An unsupervised learning framework for depth and ego-motion estimation from monocular videos

SfMLearner This codebase implements the system described in the paper: Unsupervised Learning of Depth and Ego-Motion from Video Tinghui Zhou, Matthew

Tinghui Zhou 1.8k Dec 30, 2022
A repository built on the Flow software package to explore cyber-security attacks on intelligent transportation systems.

A repository built on the Flow software package to explore cyber-security attacks on intelligent transportation systems.

George Gunter 4 Nov 14, 2022
On Effective Scheduling of Model-based Reinforcement Learning

On Effective Scheduling of Model-based Reinforcement Learning Code to reproduce the experiments in On Effective Scheduling of Model-based Reinforcemen

laihang 8 Oct 07, 2022
The official code repository for examples in the O'Reilly book 'Generative Deep Learning'

Generative Deep Learning Teaching Machines to paint, write, compose and play The official code repository for examples in the O'Reilly book 'Generativ

David Foster 1.3k Dec 29, 2022