Implementation of our paper "Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning".

Last update: Dec 29, 2022

PRP

Introduction

This is the implementation of our paper "Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning".

Getting started

Install

Our experiments run on Python 3.6.1 and PyTorch 0.4.1. All dependencies can be installed using pip:
```
python -m pip install -r requirements.txt
```

Data preparation

We conduct experiments on UCF101 and HMDB51 (split 1 of UCF101 for pre-training and the rest for fine-tuning). The expected dataset directory hierarchy is as follows:

├── UCF101/HMDB51
│   ├── split
│   │   ├── classInd.txt
│   │   ├── testlist01.txt
│   │   ├── trainlist01.txt
│   │   └── ...
│   └── video
│       ├── ApplyEyeMakeup
│       │   └── *.avi
│       └── ...
└── ...
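
Before launching training, it can help to confirm that a split file and the video folder agree. The following is a minimal sketch, not part of this repository: the helper names `read_split` and `missing_videos` are hypothetical, and it assumes UCF101-style split lines of the form `ClassName/video.avi [label]`.

```python
from pathlib import Path


def read_split(root, split_file):
    """Return (video_path, class_name) pairs listed in a split file.

    Assumes each non-empty line looks like 'ClassName/video.avi [label]',
    as in the UCF101 train/test lists (an assumption, not repository code).
    """
    root = Path(root)
    samples = []
    for line in (root / "split" / split_file).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        rel_path = line.split()[0]           # drop the optional trailing label
        class_name = rel_path.split("/")[0]  # first path component is the class folder
        samples.append((root / "video" / rel_path, class_name))
    return samples


def missing_videos(samples):
    """List the video paths from a split that do not exist on disk."""
    return [path for path, _ in samples if not path.is_file()]
```

For example, `missing_videos(read_split("UCF101", "trainlist01.txt"))` should return an empty list when the hierarchy above is complete.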

Train and Test

Pre-training on Pretext Task

python train_predict.py --gpu 0 --epoch 300 --model_name c3d/r21d/r3d
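
As the paper title indicates, the pretext task is built around perceiving the playback rate of a clip. The sketch below illustrates one way a clip could be sampled at a chosen rate and paired with a rate label. It is an illustration under stated assumptions, not the code in `train_predict.py`: the function names, the 16-frame clip length, and the candidate rates `(1, 2, 4, 8)` are all hypothetical choices, and "rate" is taken here to mean the stride between sampled frames.

```python
import random


def clip_indices(num_frames, clip_len, rate, start=0):
    """Frame indices of a clip with clip_len frames, taking every rate-th frame."""
    span = (clip_len - 1) * rate + 1  # number of source frames the clip covers
    if start < 0 or start + span > num_frames:
        raise ValueError("clip does not fit inside the video")
    return [start + i * rate for i in range(clip_len)]


def random_rate_sample(num_frames, clip_len=16, rates=(1, 2, 4, 8), rng=random):
    """Pick a random playback rate that fits the video and a random start frame.

    Returns (frame_indices, rate_label), where rate_label is the position of
    the chosen rate in `rates` and could serve as a classification target.
    """
    fitting = [r for r in rates if (clip_len - 1) * r + 1 <= num_frames]
    if not fitting:
        raise ValueError("video too short for any candidate rate")
    rate = rng.choice(fitting)
    span = (clip_len - 1) * rate + 1
    start = rng.randrange(num_frames - span + 1)
    return clip_indices(num_frames, clip_len, rate, start), rates.index(rate)
```

Passing a seeded `random.Random` instance as `rng` makes the sampling reproducible.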

Action Recognition

python ft_classify.py --gpu 0 --model_name c3d/r21d/r3d --pre_path [your pre-trained model] --split 1/2/3
python test_classify.py

Video Retrieval

Please refer to the code video_retrieval_samples.py of VCOP.
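
For orientation, retrieval evaluation of this kind is commonly done by nearest-neighbor search over extracted features. The sketch below shows a generic top-k retrieval accuracy with cosine similarity. It is not the VCOP script: the function names are hypothetical, features are assumed to be plain lists of floats, and a query counts as a hit when any of its k nearest gallery items shares its label.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def topk_accuracy(query_feats, query_labels, gallery_feats, gallery_labels, k=1):
    """Fraction of queries whose k nearest gallery items include the query label."""
    hits = 0
    for feat, label in zip(query_feats, query_labels):
        # Rank gallery items from most to least similar to the query.
        ranked = sorted(
            range(len(gallery_feats)),
            key=lambda i: cosine(feat, gallery_feats[i]),
            reverse=True,
        )
        if label in {gallery_labels[i] for i in ranked[:k]}:
            hits += 1
    return hits / len(query_feats)
```

In a typical setup the test clips would act as queries and the training clips as the gallery, with accuracy reported for several values of k.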

Model zoo

Models

Pre-trained PRP models on split 1 of UCF101: C3D (OneDrive); R3D (OneDrive); R(2+1)D (OneDrive)

Action Recognition Results

Architecture	UCF101(%)	HMDB51(%)
C3D	69.1	34.5
R3D	66.5	29.7
R(2+1)D	72.1	35.0

License

This project is released under the Apache 2.0 license.

Citation

Please cite the following paper if you find PRP useful for your research:

@InProceedings{Yao_2020_CVPR,
  author = {Yao, Yuan and Liu, Chang and Luo, Dezhao and Zhou, Yu and Ye, Qixiang},
  title = {Video Playback Rate Perception for Self-Supervised Spatio-Temporal Representation Learning},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2020}
}

Owner

yuanyao366
