[AAAI2022] Source code for our paper《Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning》

Last update: Oct 26, 2022

Related tags

Deep Learning SSVC

Overview

SSVC

The source code for paper [Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning]

samples of the generated motion-preserved video with threshold $\alpha=0.5$.

Requirements

python3
torch1.1+
PIL
FrEIA==0.2 (Flow-based model)
lintel==1.0 (Decode mp4 videos on the fly)

Structure

backbone
data
- lists: train/val lists (.txt)
- augmentation.py: train/val data augmentation during ssl pre-training
- vDataLoader.py: custom your path to data list
model
- advflow: flow-based model
- classifier.py: linear classifier for down-stream tasks
- infonce.py: combine S$^2$VC with MoCo
flow
- pre-trained flow-based model weights
utils
main_pretrain.py: the main function for self-supervised pretrain
main_eval.py: the main function for supervised fine-tune

Self-supervised Pretrain

DDP

python -m torch.distributed.launch --nproc_per_node=1 --master_port 1234 main_pretrain.py --net r3d18 --img_dim 112 --seq_len 16 --aug_type 1 -t 0.5 -bsz 64 --gpu 0,1 --dataset XX

Single GPU

python main_pretrain.py --net r3d18 --img_dim 112 --seq_len 16 --aug_type 1 -t 0.5 -bsz 64 --gpu 0 --dataset XX

Evaluation

NN-Retrieval

python main_eval.py --retrieval --test SSL_Pt_Model_PTH --dataset XX --gpu X

Finetune

# fine-tune overall model
python main_eval.py --train_what ft --pretrain SSL_Pt_Model_PTH --dataset XX --gpu XX \
--net r3d18 --img_dim 224 --seq_len 32

# freeze backbone, finetune last layer
python main_eval.py --train_what last --pretrain SSL_Pt_Model_PTH --dataset XX --gpu XX \
--net r3d18 --img_dim 224 --seq_len 32

Test

python main_eval.py --train_what XX --ten_crop --test Sup_Ft_Model_PTH --gpu X \
--dataset XX --net r3d18 --img_dim 224 --seq_len 32

[AAAI2022] Source code for our paper《Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning》

Related tags

Overview

SSVC

Requirements

Structure

Self-supervised Pretrain

DDP

Single GPU

Evaluation

NN-Retrieval

Finetune

Test

Owner

CVPR2021: Temporal Context Aggregation Network for Temporal Action Proposal Refinement

A Unified Framework and Analysis for Structured Knowledge Grounding

ByteTrack超详细教程！训练自己的数据集&&摄像头实时检测跟踪

UIUCTF 2021 Public Challenge Repository

Neural Turing Machine (NTM) & Differentiable Neural Computer (DNC) with pytorch & visdom

[SDM 2022] Towards Similarity-Aware Time-Series Classification

HIVE: Evaluating the Human Interpretability of Visual Explanations

Blender add-on: Add to Cameras menu: View → Camera, View → Add Camera, Camera → View, Previous Camera, Next Camera

Bayesian Optimization Library for Medical Image Segmentation.

General neural ODE and DAE modules for power system dynamic modeling.

Implements the training, testing and editing tools for "Pluralistic Image Completion"

A simple log parser and summariser for IIS web server logs

Realtime micro-expression recognition using OpenCV and PyTorch

Scalable Multi-Agent Reinforcement Learning

A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning

A copy of Ares that costs 30 fucking dollars.

A blender add-on that automatically re-aligns wrong axis objects.

A plug-and-play library for neural networks written in Python

Non-Official Pytorch implementation of "Face Identity Disentanglement via Latent Space Mapping" https://arxiv.org/abs/2005.07728 Using StyleGAN2 instead of StyleGAN

SelfRemaster: SSL Speech Restoration