Video Contrastive Learning with Global Context (VCLR)

Overview

This is the official PyTorch implementation of our VCLR paper.

Install dependencies

  • environments
    conda create --name vclr python=3.7
    conda activate vclr
    conda install numpy scipy scikit-learn matplotlib scikit-image
    pip install torch==1.7.1 torchvision==0.8.2
    pip install opencv-python tqdm termcolor gcc7 ffmpeg tensorflow==1.15.2
    pip install mmcv-full==1.2.7

Prepare datasets

Please refer to PREPARE_DATA to prepare the datasets.

Prepare pretrained MoCo weights

In this work, we follow SeCo and use the pretrained weights of MoCov2 as initialization.

cd ~
git clone https://github.com/amazon-research/video-contrastive-learning.git
cd video-contrastive-learning
mkdir pretrain && cd pretrain
wget https://dl.fbaipublicfiles.com/moco/moco_checkpoints/moco_v2_200ep/moco_v2_200ep_pretrain.pth.tar
cd ..
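
As a rough illustration of what "using MoCo v2 weights as initialization" usually involves: the official MoCo checkpoint stores its weights under `state_dict`, with the query encoder's parameters prefixed by `module.encoder_q.`. The sketch below keeps only those backbone weights. The helper name `extract_query_encoder` and the toy dictionary are hypothetical, and whether this repository performs the same remapping internally is an assumption, not something stated here.

```python
def extract_query_encoder(state_dict, prefix="module.encoder_q."):
    """Return only the query-encoder backbone weights of a MoCo-style state dict.

    Hypothetical helper for illustration; not part of the VCLR code base.
    """
    backbone = {}
    for name, tensor in state_dict.items():
        if not name.startswith(prefix):
            continue  # skips the key encoder (module.encoder_k.*) and the queue
        short = name[len(prefix):]
        if short.startswith("fc."):
            continue  # drop the MoCo projection head, keep the backbone only
        backbone[short] = tensor
    return backbone


# Toy stand-in for torch.load(path, map_location="cpu")["state_dict"];
# strings replace tensors so the example runs without PyTorch.
toy_state_dict = {
    "module.encoder_q.conv1.weight": "w0",
    "module.encoder_q.layer1.0.conv1.weight": "w1",
    "module.encoder_q.fc.0.weight": "head",
    "module.encoder_k.conv1.weight": "k0",
    "module.queue": "q",
}
backbone = extract_query_encoder(toy_state_dict)
print(sorted(backbone))
```

With a real checkpoint, the resulting dictionary could be passed to a ResNet-50's `load_state_dict(..., strict=False)`.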

Self-supervised pretraining

bash shell/main_train.sh

Checkpoints will be saved to ./results

Downstream tasks

Linear evaluation

To evaluate the effectiveness of self-supervised learning, we conduct a linear evaluation (probing) on the Kinetics400 dataset. We first extract features with the pretrained weights and then train an SVM classifier on them to see how well the learned features perform.

bash shell/eval_svm.sh
  • Results

    Arch      Pretrained dataset  Epoch  Pretrained model  Acc. on K400
    ResNet50  Kinetics400         400    Download link     64.1
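
The two-stage protocol described in this section (frozen features, then an SVM) can be sketched as follows. This is a minimal illustration, not the contents of shell/eval_svm.sh: the function `linear_probe`, the L2 normalization step, the value of `C`, and the synthetic features are all assumptions made for the example. It relies on numpy and scikit-learn, which the environment setup installs.

```python
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.svm import LinearSVC


def linear_probe(train_feats, train_labels, test_feats, test_labels, C=1.0):
    """Fit a linear SVM on frozen features and return test accuracy.

    Hypothetical helper; hyperparameters are illustrative only.
    """
    clf = LinearSVC(C=C, max_iter=5000)
    clf.fit(normalize(train_feats), train_labels)  # L2-normalize each feature vector
    return clf.score(normalize(test_feats), test_labels)


# Synthetic stand-in for features extracted by a pretrained backbone:
# 4 classes, 16-dimensional features clustered around random centers.
rng = np.random.default_rng(0)
centers = rng.normal(size=(4, 16)) * 5.0


def fake_features(n_per_class):
    labels = np.repeat(np.arange(4), n_per_class)
    feats = centers[labels] + rng.normal(size=(labels.size, 16))
    return feats, labels


train_x, train_y = fake_features(50)
test_x, test_y = fake_features(20)
accuracy = linear_probe(train_x, train_y, test_x, test_y)
print(f"linear-probe accuracy on synthetic features: {accuracy:.3f}")
```

Because the backbone stays frozen, the accuracy of the linear classifier reflects the quality of the learned representation rather than the capacity of the classifier.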

Video retrieval

bash shell/eval_retrieval.sh

Action recognition & action localization

Here, we use mmaction2 for both tasks. If you are not familiar with mmaction2, you can read the official documentation.

Installation

  • Step1: Install mmaction2

    To make sure the results can be reproduced, please use our forked version of mmaction2 (version: 0.11.0):

    conda activate vclr
    cd ~
    git clone https://github.com/KuangHaofei/mmaction2
    
    cd mmaction2
    pip install -v -e .
  • Step2: Prepare the pretrained weights

    Our pretrained backbone uses a different format from the mmaction2 backbone, so it must be converted to the mmaction2 format. We provide converted versions of our K400 pretrained weights for both TSN and TSM. We also provide the conversion script; you can find it here.

    Download the pretrained weights into the checkpoints directory:

    cd ~/mmaction2
    mkdir -p checkpoints
    wget -P checkpoints https://haofeik-data.s3.amazonaws.com/VCLR/pretrained/vclr_mm.pth
    wget -P checkpoints https://haofeik-data.s3.amazonaws.com/VCLR/pretrained/vclr_mm_tsm.pth

Action recognition

Make sure you have prepared the datasets and environment as described in the previous steps. From the root directory of mmaction2, follow the steps below to fine-tune the TSN or TSM models for action recognition.

For each dataset, the training and testing settings can be found in the configuration files.

  • UCF101

    • config file: tsn_ucf101.py
    • train command:
      ./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_ucf101.py 8 \
        --validate --seed 0 --deterministic
    • test command:
      python tools/test.py configs/recognition/tsn/vclr/tsn_ucf101.py \
        work_dirs/vclr/ucf101/latest.pth \
        --eval top_k_accuracy mean_class_accuracy --out result.json
  • HMDB51

    • config file: tsn_hmdb51.py
    • train command:
      ./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_hmdb51.py 8 \
        --validate --seed 0 --deterministic
    • test command:
      python tools/test.py configs/recognition/tsn/vclr/tsn_hmdb51.py \
        work_dirs/vclr/hmdb51/latest.pth \
        --eval top_k_accuracy mean_class_accuracy --out result.json
  • SomethingSomethingV2: TSN

    • config file: tsn_sthv2.py
    • train command:
      ./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_sthv2.py 8 \
        --validate --seed 0 --deterministic
    • test command:
      python tools/test.py configs/recognition/tsn/vclr/tsn_sthv2.py \
        work_dirs/vclr/tsn_sthv2/latest.pth \
        --eval top_k_accuracy mean_class_accuracy --out result.json
  • SomethingSomethingV2: TSM

    • config file: tsm_sthv2.py
    • train command:
      ./tools/dist_train.sh configs/recognition/tsm/vclr/tsm_sthv2.py 8 \
        --validate --seed 0 --deterministic
    • test command:
      python tools/test.py configs/recognition/tsm/vclr/tsm_sthv2.py \
        work_dirs/vclr/tsm_sthv2/latest.pth \
        --eval top_k_accuracy mean_class_accuracy --out result.json
  • ActivityNet

    • config file: tsn_activitynet.py
    • train command:
      ./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_activitynet.py 8 \
        --validate --seed 0 --deterministic
    • test command:
      python tools/test.py configs/recognition/tsn/vclr/tsn_activitynet.py \
        work_dirs/vclr/tsn_activitynet/latest.pth \
        --eval top_k_accuracy mean_class_accuracy --out result.json
  • Results

    Arch  Dataset               Finetuned model  Acc.
    TSN   UCF101                Download link    85.6
    TSN   HMDB51                Download link    54.1
    TSN   SomethingSomethingV2  Download link    33.3
    TSM   SomethingSomethingV2  Download link    52.0
    TSN   ActivityNet           Download link    71.9
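
The test commands above request two metrics, top_k_accuracy and mean_class_accuracy. For readers unfamiliar with them, here is a minimal pure-Python sketch of how these metrics are commonly defined. It is not mmaction2's implementation, and the toy scores and labels are made up for illustration.

```python
from collections import defaultdict


def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


def mean_class_accuracy(scores, labels):
    """Average of per-class top-1 accuracy, so every class counts equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for row, label in zip(scores, labels):
        pred = max(range(len(row)), key=lambda c: row[c])
        total[label] += 1
        correct[label] += pred == label
    return sum(correct[c] / total[c] for c in total) / len(total)


# Four samples, three classes; each row holds per-class scores.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [0, 1, 1, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
mca = mean_class_accuracy(scores, labels)
print(top1, top2, round(mca, 4))  # 0.75 1.0 0.8333
```

Mean class accuracy differs from top-1 accuracy when classes are imbalanced: here class 1 has two samples and one is misclassified, so it pulls the per-class average down to 0.5 for that class.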

Action localization

  • Step 1: Follow the previous section to fine-tune TSN on ActivityNet. The commands below assume the finetuned model is saved at work_dirs/vclr/tsn_activitynet/latest.pth

  • Step 2: Extract ActivityNet features

    cd ~/mmaction2/tools/data/activitynet/
    
    python tsn_feature_extraction.py --data-prefix /home/ubuntu/data/ActivityNet/rawframes \
      --data-list /home/ubuntu/data/ActivityNet/anet_train_video.txt \
      --output-prefix /home/ubuntu/data/ActivityNet/rgb_feat \
      --modality RGB --ckpt /home/ubuntu/mmaction2/work_dirs/vclr/tsn_activitynet/latest.pth
    
    python tsn_feature_extraction.py --data-prefix /home/ubuntu/data/ActivityNet/rawframes \
      --data-list /home/ubuntu/data/ActivityNet/anet_val_video.txt \
      --output-prefix /home/ubuntu/data/ActivityNet/rgb_feat \
      --modality RGB --ckpt /home/ubuntu/mmaction2/work_dirs/vclr/tsn_activitynet/latest.pth
    
    python activitynet_feature_postprocessing.py \
      --rgb /home/ubuntu/data/ActivityNet/rgb_feat \
      --dest /home/ubuntu/data/ActivityNet/mmaction_feat

    Note: the root directory of ActivityNet is /home/ubuntu/data/ActivityNet/ in our case. Replace it with the path to your own copy of the dataset.

  • Step 3: Train and test the BMN model

    • train
      cd ~/mmaction2
      ./tools/dist_train.sh configs/localization/bmn/bmn_acitivitynet_feature_vclr.py 2 \
        --work-dir work_dirs/vclr/bmn_activitynet --validate --seed 0 --deterministic --bmn
    • test
      python tools/test.py configs/localization/bmn/bmn_acitivitynet_feature_vclr.py \
        work_dirs/vclr/bmn_activitynet/latest.pth \
        --bmn --eval AR@AN --out result.json
  • Results

    Arch  Dataset      Finetuned model  AUC   AR@100
    BMN   ActivityNet  Download link    65.5  73.8

Feature visualization

Our feature visualization code is available here.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
