Official Code for VideoLT: Large-scale Long-tailed Video Recognition (ICCV 2021)

Overview

Pytorch Code for VideoLT

[Website][Paper]

Updates

  • [10/29/2021] Features uploaded to Google Drive, for access please send us an e-mail: zhangxing18 at fudan.edu.cn
  • [09/28/2021] Features uploaded to Aliyun Drive(deprecated), for access please send us an e-mail: zhangxing18 at fudan.edu.cn
  • [08/23/2021] Checkpoint links uploaded, sorry we are handling campus network bandwidth limitation, dataset will be released in this weeek.
  • [08/15/2021] Code released. Dataset download links and checkpoints links will be updated in a week.
  • [07/29/2021] Dataset released, visit https://videolt.github.io/ for downloading.
  • [07/23/2021] VideoLT is accepted by ICCV2021.

concept

Overview

VideoLT is a large-scale long-tailed video recognition dataset, as a step toward real-world video recognition. We provide VideoLT dataset and long-tailed baselines in this repo including:

Data Preparation

Please visit https://videolt.github.io/ to obtain download links. We provide raw videos and extracted features.

For using extracted features, please modify dataset/dutils.py and set the correct path to features.

Model Zoo

The baseline scripts and checkpoints are provided in MODELZOO.md.

FrameStack

FrameStack is simple yet effective approach for long-tailed video recognition which re-samples training data at the frame level and adopts a dynamic sampling strategy based on knowledge learned by the network. The rationale behind FrameStack is to dynamically sample more frames from videos in tail classes and use fewer frames for those from head classes.

framestack

Usage

Requirement

pip install -r requirements.txt

Prepare Data Path

  1. Modify FEATURE_NAME, PATH_TO_FEATURE and FEATURE_DIM in dataset/dutils.py.

  2. Set ROOT in dataset/dutils.py to labels folder. The directory structure is:

    labels
    |-- count-labels-train.lst
    |-- test.lst
    |-- test_videofolder.txt
    |-- train.lst
    |-- train_videofolder.txt
    |-- val_videofolder.txt
    `-- validate.lst

Train

We provide scripts for training. Please refer to MODELZOO.md.

Example training scripts:

FEATURE_NAME='ResNet101'

export CUDA_VISIBLE_DEVICES='2'
python base_main.py  \
     --augment "mixup" \
     --feature_name $FEATURE_NAME \
     --lr 0.0001 \
     --gd 20 --lr_steps 30 60 --epochs 100 \
     --batch-size 128 -j 16 \
     --eval-freq 5 \
     --print-freq 20 \
     --root_log=$FEATURE_NAME-log \
     --root_model=$FEATURE_NAME'-checkpoints' \
     --store_name=$FEATURE_NAME'_bs128_lr0.0001_lateavg_mixup' \
     --num_class=1004 \
     --model_name=NonlinearClassifier \
     --train_num_frames=60 \
     --val_num_frames=150 \
     --loss_func=BCELoss \

Note: Set args.resample, args.augment and args.loss_func can apply multiple long-tailed stratigies.

Options:

    args.resample: ['None', 'CBS','SRS']
    args.augment : ['None', 'mixup', 'FrameStack']
    args.loss_func: ['BCELoss', 'LDAM', 'EQL', 'CBLoss', 'FocalLoss']

Test

We provide scripts for testing in scripts. Modify CKPT to saved checkpoints.

Example testing scripts:

FEATURE_NAME='ResNet101'
CKPT='VideoLT_checkpoints/ResNet-101/ResNet101_bs128_lr0.0001_lateavg_mixup/ckpt.best.pth.tar'

export CUDA_VISIBLE_DEVICES='1'
python base_test.py \
     --resume $CKPT \
     --feature_name $FEATURE_NAME \
     --batch-size 128 -j 16 \
     --print-freq 20 \
     --num_class=1004 \
     --model_name=NonlinearClassifier \
     --train_num_frames=60 \
     --val_num_frames=150 \
     --loss_func=BCELoss \

Citing

If you find VideoLT helpful for your research, please consider citing:

@misc{zhang2021videolt,
title={VideoLT: Large-scale Long-tailed Video Recognition}, 
author={Xing Zhang and Zuxuan Wu and Zejia Weng and Huazhu Fu and Jingjing Chen and Yu-Gang Jiang and Larry Davis},
year={2021},
eprint={2105.02668},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Owner
Skye
Soul Programmer & Science Enthusiast
Skye
Neural HMMs are all you need (for high-quality attention-free TTS)

Neural HMMs are all you need (for high-quality attention-free TTS) Shivam Mehta, Éva Székely, Jonas Beskow, and Gustav Eje Henter This is the official

Shivam Mehta 0 Oct 28, 2022
The PyTorch implementation of Directed Graph Contrastive Learning (DiGCL), NeurIPS-2021

Directed Graph Contrastive Learning Paper | Poster | Supplementary The PyTorch implementation of Directed Graph Contrastive Learning (DiGCL). In this

Tong Zekun 28 Jan 08, 2023
Tree LSTM implementation in PyTorch

Tree-Structured Long Short-Term Memory Networks This is a PyTorch implementation of Tree-LSTM as described in the paper Improved Semantic Representati

Riddhiman Dasgupta 529 Dec 10, 2022
Character-Input - Create a program that asks the user to enter their name and their age

Character-Input Create a program that asks the user to enter their name and thei

PyLaboratory 0 Feb 06, 2022
Implementation of Perceiver, General Perception with Iterative Attention in TensorFlow

Perceiver This Python package implements Perceiver: General Perception with Iterative Attention by Andrew Jaegle in TensorFlow. This model builds on t

Rishit Dagli 84 Oct 15, 2022
A computational block to solve entity alignment over textual attributes in a knowledge graph creation pipeline.

How to apply? Create your config.ini file following the example provided in config.ini Choose one of the options below to run: Run with Python3 pip in

Scientific Data Management Group 3 Jun 23, 2022
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Ubisoft 76 Dec 30, 2022
Official code for On Path Integration of Grid Cells: Group Representation and Isotropic Scaling (NeurIPS 2021)

On Path Integration of Grid Cells: Group Representation and Isotropic Scaling This repo contains the official implementation for the paper On Path Int

Ruiqi Gao 39 Nov 10, 2022
Python project to take sound as input and output as RGB + Brightness values suitable for DMX

sound-to-light Python project to take sound as input and output as RGB + Brightness values suitable for DMX Current goals: Get one pixel working: Vary

Bobby Cox 1 Nov 17, 2021
This is the repo for the paper `SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization'. (published in Bioinformatics'21)

SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization This is the code for our paper ``SumGNN: Multi-typed Drug

Yue Yu 58 Dec 21, 2022
Final term project for Bayesian Machine Learning Lecture (XAI-623)

Mixquality_AL Final Term Project For Bayesian Machine Learning Lecture (XAI-623) Youtube Link The presentation is given in YoutubeLink Problem Formula

JeongEun Park 3 Jan 18, 2022
RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into tables through jointly extracting intervention, outcome and outcome measure entities and their relations.

Randomised controlled trial abstract result tabulator RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into

2 Sep 16, 2022
Controlling Hill Climb Racing with Hand Tacking

Controlling Hill Climb Racing with Hand Tacking Opened Palm for Gas Closed Palm for Brake

Rohit Ingole 3 Jan 18, 2022
Garbage Detection system which will detect objects based on whether it is plastic waste or plastics or just garbage.

Garbage Detection using Yolov5 on Jetson Nano 2gb Developer Kit. Garbage detection system which will detect objects based on whether it is plastic was

Rishikesh A. Bondade 2 May 13, 2022
NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations

NL-Augmenter 🦎 → 🐍 The NL-Augmenter is a collaborative effort intended to add transformations of datasets dealing with natural language. Transformat

684 Jan 09, 2023
abess: Fast Best-Subset Selection in Python and R

abess: Fast Best-Subset Selection in Python and R Overview abess (Adaptive BEst Subset Selection) library aims to solve general best subset selection,

297 Dec 21, 2022
Starter Code for VALUE benchmark

StarterCode for VALUE Benchmark This is the starter code for VALUE Benchmark [website], [paper]. This repository currently supports all baseline model

VALUE Benchmark 73 Dec 09, 2022
Notepy is a full-featured Notepad Python app

Notepy A full featured python text-editor Notable features Autocompletion for parenthesis and quote Auto identation Syntax highlighting Compile and ru

Mirko Rovere 11 Sep 28, 2022
Dataset para entrenamiento de yoloV3 para 4 clases

Deteccion de objetos en video Este repo basado en el proyecto PyTorch YOLOv3 para correr detección de objetos sobre video. Construí sobre este proyect

1 Nov 01, 2021
Joint Detection and Identification Feature Learning for Person Search

Person Search Project This repository hosts the code for our paper Joint Detection and Identification Feature Learning for Person Search. The code is

712 Dec 17, 2022