MVFNet: Multi-View Fusion Network for Efficient Video Recognition (AAAI 2021)

Overview

MVFNet: Multi-View Fusion Network for Efficient Video Recognition (AAAI 2021)

1

Overview

We release the code of the MVFNet (Multi-View Fusion Network). The core code to implement the Multi-View Fusion Module is codes/models/modules/MVF.py.

[Mar 24, 2021] We has released the code of MVFNet.

[Dec 20, 2020] MVFNet has been accepted by AAAI 2021.

Prerequisites

All dependencies can be installed using pip:

python -m pip install -r requirements.txt

Our experiments run on Python 3.7 and PyTorch 1.5. Other versions should work but are not tested.

Download Pretrained Models

  • Download ImageNet pre-trained models
cd pretrained
sh download_imgnet.sh
  • Download K400 pre-trained models

Please refer to Model Zoo.

Data Preparation

Please refer to DATASETS.md for data preparation.

Model Zoo

Architecture Dataset T x interval Top-1 Acc. Pre-trained model Train log Test log
MVFNet-ResNet50 Kinetics-400 4x16 74.2% Download link Log link Log link
MVFNet-ResNet50 Kinetics-400 8x8 76.0% Download link Miss Log link
MVFNet-ResNet50 Kinetics-400 16x4 77.0% Download link Log link Log link
MVFNet-ResNet101 Kinetics-400 4x16 76.0% Download link Log link Log link
MVFNet-ResNet101 Kinetics-400 8x8 77.4% Download link Log link Log link
MVFNet-ResNet101 Kinetics-400 16x4 78.4% Download link Log link Log link

Testing

  • For 3 crops, 10 clips, the processing of testing
# Dataset: Kinetics-400
# Architecture: R50_8x8 [email protected]=76.0%
bash scripts/dist_test_recognizer.sh configs/MVFNet/K400/mvf_kinetics400_2d_rgb_r50_dense.py ckpt_path 8 --fcn_testing

Training

This implementation supports multi-gpu, DistributedDataParallel training, which is faster and simpler.

  • For example, to train MVFNet-ResNet50 on Kinetics400 with 8 gpus, you can run:
bash scripts/dist_train_recognizer.sh configs/MVFNet/K400/mvf_kinetics400_2d_rgb_r50_dense.py 8
  • We also provide the script to train MVFNet on Kinetics400 with multiple machines (e.g., 2 machines and 16 GPUs).
# For first machine, --master_addr is the ip of your first machine
bash scripts/dist_train_multinode_1.sh configs/MVFNet/K400/mvf_kinetics400_2d_rgb_r50_dense.py 8
# For second machine, --master_addr is still the ip of your first machine
bash scripts/dist_train_multinode_2.sh configs/MVFNet/K400/mvf_kinetics400_2d_rgb_r50_dense.py 8

Acknowledgements

We especially thank the contributors of the mmaction codebase for providing helpful code.

License

This repository is released under the Apache-2.0. license as found in the LICENSE file.

Citation

If you think our work is useful, please feel free to cite our paper 😆 :

@inproceedings{wu2020MVFNet,
  author    = {Wu, Wenhao and He, Dongliang and Lin, Tianwei and Li, Fu and Gan, Chuang and Ding, Errui},
  title     = {MVFNet: Multi-View Fusion Network for Efficient Video Recognition},
  booktitle = {AAAI},
  year      = {2021}
}

Contact

For any question, please file an issue or contact

Wenhao Wu: [email protected]
An pytorch implementation of Masked Autoencoders Are Scalable Vision Learners

An pytorch implementation of Masked Autoencoders Are Scalable Vision Learners This is a coarse version for MAE, only make the pretrain model, the fine

FlyEgle 214 Dec 29, 2022
EMNLP 2021: Single-dataset Experts for Multi-dataset Question-Answering

MADE (Multi-Adapter Dataset Experts) This repository contains the implementation of MADE (Multi-adapter dataset experts), which is described in the pa

Princeton Natural Language Processing 68 Jul 18, 2022
Global Rhythm Style Transfer Without Text Transcriptions

Global Prosody Style Transfer Without Text Transcriptions This repository provides a PyTorch implementation of AutoPST, which enables unsupervised glo

Kaizhi Qian 193 Dec 30, 2022
Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning Reference Abeßer, J. & Müller, M. Towards Audio Domain Adapt

Jakob Abeßer 2 Jul 06, 2022
An efficient toolkit for Face Stylization based on the paper "AgileGAN: Stylizing Portraits by Inversion-Consistent Transfer Learning"

MMGEN-FaceStylor English | 简体中文 Introduction This repo is an efficient toolkit for Face Stylization based on the paper "AgileGAN: Stylizing Portraits

OpenMMLab 182 Dec 27, 2022
Few-NERD: Not Only a Few-shot NER Dataset

Few-NERD: Not Only a Few-shot NER Dataset This is the source code of the ACL-IJCNLP 2021 paper: Few-NERD: A Few-shot Named Entity Recognition Dataset.

THUNLP 319 Dec 30, 2022
A library for using chemistry in your applications

Chemistry in python Resources Used The following items are not made by me! Click the words to go to the original source Periodic Tab Json - Used in -

Tech Penguin 28 Dec 17, 2021
TResNet: High Performance GPU-Dedicated Architecture

TResNet: High Performance GPU-Dedicated Architecture paperV2 | pretrained models Official PyTorch Implementation Tal Ridnik, Hussam Lawen, Asaf Noy, I

426 Dec 28, 2022
Autonomous Movement from Simultaneous Localization and Mapping

Autonomous Movement from Simultaneous Localization and Mapping About us Built by a group of Clarkson University students with the help from Professor

14 Nov 07, 2022
Code for our CVPR 2022 Paper "GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection"

GEN-VLKT Code for our CVPR 2022 paper "GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection". Contributed by Yue Lia

Yue Liao 47 Dec 04, 2022
2021:"Bridging Global Context Interactions for High-Fidelity Image Completion"

TFill arXiv | Project This repository implements the training, testing and editing tools for "Bridging Global Context Interactions for High-Fidelity I

Chuanxia Zheng 111 Jan 08, 2023
Simple embedding based text classifier inspired by fastText, implemented in tensorflow

FastText in Tensorflow This project is based on the ideas in Facebook's FastText but implemented in Tensorflow. However, it is not an exact replica of

Alan Patterson 306 Dec 02, 2022
PyTorch implementation of "VRT: A Video Restoration Transformer"

VRT: A Video Restoration Transformer Jingyun Liang, Jiezhang Cao, Yuchen Fan, Kai Zhang, Rakesh Ranjan, Yawei Li, Radu Timofte, Luc Van Gool Computer

Jingyun Liang 837 Jan 09, 2023
PyTorch implementations of the paper: "Learning Independent Instance Maps for Crowd Localization"

IIM - Crowd Localization This repo is the official implementation of paper: Learning Independent Instance Maps for Crowd Localization. The code is dev

tao han 91 Nov 10, 2022
VR Viewport Pose Model for Quantifying and Exploiting Frame Correlations

This repository contains the introduction to the collected VRViewportPose dataset and the code for the IEEE INFOCOM 2022 paper: "VR Viewport Pose Model for Quantifying and Exploiting Frame Correlatio

0 Aug 10, 2022
Official source code of Fast Point Transformer, CVPR 2022

Fast Point Transformer Project Page | Paper This repository contains the official source code and data for our paper: Fast Point Transformer Chunghyun

182 Dec 23, 2022
Integrated physics-based and ligand-based modeling.

ComBind ComBind integrates data-driven modeling and physics-based docking for improved binding pose prediction and binding affinity prediction. Given

Dror Lab 44 Oct 26, 2022
PyTorch reimplementation of the paper Involution: Inverting the Inherence of Convolution for Visual Recognition [CVPR 2021].

Involution: Inverting the Inherence of Convolution for Visual Recognition Unofficial PyTorch reimplementation of the paper Involution: Inverting the I

Christoph Reich 100 Dec 01, 2022
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.

[CVPR2022] Thin-Plate Spline Motion Model for Image Animation Source code of the CVPR'2022 paper "Thin-Plate Spline Motion Model for Image Animation"

yoyo-nb 1.4k Dec 30, 2022