CVPR2021: Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Overview

Temporal Context Aggregation Network - PyTorch

This repo holds the PyTorch implementation of the paper "Temporal Context Aggregation Network for Temporal Action Proposal Refinement", which was accepted at CVPR 2021.

[Arxiv Preprint]

Update

  • 2021.07.02: Released proposals, checkpoints, and features for TCANet!
  • 2021.05.31: Repository for TCANet

Paper Introduction

Temporal action proposal generation aims to estimate temporal intervals of actions in untrimmed videos, which is a challenging yet important task in the video understanding field. The proposals generated by current methods still suffer from inaccurate temporal boundaries and inferior confidence used for retrieval owing to the lack of efficient temporal modeling and effective boundary context utilization. In this paper, we propose Temporal Context Aggregation Network (TCANet) to generate high-quality action proposals through "local and global" temporal context aggregation and complementary as well as progressive boundary refinement. Specifically, we first design a Local-Global Temporal Encoder (LGTE), which adopts the channel grouping strategy to efficiently encode both "local and global" temporal inter-dependencies. Furthermore, both the boundary and internal context of proposals are adopted for frame-level and segment-level boundary regressions, respectively. Temporal Boundary Regressor (TBR) is designed to combine these two regression granularities in an end-to-end fashion, which achieves the precise boundaries and reliable confidence of proposals through progressive refinement. Extensive experiments are conducted on three challenging datasets: HACS, ActivityNet-v1.3, and THUMOS-14, where TCANet can generate proposals with high precision and recall. By combining with the existing action classifier, TCANet can obtain remarkable temporal action detection performance compared with other methods. Not surprisingly, the proposed TCANet won the 1st place in the CVPR 2020 - HACS challenge leaderboard on temporal action localization task.

Prerequisites

This code is implemented in PyTorch 1.5.1 and Python 3.

Code and Data Preparation

Get the code

Clone this repo with git:

git clone https://github.com/qingzhiwu/Temporal-Context-Aggregation-Network-Pytorch.git

Download Datasets

We currently support experiments on the publicly available HACS dataset for temporal action proposal generation. To download this dataset, please use the official HACS downloader to fetch the videos from YouTube.

To extract visual features, we adopt a SlowFast model pretrained on the training set of HACS. Please refer to the SlowFast repo to extract features.

For convenience of training and testing, we provide the rescaled features on Google Cloud or Baidu Yun [Code: x3ve].

The Baidu Yun link provides:

-- features/: SlowFast features for training, validation and testing.
-- checkpoint/: Pre-trained TCANet model for SlowFast features provided by us.
-- proposals/: BMN proposals processed by us.
-- classification/: The best classification results we used in the paper and the 2020 HACS challenge.

Training and Testing of TCANet

All configurations of TCANet are stored in opts.py, where you can modify the training and model parameters.
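As a rough illustration of how such a configuration file is commonly organized, the sketch below defines an argparse parser for the command-line flags that appear in the commands of this README. It is not the actual content of opts.py: the function names `build_parser` and `parse_opt`, all types, and all default values are assumptions made for this example, and the real file may define many more options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags shown in this README.

    Types and defaults are placeholders chosen for illustration.
    """
    parser = argparse.ArgumentParser(description="TCANet options (illustrative sketch)")
    parser.add_argument("--mode", type=str, default="train")
    parser.add_argument("--checkpoint_path", type=str, default="./checkpoint/")
    parser.add_argument("--video_anno", type=str,
                        default="/path/to/HACS_segments_v1.1.1.json")
    parser.add_argument("--feature_path", type=str, default="/path/to/feature/")
    parser.add_argument("--train_proposals_path", type=str,
                        default="/path/to/pem_input_100/in/proposals")
    parser.add_argument("--test_proposals_path", type=str,
                        default="/path/to/pem_input/in/proposals")
    parser.add_argument("--part_idx", type=int, default=-1)
    parser.add_argument("--gpu", type=str, default="0")
    parser.add_argument("--classifier_result", type=str,
                        default="/path/to/classifier/{}94.32.json")
    return parser


def parse_opt(argv=None) -> dict:
    """Parse argv (or sys.argv[1:] when argv is None) into a plain dict."""
    return vars(build_parser().parse_args(argv))
```

With a layout like this, `parse_opt(["--mode", "inference", "--part_idx", "2"])` returns a dictionary in which `mode` is `"inference"` and `part_idx` is the integer `2`, while every other key keeps its default.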

1. Unzip Proposals

tar -jxvf hacs.bmn.pem.slowfast101.t200.wd1e-5.warmup.pem_input_100.tar.bz2 -C ./
tar -jxvf hacs.bmn.pem.slowfast101.t200.wd1e-5.warmup.pem_input.tar.bz2 -C ./

2. Unzip Features

# for training features
cd features/
cat slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.training.tar.bz2.*>slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.training.tar.gz
tar -zxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.training.tar.gz
tar -jxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.training.tar.bz2 -C .

# for validation features
cd features/
cat slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.validation.tar.bz2.*>slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.validation.tar.gz
tar -zxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.validation.tar.gz
tar -jxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.validation.tar.bz2 -C .

# for testing features
cd features/
cat slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.testing.tar.bz2.*>slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.testing.tar.gz
tar -zxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.testing.tar.gz
tar -jxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.testing.tar.bz2 -C .
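On systems without `cat`, the concatenation step above can be reproduced in Python. The following is a minimal sketch, assuming the downloaded parts share a common prefix and that their suffixes sort in the intended order (as with `aa`, `ab`, ... or `00`, `01`, ...); the helper name `join_parts` is hypothetical and not part of this repo.

```python
import glob
import os
import shutil


def join_parts(prefix: str, output_path: str) -> list:
    """Concatenate every file matching `prefix*` into `output_path`.

    Parts are joined in sorted order, mirroring `cat prefix* > output`.
    Returns the list of part paths that were joined.
    """
    parts = sorted(glob.glob(glob.escape(prefix) + "*"))
    # Never read the output file as one of its own inputs.
    target = os.path.abspath(output_path)
    parts = [p for p in parts if os.path.abspath(p) != target]
    if not parts:
        raise FileNotFoundError(f"no parts found for prefix {prefix!r}")
    with open(output_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # streamed copy, low memory use
    return parts
```

For the training features, this would be called with the prefix `slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.training.tar.bz2.` and the output name `slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.training.tar.gz`, after which the `tar` commands above apply unchanged.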

4. Training of TCANet

python3 main_tcanet.py --mode train \
--checkpoint_path ./checkpoint/ \
--video_anno /path/to/HACS_segments_v1.1.1.json \
--feature_path /path/to/feature/ \
--train_proposals_path /path/to/pem_input_100/in/proposals \
--test_proposals_path /path/to/pem_input/in/proposals

We also provide a trained TCANet model under ./checkpoint in the Baidu Yun link.

6. Testing of TCANet

# We split the dataset into 4 parts and run inference on them with 4 GPUs, one part per GPU
python3 main_tcanet.py  --mode inference --part_idx 0 --gpu 0 --classifier_result /path/to/classifier/{}94.32.json
python3 main_tcanet.py  --mode inference --part_idx 1 --gpu 1 --classifier_result /path/to/classifier/{}94.32.json
python3 main_tcanet.py  --mode inference --part_idx 2 --gpu 2 --classifier_result /path/to/classifier/{}94.32.json
python3 main_tcanet.py  --mode inference --part_idx 3 --gpu 3 --classifier_result /path/to/classifier/{}94.32.json
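If you prefer to start the four inference jobs from a single script, the commands above can be generated and launched with Python's `subprocess` module. This is only a sketch: the helper names `build_inference_commands` and `run_all` are hypothetical, and it assumes that the four parts are meant to run concurrently and that part `i` should use GPU `i`, as in the commands above.

```python
import subprocess


def build_inference_commands(classifier_result: str, num_parts: int = 4) -> list:
    """Return one argument list per dataset part, pairing part i with GPU i."""
    commands = []
    for idx in range(num_parts):
        commands.append([
            "python3", "main_tcanet.py",
            "--mode", "inference",
            "--part_idx", str(idx),
            "--gpu", str(idx),
            "--classifier_result", classifier_result,
        ])
    return commands


def run_all(commands: list) -> list:
    """Start every command at once, wait for all of them, return exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]
```

Calling `run_all(build_inference_commands("/path/to/classifier/{}94.32.json"))` from the repository root would start the four processes and return their exit codes once all have finished; a non-zero code indicates that the corresponding part failed.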

7. Post-processing and generating final results

python3 main_tcanet.py  --mode inference --part_idx -1

Other Info

Citation

Please cite the following paper if you find TCANet useful for your research:

@inproceedings{qing2021temporal,
  title={Temporal Context Aggregation Network for Temporal Action Proposal Refinement},
  author={Qing, Zhiwu and Su, Haisheng and Gan, Weihao and Wang, Dongliang and Wu, Wei and Wang, Xiang and Qiao, Yu and Yan, Junjie and Gao, Changxin and Sang, Nong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={485--494},
  year={2021}
}

Contact

For any questions, please file an issue or contact:

Zhiwu Qing: [email protected]