CVPR2021: Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Overview

Temporal Context Aggregation Network - Pytorch

This repo holds the PyTorch implementation of the paper "Temporal Context Aggregation Network for Temporal Action Proposal Refinement", which was accepted at CVPR 2021.

[Arxiv Preprint]

Update

  • 2021.07.02: Updated proposals, checkpoints, and features for TCANet!
  • 2021.05.31: Repository for TCANet

Paper Introduction

Temporal action proposal generation aims to estimate temporal intervals of actions in untrimmed videos, which is a challenging yet important task in the video understanding field. The proposals generated by current methods still suffer from inaccurate temporal boundaries and inferior confidence used for retrieval owing to the lack of efficient temporal modeling and effective boundary context utilization. In this paper, we propose Temporal Context Aggregation Network (TCANet) to generate high-quality action proposals through "local and global" temporal context aggregation and complementary as well as progressive boundary refinement. Specifically, we first design a Local-Global Temporal Encoder (LGTE), which adopts the channel grouping strategy to efficiently encode both "local and global" temporal inter-dependencies. Furthermore, both the boundary and internal context of proposals are adopted for frame-level and segment-level boundary regressions, respectively. Temporal Boundary Regressor (TBR) is designed to combine these two regression granularities in an end-to-end fashion, which achieves the precise boundaries and reliable confidence of proposals through progressive refinement. Extensive experiments are conducted on three challenging datasets: HACS, ActivityNet-v1.3, and THUMOS-14, where TCANet can generate proposals with high precision and recall. By combining with the existing action classifier, TCANet can obtain remarkable temporal action detection performance compared with other methods. Not surprisingly, the proposed TCANet won the 1st place in the CVPR 2020 - HACS challenge leaderboard on temporal action localization task.

Prerequisites

This code is implemented in PyTorch 1.5.1 and Python 3.

Code and Data Preparation

Get the code

Clone this repo with git:

git clone https://github.com/qingzhiwu/Temporal-Context-Aggregation-Network-Pytorch.git

Download Datasets

We currently support experiments on the publicly available HACS dataset for temporal action proposal generation. To download the dataset, please use the official HACS downloader to fetch the videos from YouTube.

To extract visual features, we adopt a SlowFast model pretrained on the HACS training set. Please refer to the SlowFast repo to extract features.

For convenience of training and testing, we provide the rescaled features on Google Cloud or Baidu Yun [Code: x3ve].

In the Baidu Yun link, we provide:

-- features/: SlowFast features for training, validation and testing.
-- checkpoint/: Pre-trained TCANet model for SlowFast features provided by us.
-- proposals/: BMN proposals processed by us.
-- classification/: The best classification results we used in the paper and the 2020 HACS challenge.

Training and Testing of TCANet

All configurations of TCANet are saved in opts.py, where you can modify the training and model parameters.
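
As a rough illustration of how such an options file is commonly organized, the sketch below declares the command-line flags that appear in the commands of this README using argparse. The function name parse_opts, the default values, and the argument types are assumptions made for illustration only; the actual opts.py in the repo is authoritative and may define these options differently.

```python
import argparse


def parse_opts(argv=None):
    """Hypothetical opts.py-style parser (a sketch, not the repo's actual code)."""
    parser = argparse.ArgumentParser(description="TCANet options (illustrative)")
    # Flags below are the ones used in the training/testing commands of this README.
    # All defaults and types are assumptions.
    parser.add_argument("--mode", type=str, default="train")
    parser.add_argument("--checkpoint_path", type=str, default="./checkpoint/")
    parser.add_argument("--video_anno", type=str, default="")
    parser.add_argument("--feature_path", type=str, default="")
    parser.add_argument("--train_proposals_path", type=str, default="")
    parser.add_argument("--test_proposals_path", type=str, default="")
    parser.add_argument("--part_idx", type=int, default=-1)
    parser.add_argument("--gpu", type=str, default="0")
    parser.add_argument("--classifier_result", type=str, default="")
    return parser.parse_args(argv)


# Parse an explicit argument list instead of sys.argv, for demonstration.
opts = parse_opts(["--mode", "inference", "--part_idx", "0", "--gpu", "0"])
print(opts.mode, opts.part_idx, opts.gpu)
```

Passing an explicit list to parse_opts keeps the sketch easy to try in an interpreter; calling it with no argument reads the real command line instead.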

1. Unzip Proposals

tar -jxvf hacs.bmn.pem.slowfast101.t200.wd1e-5.warmup.pem_input_100.tar.bz2 -C ./
tar -jxvf hacs.bmn.pem.slowfast101.t200.wd1e-5.warmup.pem_input.tar.bz2 -C ./

2. Unzip Features

# for training features
cd features/
cat slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.training.tar.bz2.*>slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.training.tar.gz
tar -zxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.training.tar.gz
tar -jxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.training.tar.bz2 -C .

# for validation features
cd features/
cat slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.validation.tar.bz2.*>slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.validation.tar.gz
tar -zxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.validation.tar.gz
tar -jxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.validation.tar.bz2 -C .

# for testing features
cd features/
cat slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.testing.tar.bz2.*>slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.testing.tar.gz
tar -zxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.testing.tar.gz
tar -jxvf slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.testing.tar.bz2 -C .
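
If cat and tar are not available (for example on Windows), the same join-then-extract steps can be approximated with the Python standard library. This is a minimal sketch: the helper names join_parts and extract_archive are hypothetical and not part of the repo, and it assumes the split parts sort into the correct order by file name, as they do with the shell glob above.

```python
import glob
import os
import shutil
import tarfile


def join_parts(prefix, output_path):
    """Concatenate all files named '<prefix>.*' (sorted by name) into output_path."""
    out_abs = os.path.abspath(output_path)
    parts = [
        p for p in sorted(glob.glob(glob.escape(prefix) + ".*"))
        if os.path.abspath(p) != out_abs  # never read the output file as a part
    ]
    if not parts:
        raise FileNotFoundError(f"no parts found for {prefix!r}")
    with open(output_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return parts


def extract_archive(archive_path, dest_dir):
    """Extract a tar archive; 'r:*' lets tarfile detect gzip or bzip2 itself."""
    with tarfile.open(archive_path, "r:*") as tar:
        tar.extractall(dest_dir)


# Example (file names taken from the training-feature commands above):
# name = "slowfast101.epoch9.87.52.finetune.pool.t.keep.t.s8.training"
# join_parts(name + ".tar.bz2", name + ".tar.gz")
# extract_archive(name + ".tar.gz", ".")
# extract_archive(name + ".tar.bz2", ".")
```

Only extract archives you trust, since tar files can contain arbitrary paths.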

4. Training of TCANet

python3 main_tcanet.py --mode train \
--checkpoint_path ./checkpoint/ \
--video_anno /path/to/HACS_segments_v1.1.1.json \
--feature_path /path/to/feature/ \
--train_proposals_path /path/to/pem_input_100/in/proposals \
--test_proposals_path /path/to/pem_input/in/proposals

We also provide a trained TCANet model under checkpoint/ in our Baidu Yun link.

6. Testing of TCANet

# We split the dataset into 4 parts and run inference on each part on its own GPU
python3 main_tcanet.py  --mode inference --part_idx 0 --gpu 0 --classifier_result /path/to/classifier/{}94.32.json
python3 main_tcanet.py  --mode inference --part_idx 1 --gpu 1 --classifier_result /path/to/classifier/{}94.32.json
python3 main_tcanet.py  --mode inference --part_idx 2 --gpu 2 --classifier_result /path/to/classifier/{}94.32.json
python3 main_tcanet.py  --mode inference --part_idx 3 --gpu 3 --classifier_result /path/to/classifier/{}94.32.json
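
Typed one after another in a single shell, the four commands above run sequentially. One way to start them together is a small launcher script. The sketch below is an assumption-laden example, not part of the repo: the helper names build_inference_commands and run_all are hypothetical, and it pairs part i with GPU i exactly as the commands above do.

```python
import subprocess


def build_inference_commands(classifier_result, num_parts=4):
    """Return one argv list per dataset part, pairing part i with GPU i."""
    return [
        ["python3", "main_tcanet.py", "--mode", "inference",
         "--part_idx", str(i), "--gpu", str(i),
         "--classifier_result", classifier_result]
        for i in range(num_parts)
    ]


def run_all(commands):
    """Start every command at once, then wait for all; return their exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]


# Example (path placeholder copied from the commands above):
# codes = run_all(build_inference_commands("/path/to/classifier/{}94.32.json"))
# if any(codes):
#     raise SystemExit("at least one inference part failed: %s" % codes)
```

Checking the exit codes before moving on to the post-processing step helps avoid merging incomplete results.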

7. Post-processing and generating final results

python3 main_tcanet.py  --mode inference --part_idx -1

Other Info

Citation

Please cite the following paper if you find TCANet useful for your research:

@inproceedings{qing2021temporal,
  title={Temporal Context Aggregation Network for Temporal Action Proposal Refinement},
  author={Qing, Zhiwu and Su, Haisheng and Gan, Weihao and Wang, Dongliang and Wu, Wei and Wang, Xiang and Qiao, Yu and Yan, Junjie and Gao, Changxin and Sang, Nong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={485--494},
  year={2021}
}

Contact

For any questions, please file an issue or contact:

Zhiwu Qing: [email protected]