Graph Convolutional Networks for Temporal Action Localization (ICCV2019)

Related tags

Deep LearningPGCN
Overview

Graph Convolutional Networks for Temporal Action Localization

This repo holds the codes and models for the PGCN framework presented on ICCV 2019

Graph Convolutional Networks for Temporal Action Localization Runhao Zeng*, Wenbing Huang*, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan, ICCV 2019, Seoul, Korea.

[Paper]

Updates

20/12/2019 We have uploaded the RGB features, trained models and evaluation results! We found that increasing the number of proposals to 800 in the testing further boosts the performance on THUMOS14. We have also updated the proposal list.

04/07/2020 We have uploaded the I3D features on Anet, the training configurations files in data/dataset_cfg.yaml and the proposal lists for Anet.

Contents



Usage Guide

Prerequisites

[back to top]

The training and testing in PGCN is reimplemented in PyTorch for the ease of use.

Other minor Python modules can be installed by running

pip install -r requirements.txt

Code and Data Preparation

[back to top]

Get the code

Clone this repo with git, please remember to use --recursive

git clone --recursive https://github.com/Alvin-Zeng/PGCN

Download Datasets

We support experimenting with two publicly available datasets for temporal action detection: THUMOS14 & ActivityNet v1.3. Here are some steps to download these two datasets.

  • THUMOS14: We need the validation videos for training and testing videos for testing. You can download them from the THUMOS14 challenge website.
  • ActivityNet v1.3: this dataset is provided in the form of YouTube URL list. You can use the official ActivityNet downloader to download videos from the YouTube.

Download Features

Here, we provide the I3D features (RGB+Flow) for training and testing.

THUMOS14: You can download it from Google Cloud or Baidu Cloud.

Anet: You can download the I3D Flow features from Baidu Cloud (password: jbsa) and the I3D RGB features from Google Cloud (Note: set the interval to 16 in ops/I3D_Pooling_Anet.py when training with RGB features)

Download Proposal Lists (ActivityNet)

Here, we provide the proposal lists for ActivityNet 1.3. You can download them from Google Cloud

Training PGCN

[back to top]

Plesse first set the path of features in data/dataset_cfg.yaml

train_ft_path: $PATH_OF_TRAINING_FEATURES
test_ft_path: $PATH_OF_TESTING_FEATURES

Then, you can use the following commands to train PGCN

python pgcn_train.py thumos14 --snapshot_pre $PATH_TO_SAVE_MODEL

After training, there will be a checkpoint file whose name contains the information about dataset and the number of epoch. This checkpoint file contains the trained model weights and can be used for testing.

Testing Trained Models

[back to top]

You can obtain the detection scores by running

sh test.sh TRAINING_CHECKPOINT

Here, TRAINING_CHECKPOINT denotes for the trained model. This script will report the detection performance in terms of mean average precision at different IoU thresholds.

The trained models and evaluation results are put in the "results" folder.

You can obtain the two-stream results on THUMOS14 by running

sh test_two_stream.sh

THUMOS14

[email protected] (%) RGB Flow RGB+Flow
P-GCN (I3D) 37.23 47.42 49.07 (49.64)

#####Here, 49.64% is obtained by setting the combination weights to Flow:RGB=1.2:1 and nms threshold to 0.32

Other Info

[back to top]

Citation

Please cite the following paper if you feel PGCN useful to your research

@inproceedings{PGCN2019ICCV,
  author    = {Runhao Zeng and
               Wenbing Huang and
               Mingkui Tan and
               Yu Rong and
               Peilin Zhao and
               Junzhou Huang and
               Chuang Gan},
  title     = {Graph Convolutional Networks for Temporal Action Localization},
  booktitle   = {ICCV},
  year      = {2019},
}

Contact

For any question, please file an issue or contact

Runhao Zeng: [email protected]
Owner
Runhao Zeng
Runhao Zeng
Code for Paper Predicting Osteoarthritis Progression via Unsupervised Adversarial Representation Learning

Predicting Osteoarthritis Progression via Unsupervised Adversarial Representation Learning (c) Tianyu Han and Daniel Truhn, RWTH Aachen University, 20

Tianyu Han 7 Nov 22, 2022
Code for DeepCurrents: Learning Implicit Representations of Shapes with Boundaries

DeepCurrents | Webpage | Paper DeepCurrents: Learning Implicit Representations of Shapes with Boundaries David Palmer*, Dmitriy Smirnov*, Stephanie Wa

Dima Smirnov 36 Dec 08, 2022
Photographic Image Synthesis with Cascaded Refinement Networks - Pytorch Implementation

Photographic Image Synthesis with Cascaded Refinement Networks-Pytorch (https://arxiv.org/abs/1707.09405) This is a Pytorch implementation of cascaded

Soumya Tripathy 63 Mar 27, 2022
PN-Net a neural field-based framework for depth estimation from single-view RGB images.

PN-Net We present a neural field-based framework for depth estimation from single-view RGB images. Rather than representing a 2D depth map as a single

1 Oct 02, 2021
“英特尔创新大师杯”深度学习挑战赛 赛道3:CCKS2021中文NLP地址相关性任务

ccks2021-track3 CCKS2021中文NLP地址相关性任务-赛道三-冠军方案 团队:我的加菲鱼- wodejiafeiyu 初赛第二/复赛第一/决赛第一 前言 19年开始,陆陆续续参加了一些比赛,拿到过一些top,比较懒一直都没分享过,这次比较幸运又拿了top1,打算分享下 分类的任务

shaochenjie 131 Dec 31, 2022
structured-generative-modeling

This repository contains the implementation for the paper Information Theoretic StructuredGenerative Modeling, Specially thanks for the open-source co

0 Oct 11, 2021
OpenMMLab Image Classification Toolbox and Benchmark

Introduction English | 简体中文 MMClassification is an open source image classification toolbox based on PyTorch. It is a part of the OpenMMLab project. D

OpenMMLab 1.8k Jan 03, 2023
U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection

The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."

Xuebin Qin 6.5k Jan 09, 2023
ONNX Runtime Web demo is an interactive demo portal showing real use cases running ONNX Runtime Web in VueJS.

ONNX Runtime Web demo is an interactive demo portal showing real use cases running ONNX Runtime Web in VueJS. It currently supports four examples for you to quickly experience the power of ONNX Runti

Microsoft 58 Dec 18, 2022
Pytorch version of VidLanKD: Improving Language Understanding viaVideo-Distilled Knowledge Transfer

VidLanKD Implementation of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer by Zineng Tang, Jaemin Cho, Hao Tan, Mohi

Zineng Tang 54 Dec 20, 2022
Real-time Neural Representation Fusion for Robust Volumetric Mapping

NeuralBlox: Real-Time Neural Representation Fusion for Robust Volumetric Mapping Paper | Supplementary This repository contains the implementation of

ETHZ ASL 106 Dec 24, 2022
一个运行在 𝐞𝐥𝐞𝐜𝐕𝟐𝐏 或 𝐪𝐢𝐧𝐠𝐥𝐨𝐧𝐠 等定时面板的签到项目

定时面板上的签到盒 一个运行在 𝐞𝐥𝐞𝐜𝐕𝟐𝐏 或 𝐪𝐢𝐧𝐠𝐥𝐨𝐧𝐠 等定时面板的签到项目 𝐞𝐥𝐞𝐜𝐕𝟐𝐏 𝐪𝐢𝐧𝐠𝐥𝐨𝐧𝐠 特别声明 本仓库发布的脚本及其中涉及的任何解锁和解密分析脚本,仅用于测试和学习研究,禁止用于商业用途,不能保证其合

Leon 1.1k Dec 30, 2022
rastrainer is a QGIS plugin to training remote sensing semantic segmentation model based on PaddlePaddle.

rastrainer rastrainer is a QGIS plugin to training remote sensing semantic segmentation model based on PaddlePaddle. UI TODO Init UI. Add Block. Add l

deepbands 5 Mar 04, 2022
Official PyTorch implementation of the Fishr regularization for out-of-distribution generalization

Fishr: Invariant Gradient Variances for Out-of-distribution Generalization Official PyTorch implementation of the Fishr regularization for out-of-dist

62 Dec 22, 2022
A unified framework for machine learning with time series

Welcome to sktime A unified framework for machine learning with time series We provide specialized time series algorithms and scikit-learn compatible

The Alan Turing Institute 6k Jan 08, 2023
MutualGuide is a compact object detector specially designed for embedded devices

Introduction MutualGuide is a compact object detector specially designed for embedded devices. Comparing to existing detectors, this repo contains two

ZHANG Heng 103 Dec 13, 2022
Image super-resolution through deep learning

srez Image super-resolution through deep learning. This project uses deep learning to upscale 16x16 images by a 4x factor. The resulting 64x64 images

David Garcia 5.3k Dec 28, 2022
ManipulaTHOR, a framework that facilitates visual manipulation of objects using a robotic arm

ManipulaTHOR: A Framework for Visual Object Manipulation Kiana Ehsani, Winson Han, Alvaro Herrasti, Eli VanderBilt, Luca Weihs, Eric Kolve, Aniruddha

AI2 65 Dec 30, 2022
Keyword spotting on Arm Cortex-M Microcontrollers

Keyword spotting for Microcontrollers This repository consists of the tensorflow models and training scripts used in the paper: Hello Edge: Keyword sp

Arm Software 1k Dec 30, 2022
DeepAL: Deep Active Learning in Python

DeepAL: Deep Active Learning in Python Python implementations of the following active learning algorithms: Random Sampling Least Confidence [1] Margin

Kuan-Hao Huang 583 Jan 03, 2023