Self-Regulated Learning for Egocentric Video Activity Anticipation

Last update: Sep 23, 2022

Introduction

This is a PyTorch implementation of the model described in our paper:

Z. Qi, S. Wang, C. Su, L. Su, Q. Huang, and Q. Tian. Self-Regulated Learning for Egocentric Video Activity Anticipation. TPAMI 2021.

Dependencies

PyTorch >= 1.0.1
CUDA 9.0.176
cuDNN 7.4.2
Python 3.6.8
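
A quick way to compare an existing environment against the versions listed above is sketched below. The helper names `parse_version` and `check_environment` are hypothetical and not part of this repository. The script only reports what it finds and makes no claim about which newer versions the code supports.

```python
import re
import sys

MIN_TORCH = (1, 0, 1)  # minimum PyTorch version listed above


def parse_version(text):
    """Turn a version string such as '1.13.1+cu117' into a tuple like (1, 13, 1)."""
    core = text.split("+")[0]  # drop local build tags such as '+cu117'
    return tuple(int(n) for n in re.findall(r"\d+", core)[:3])


def check_environment():
    """Collect installed versions; entries stay None when unavailable."""
    report = {
        "python": sys.version.split()[0],
        "torch": None,
        "cuda": None,
        "cudnn": None,
    }
    try:
        import torch
    except ImportError:
        return report
    report["torch"] = torch.__version__
    report["cuda"] = torch.version.cuda  # None on CPU-only builds
    report["cudnn"] = torch.backends.cudnn.version()  # None if cuDNN is unavailable
    report["torch_ok"] = parse_version(torch.__version__) >= MIN_TORCH
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```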

Data

EPIC-Kitchens dataset

For the raw data of the EPIC-Kitchens dataset, please refer to https://github.com/epic-kitchens/download-scripts to download.

For the three modality features (rgb, flow, obj), please refer to https://github.com/fpv-iplab/rulstm to download. After downloading, put them in the folder './data'.

EGTEA Gaze+ dataset

For the raw data of the EGTEA Gaze+ dataset, please refer to http://cbs.ic.gatech.edu/fpv/ to download.

For the extracted features, please refer to https://github.com/fpv-iplab/rulstm to download. After downloading, put them in the folder './data'.

50 Salads dataset

For the raw data of the 50 Salads dataset, please refer to http://cvip.computing.dundee.ac.uk/datasets/foodpreparation/50salads/ to download.

For the extracted features, please refer to https://github.com/colincsl/TemporalConvolutionalNetworks to download. After downloading, put them in the folder './data'.

Breakfast dataset

For the raw data of the Breakfast dataset, please refer to https://serre-lab.clps.brown.edu/resource/breakfast-actions-dataset/ to download.

For the extracted I3D features, please download them from Baidu (password: 'wub3') or Google Drive. After downloading, put them in the folder './data'.
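
Since every dataset's features go into './data', a short check of that folder before training can catch a misplaced download. The sketch below only lists what is present; the function name `summarize_data_dir` is hypothetical, and no particular file or subfolder names are assumed because this README does not specify them.

```python
from pathlib import Path


def summarize_data_dir(root="./data"):
    """Map each top-level entry under `root` to the number of files it holds."""
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError(f"expected a feature folder at {root_path}")
    summary = {}
    for entry in sorted(root_path.iterdir()):
        if entry.is_dir():
            # Count files at any depth below this sub-folder.
            summary[entry.name] = sum(1 for p in entry.rglob("*") if p.is_file())
        else:
            summary[entry.name] = 1
    return summary


if __name__ == "__main__":
    try:
        for name, count in summarize_data_dir().items():
            print(f"{name}: {count} file(s)")
    except FileNotFoundError as err:
        print(err)
```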

Training on the EPIC-Kitchens dataset

For the rgb features, run:

python main.py --gpu_ids 0 --batch_size 128 --wd 1e-5 --lr 0.1 --reinforce_verb_weight 0.01 --reinforce_noun_weight 0.01 --revision_weight 0.8 --mode train --modality rgb --hidden 1024 --feat_in 1024

Similar commands can be used for the flow or obj features.
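
As a sketch of what the flow and obj commands might look like, the helper below rebuilds the rgb training command and swaps in the modality and feature size. The feature sizes (1024 for rgb and flow, 352 for obj) are taken from the validation commands in this README. Reusing the rgb hyperparameters for the other modalities is an assumption, not something the README states, and the helper names `build_train_command` and `as_shell` are hypothetical.

```python
import shlex

# Feature sizes as they appear in this README's commands.
FEATURE_DIMS = {"rgb": 1024, "flow": 1024, "obj": 352}


def build_train_command(modality, gpu_ids="0"):
    """Return the argument list for training on a single modality."""
    if modality not in FEATURE_DIMS:
        raise ValueError(f"unknown modality: {modality!r}")
    dim = str(FEATURE_DIMS[modality])
    return [
        "python", "main.py",
        "--gpu_ids", gpu_ids,
        "--batch_size", "128",
        # Hyperparameters copied from the rgb command (assumed to carry over).
        "--wd", "1e-5",
        "--lr", "0.1",
        "--reinforce_verb_weight", "0.01",
        "--reinforce_noun_weight", "0.01",
        "--revision_weight", "0.8",
        "--mode", "train",
        "--modality", modality,
        "--hidden", dim,
        "--feat_in", dim,
    ]


def as_shell(cmd):
    """Join an argument list into a single shell-safe command line."""
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    for modality in ("flow", "obj"):
        print(as_shell(build_train_command(modality)))
```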

Validation on the EPIC-Kitchens dataset

Please download the pre-trained model weights from Baidu (password: 'wub3') or Google Drive, and put them in the folder './results/EPIC/base_srl/pre_trained/'.

For the rgb features:

python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality rgb --hidden 1024 --feat_in 1024 --resume_timestamp pre_trained

For the flow features:

python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality flow --hidden 1024 --feat_in 1024 --resume_timestamp pre_trained

For the obj features:

python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality obj --hidden 352 --feat_in 352 --resume_timestamp pre_trained

For the fusion of all three modalities:

python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality fusion --resume_timestamp pre_trained
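
The four validation runs differ only in the modality and feature-size flags, so they can be driven from one small script. The sketch below is a convenience wrapper, not part of the repository: `validation_command` and `run_all` are hypothetical names, and the script defaults to a dry run that only prints the commands.

```python
import subprocess

MODALITIES = ("rgb", "flow", "obj", "fusion")

# Extra flags per modality, copied from the commands above.
EXTRA_ARGS = {
    "rgb": ["--hidden", "1024", "--feat_in", "1024"],
    "flow": ["--hidden", "1024", "--feat_in", "1024"],
    "obj": ["--hidden", "352", "--feat_in", "352"],
    "fusion": [],
}


def validation_command(modality, timestamp="pre_trained"):
    """Return the argument list for validating one modality."""
    base = [
        "python", "main.py",
        "--gpu_ids", "0",
        "--batch_size", "128",
        "--mode", "validate",
        "--modality", modality,
    ]
    return base + EXTRA_ARGS[modality] + ["--resume_timestamp", timestamp]


def run_all(dry_run=True):
    """Print (dry run) or execute the validation command for every modality."""
    commands = [validation_command(m) for m in MODALITIES]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first run that exits with an error.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` from the repository root would execute the runs one after another.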

Citation

Please cite our paper if you use this code in your own work:

@article{qi2021self,
  title={Self-Regulated Learning for Egocentric Video Activity Anticipation},
  author={Qi, Zhaobo and Wang, Shuhui and Su, Chi and Su, Li and Huang, Qingming and Tian, Qi},
  journal={IEEE Transactions on Pattern Analysis \& Machine Intelligence},
  number={01},
  pages={1--1},
  year={2021},
  publisher={IEEE Computer Society}
}

Contact

If you have any problems with our code, feel free to contact us at:

[email protected]

Owner

qzhb
