Pytorch implementation for the Temporal and Object Quantification Networks (TOQ-Nets).

Overview

TOQ-Nets-PyTorch-Release

Pytorch implementation for the Temporal and Object Quantification Networks (TOQ-Nets).

TOQ-Nets

Temporal and Object Quantification Networks
Jiayuan Mao, Zhezheng Luo, Chuang Gan, Joshua B. Tenenbaum, Jiajun Wu, Leslie Pack Kaelbling, and Tomer D. Ullman
In International Joint Conference on Artificial Intelligence (IJCAI) 2021 (Poster)
[Paper] [Project Page] [BibTex]

@inproceedings{Mao2021Temporal,
    title={{Temporal and Object Quantification Networks}},
    author={Mao, Jiayuan and Luo, Zhezheng and Gan, Chuang and Tenenbaum, Joshua B. and Wu, Jiajun and Kaelbling, Leslie Pack and Ullman, Tomer D.},
    booktitle={International Joint Conferences on Artificial Intelligence},
    year={2021}
}

Prerequisites

  • Python 3
  • PyTorch 1.0 or higher, with NVIDIA CUDA Support
  • Other required python packages specified by requirements.txt. See the Installation.

Installation

Install Jacinle: Clone the package, and add the bin path to your global PATH environment variable:

git clone https://github.com/vacancy/Jacinle --recursive
export PATH=<path_to_jacinle>/bin:$PATH

Clone this repository:

git clone https://github.com/vacancy/TOQ-Nets-PyTorch --recursive

Create a conda environment for TOQ-Nets, and install the requirements. This includes the required python packages from both Jacinle TOQ-Nets. Most of the required packages have been included in the built-in anaconda package:

conda create -n nscl anaconda
conda install pytorch torchvision -c pytorch

Dataset preparation

We evaluate our model on four datasets: Soccer Event, RLBench, Toyota Smarthome and Volleyball. To run the experiments, you need to prepare them under NSPCL-Pytorch/data.

Soccer Event

Download link

RLBenck

Download link

Toyota Smarthome

Dataset can be obtained from the website: Toyota Smarthome: Real-World Activities of Daily Living

@InProceedings{Das_2019_ICCV,
    author = {Das, Srijan and Dai, Rui and Koperski, Michal and Minciullo, Luca and Garattoni, Lorenzo and Bremond, Francois and Francesca, Gianpiero},
    title = {Toyota Smarthome: Real-World Activities of Daily Living},
    booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
    month = {October},
    year = {2019}
}

Volleyball

Dataset can be downloaded from this github repo.

@inproceedings{msibrahiCVPR16deepactivity,
  author    = {Mostafa S. Ibrahim and Srikanth Muralidharan and Zhiwei Deng and Arash Vahdat and Greg Mori},
  title     = {A Hierarchical Deep Temporal Model for Group Activity Recognition.},
  booktitle = {2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2016}
}

Training and evaluation.

Standard 9-way classification task

To train the model on the standard 9-way classification task on the soccer dataset:

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1001 --run_name 9_way_classification -Mmodel-name "'NLTL_SAv3'" -Mdata-name "'LongVideoNvN'" -Mn_epochs 200 -Mbatch_size 128 -Mhp-train-estimate_inequality_parameters "(1,1)" -Mmodel-both_quantify False -Mmodel-depth 0

The hyper parameter estimate_inequality_parameters is to estimate the distribution of input physical features, and is only required when training TOQ-Nets (but not for baselines).

Few-shot actions

To train on regular actions and test on new actions:

jac-crun <gpu_ids> scripts/action_classification_softmax.py  -t 1002 --run_name few_shot -Mdata-name "'TrajectorySingleActionNvN_Wrapper_FewShot_Softmax'" -Mmodel-name "'NLTL_SAv3'" -Mlr 3e-3 -Mn_epochs 200 -Mbatch_size 128 -Mdata-new_actions "[('interfere', (50, 50, 2000)), ('sliding', (50, 50, 2000))]" -Mhp-train-finetune_period "(1,200)" -Mhp-train-estimate_inequality_parameters "(1,1)"

You can set the split of few-shot actions using -Mdata-new_actions, and the tuple (50, 50, 2000) represents the number of samples available in training validation and testing.

Generalization to more of fewer players and temporally warped trajectories.

To test the generalization to more or fewer players, as well as temporal warpped trajectories, first train the model on the standard 6v6 games:

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1003 --run_name generalization -Mmodel-name "'NLTL_SAv3'" -Mdata-name "'LongVideoNvN'" -Mdata-n_players 6 -Mn_epochs 200 -Mbatch_size 128 -Mhp-train-estimate_inequality_parameters "(1,1)" -Mlr 3e-3

Then to generalize to games with 11 players:

jac-crun 3 scripts/action_classification_softmax.py -t 1003 --run_name generalization_more_players --eval 200 -Mdata-name "'LongVideoNvN'" -Mdata-n_train 0.1 -Mdata-temporal "'exact'" -Mdata-n_players 11

The number 200 after --eval should be equal to the number of epochs of training. Note that 11 can be replace by any number of players from [3,4,6,8,11].

Similarly, to generalize to temporally warped trajectoryes:

jac-crun 3 scripts/action_classification_softmax.py -t 1003 --run_name generalization_time_warp --eval 200 -Mdata-name "'LongVideoNvN'" -Mdata-n_train 0.1 -Mdata-temporal "'all'" -Mdata-n_players 6

Baselines

We also provide the example commands for training all baselines:

STGCN

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1004 --run_name stgcn -Mmodel-name "'STGCN_SA'" -Mdata-name "'LongVideoNvN'" -Mdata-n_players 6 -Mmodel-n_agents 13 -Mn_epochs 200 -Mbatch_size 128

STGCN-LSTM

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1005 --run_name stgcn_lstm -Mmodel-name "'STGCN_LSTM_SA'" -Mdata-name "'LongVideoNvN'" -Mdata-n_players 6 -Mmodel-n_agents 13 -Mn_epochs 200 -Mbatch_size 128

Space-Time Region Graph

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1006 --run_name strg -Mmodel-name "'STRG_SA'" -Mdata-name "'LongVideoNvN'" -Mn_epochs 200 -Mbatch_size 128

Non-Local

jac-crun <gpu_ids> scripts/action_classification_softmax.py -t 1007 --run_name non_local -Mmodel-name "'NONLOCAL_SA'" -Mdata-name "'LongVideoNvN'" -Mn_epochs 200 -Mbatch_size 128
Owner
Zhezheng Luo
Zhezheng Luo
Learning to Draw: Emergent Communication through Sketching

Learning to Draw: Emergent Communication through Sketching This is the official code for the paper "Learning to Draw: Emergent Communication through S

19 Jul 22, 2022
MQBench Quantization Aware Training with PyTorch

MQBench Quantization Aware Training with PyTorch I am using MQBench(Model Quantization Benchmark)(http://mqbench.tech/) to quantize the model for depl

Ling Zhang 29 Nov 18, 2022
Code release for Universal Domain Adaptation(CVPR 2019)

Universal Domain Adaptation Code release for Universal Domain Adaptation(CVPR 2019) Requirements python 3.6+ PyTorch 1.0 pip install -r requirements.t

THUML @ Tsinghua University 229 Dec 23, 2022
KAPAO is an efficient multi-person human pose estimation model that detects keypoints and poses as objects and fuses the detections to predict human poses.

KAPAO (Keypoints and Poses as Objects) KAPAO is an efficient single-stage multi-person human pose estimation model that models keypoints and poses as

Will McNally 664 Dec 30, 2022
Implementation of a Transformer using ReLA (Rectified Linear Attention)

ReLA (Rectified Linear Attention) Transformer Implementation of a Transformer using ReLA (Rectified Linear Attention). It will also contain an attempt

Phil Wang 49 Oct 14, 2022
Repository for Traffic Accident Benchmark for Causality Recognition (ECCV 2020)

Causality In Traffic Accident (Under Construction) Repository for Traffic Accident Benchmark for Causality Recognition (ECCV 2020) Overview Data Prepa

Tackgeun 21 Nov 20, 2022
SMPL-X: A new joint 3D model of the human body, face and hands together

SMPL-X: A new joint 3D model of the human body, face and hands together [Paper Page] [Paper] [Supp. Mat.] Table of Contents License Description News I

Vassilis Choutas 1k Jan 09, 2023
Instance-wise Feature Importance in Time (FIT)

Instance-wise Feature Importance in Time (FIT) FIT is a framework for explaining time series perdiction models, by assigning feature importance to eve

Sana 46 Dec 25, 2022
🔊 Audio and fastai v2

Fastaudio An audio module for fastai v2. We want to help you build audio machine learning applications while minimizing the need for audio domain expe

152 Dec 28, 2022
🔪 Elimination based Lightweight Neural Net with Pretrained Weights

ELimNet ELimNet: Eliminating Layers in a Neural Network Pretrained with Large Dataset for Downstream Task Removed top layers from pretrained Efficient

snoop2head 4 Jul 12, 2022
A module that used for encrypt code which includes RSA and AES

软件加密模块 requirement: Crypto,pycryptodome,pyqt5 本地加密信息为随机字符串 使用说明 命令行参数 -h 帮助 -checkWorking 检查是否能正常工作,后接1确认指令 -checkEndDate 检查截至日期,后接1确认指令 -activateCode

2 Sep 27, 2022
Large-scale open domain KNOwledge grounded conVERsation system based on PaddlePaddle

Knover Knover is a toolkit for knowledge grounded dialogue generation based on PaddlePaddle. Knover allows researchers and developers to carry out eff

607 Dec 31, 2022
fklearn: Functional Machine Learning

fklearn: Functional Machine Learning fklearn uses functional programming principles to make it easier to solve real problems with Machine Learning. Th

nubank 1.4k Dec 07, 2022
Towards Boosting the Accuracy of Non-Latin Scene Text Recognition

Convolutional Recurrent Neural Network + CTCLoss | STAR-Net Code for paper "Towards Boosting the Accuracy of Non-Latin Scene Text Recognition" Depende

Sanjana Gunna 7 Aug 07, 2022
An example of Scatterbrain implementation (combining local attention and Performer)

An example of Scatterbrain implementation (combining local attention and Performer)

HazyResearch 97 Jan 02, 2023
Text to Image Generation with Semantic-Spatial Aware GAN

text2image This repository includes the implementation for Text to Image Generation with Semantic-Spatial Aware GAN This repo is not completely. Netwo

CVDDL 124 Dec 30, 2022
Kaggle Ultrasound Nerve Segmentation competition [Keras]

Ultrasound nerve segmentation using Keras (1.0.7) Kaggle Ultrasound Nerve Segmentation competition [Keras] #Install (Ubuntu {14,16}, GPU) cuDNN requir

179 Dec 28, 2022
Modular Gaussian Processes

Modular Gaussian Processes for Transfer Learning 🧩 Introduction This repository contains the implementation of our paper Modular Gaussian Processes f

Pablo Moreno-Muñoz 10 Mar 15, 2022
Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement

Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement In this project, we proposed a Domain Disentanglement Faster-RCNN (DDF)

19 Nov 24, 2022