Official implementation of MSR-GCN (ICCV 2021 paper)

Overview

MSR-GCN

Official implementation of MSR-GCN: Multi-Scale Residual Graph Convolution Networks for Human Motion Prediction (ICCV 2021 paper)

[Paper] [Supp] [Poster] [Slides]

Authors

  1. Lingwei Dang, School of Computer Science and Engineering, South China University of Technology, China, [email protected]
  2. Yongwei Nie, School of Computer Science and Engineering, South China University of Technology, China, [email protected]
  3. Chengjiang Long, JD Finance America Corporation, USA, [email protected]
  4. Qing Zhang, School of Computer Science and Engineering, Sun Yat-sen University, China, [email protected]
  5. Guiqing Li, School of Computer Science and Engineering, South China University of Technology, China, [email protected]

Overview

    Human motion prediction is a challenging task due to the stochasticity and aperiodicity of future poses. Recently, graph convolutional network (GCN) has been proven to be very effective to learn dynamic relations among pose joints, which is helpful for pose prediction. On the other hand, one can abstract a human pose recursively to obtain a set of poses at multiple scales. With the increase of the abstraction level, the motion of the pose becomes more stable, which benefits pose prediction too. In this paper, we propose a novel multi-scale residual Graph Convolution Network (MSR-GCN) for human pose prediction task in the manner of end-to-end. The GCNs are used to extract features from fine to coarse scale and then from coarse to fine scale. The extracted features at each scale are then combined and decoded to obtain the residuals between the input and target poses. Intermediate supervisions are imposed on all the predicted poses, which enforces the network to learn more representative features. Our proposed approach is evaluated on two standard benchmark datasets, i.e., the Human3.6M dataset and the CMU Mocap dataset. Experimental results demonstrate that our method outperforms the state-of-the-art approaches.

Dependencies

  • Pytorch 1.7.0+cu110
  • Python 3.8.5
  • Nvidia RTX 3090

Get the data

Human3.6m in exponential map can be downloaded from here.

CMU mocap was obtained from the repo of ConvSeq2Seq paper.

About datasets

Human3.6M

  • A pose in h3.6m has 32 joints, from which we choose 22, and build the multi-scale by 22 -> 12 -> 7 -> 4 dividing manner.
  • We use S5 / S11 as test / valid dataset, and the rest as train dataset, testing is done on the 15 actions separately, on each we use all data instead of the randomly selected 8 samples.
  • Some joints of the origin 32 have the same position
  • The input / output length is 10 / 25

CMU Mocap dataset

  • A pose in cmu has 38 joints, from which we choose 25, and build the multi-scale by 25 -> 12 -> 7 -> 4 dividing manner.
  • CMU does not have valid dataset, testing is done on the 8 actions separately, on each we use all data instead of the random selected 8 samples.
  • Some joints of the origin 38 have the same position
  • The input / output length is 10 / 25

Train

  • train on Human3.6M:

    python main.py --exp_name=h36m --is_train=1 --output_n=25 --dct_n=35 --test_manner=all

  • train on CMU Mocap:

    python main.py --exp_name=cmu --is_train=1 --output_n=25 --dct_n=35 --test_manner=all

Evaluate and visualize results

  • evaluate on Human3.6M:

    python main.py --exp_name=h36m --is_load=1 --model_path=ckpt/pretrained/h36m_in10out25dctn35_best_err57.9256.pth --output_n=25 --dct_n=35 --test_manner=all

  • evaluate on CMU Mocap:

    python main.py --exp_name=cmu --is_load=1 --model_path=ckpt/pretrained/cmu_in10out25dctn35_best_err37.2310.pth --output_n=25 --dct_n=35 --test_manner=all

Results

H3.6M-10/25/35-all 80 160 320 400 560 1000 -
walking 12.16 22.65 38.65 45.24 52.72 63.05 -
eating 8.39 17.05 33.03 40.44 52.54 77.11 -
smoking 8.02 16.27 31.32 38.15 49.45 71.64 -
discussion 11.98 26.76 57.08 69.74 88.59 117.59 -
directions 8.61 19.65 43.28 53.82 71.18 100.59 -
greeting 16.48 36.95 77.32 93.38 116.24 147.23 -
phoning 10.10 20.74 41.51 51.26 68.28 104.36 -
posing 12.79 29.38 66.95 85.01 116.26 174.33 -
purchases 14.75 32.39 66.13 79.63 101.63 139.15 -
sitting 10.53 21.99 46.26 57.80 78.19 120.02 -
sittingdown 16.10 31.63 62.45 76.84 102.83 155.45 -
takingphoto 9.89 21.01 44.56 56.30 77.94 121.87 -
waiting 10.68 23.06 48.25 59.23 76.33 106.25 -
walkingdog 20.65 42.88 80.35 93.31 111.87 148.21 -
walkingtogether 10.56 20.92 37.40 43.85 52.93 65.91 -
Average 12.11 25.56 51.64 62.93 81.13 114.18 57.93

CMU-10/25/35-all 80 160 320 400 560 1000 -
basketball 10.24 18.64 36.94 45.96 61.12 86.24 -
basketball_signal 3.04 5.62 12.49 16.60 25.43 49.99 -
directing_traffic 6.13 12.60 29.37 39.22 60.46 114.56 -
jumping 15.19 28.85 55.97 69.11 92.38 126.16 -
running 13.17 20.91 29.88 33.37 38.26 43.62 -
soccer 10.92 19.40 37.41 47.00 65.25 101.85 -
walking 6.38 10.25 16.88 20.05 25.48 36.78 -
washwindow 5.41 10.93 24.51 31.79 45.13 70.16 -
Average 8.81 15.90 30.43 37.89 51.69 78.67 37.23

Train

  • train on Human3.6M: python main.py --expname=h36m --is_train=1 --output_n=25 --dct_n=35 --test_manner=all
  • train on CMU Mocap: python main.py --expname=cmu --is_train=1 --output_n=25 --dct_n=35 --test_manner=all

Citation

If you use our code, please cite our work

@InProceedings{Dang_2021_ICCV,
    author    = {Dang, Lingwei and Nie, Yongwei and Long, Chengjiang and Zhang, Qing and Li, Guiqing},
    title     = {MSR-GCN: Multi-Scale Residual Graph Convolution Networks for Human Motion Prediction},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {11467-11476}
}

Acknowledgments

Some of our evaluation code and data process code was adapted/ported from LearnTrajDep by Wei Mao.

Licence

MIT

Owner
LevonDang
Pursuing the M.E. degree with the School of Computer Science and Engineering, South China University of Technology, 2020-.
LevonDang
LSTM-VAE Implementation and Relevant Evaluations

LSTM-VAE Implementation and Relevant Evaluations Before using any file in this repository, please create two directories under the root directory name

Lan Zhang 5 Oct 08, 2022
TUPÃ was developed to analyze electric field properties in molecular simulations

TUPÃ: Electric field analyses for molecular simulations What is TUPÃ? TUPÃ (pronounced as tu-pan) is a python algorithm that employs MDAnalysis engine

Marcelo D. Polêto 10 Jul 17, 2022
🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

xmu-xiaoma66 7.7k Jan 05, 2023
Extremely easy multi instancing software for minecraft speedrunning.

Easy Multi Extremely easy multi/single instancing software for minecraft speedrunning. A couple of goals of this project: Setup multi in minutes No fi

Duncan 8 Jul 16, 2022
Simple streamlit app to demonstrate HERE Tour Planning

Table of Contents About the Project Built With Getting Started Prerequisites Installation Usage Roadmap Contributing License Acknowledgements About Th

Amol 8 Sep 05, 2022
This is the official implementation of "One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval".

CORA This is the official implementation of the following paper: Akari Asai, Xinyan Yu, Jungo Kasai and Hannaneh Hajishirzi. One Question Answering Mo

Akari Asai 59 Dec 28, 2022
Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion"

MKGFormer Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion" Model Architecture Illu

ZJUNLP 68 Dec 28, 2022
Project dự đoán giá cổ phiếu bằng thuật toán LSTM gồm: code train và code demo

Web predicts stock prices using Long - Short Term Memory algorithm Give me some start please!!! User interface image: Choose: DayBegin, DayEnd, Stock

Vo Thuong Truong Nhon 8 Nov 11, 2022
The Environment I built to study Reinforcement Learning + Pokemon Showdown

pokemon-showdown-rl-environment The Environment I built to study Reinforcement Learning + Pokemon Showdown Been a while since I ran this. Think it is

3 Jan 16, 2022
GenGNN: A Generic FPGA Framework for Graph Neural Network Acceleration

GenGNN: A Generic FPGA Framework for Graph Neural Network Acceleration Stefan Abi-Karam*, Yuqi He*, Rishov Sarkar*, Lakshmi Sathidevi, Zihang Qiao, Co

Sharc-Lab 19 Dec 15, 2022
zeus is a Python implementation of the Ensemble Slice Sampling method.

zeus is a Python implementation of the Ensemble Slice Sampling method. Fast & Robust Bayesian Inference, Efficient Markov Chain Monte Carlo (MCMC), Bl

Minas Karamanis 197 Dec 04, 2022
A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset.

A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts to train RL agents to navigate the closed world and collect vi

MUGEN 11 Oct 22, 2022
This project aims at providing a concise, easy-to-use, modifiable reference implementation for semantic segmentation models using PyTorch.

Semantic Segmentation on PyTorch (include FCN, PSPNet, Deeplabv3, Deeplabv3+, DANet, DenseASPP, BiSeNet, EncNet, DUNet, ICNet, ENet, OCNet, CCNet, PSANet, CGNet, ESPNet, LEDNet, DFANet)

2.4k Jan 08, 2023
Tensorflow implementation of Semi-supervised Sequence Learning (https://arxiv.org/abs/1511.01432)

Transfer Learning for Text Classification with Tensorflow Tensorflow implementation of Semi-supervised Sequence Learning(https://arxiv.org/abs/1511.01

DONGJUN LEE 82 Oct 22, 2022
Framework for estimating the structures and parameters of Bayesian networks (DAGs) at per-sample resolution

Sample-specific Bayesian Networks A framework for estimating the structures and parameters of Bayesian networks (DAGs) at per-sample or per-patient re

Caleb Ellington 1 Sep 23, 2022
pix2pix in tensorflow.js

pix2pix in tensorflow.js This repo is moved to https://github.com/yining1023/pix2pix_tensorflowjs_lite See a live demo here: https://yining1023.github

Yining Shi 47 Oct 04, 2022
pytorch implementation of trDesign

trdesign-pytorch This repository is a PyTorch implementation of the trDesign paper based on the official TensorFlow implementation. The initial port o

Learn Ventures Inc. 41 Dec 29, 2022
[ACMMM 2021 Oral] Enhanced Invertible Encoding for Learned Image Compression

InvCompress Official Pytorch Implementation for "Enhanced Invertible Encoding for Learned Image Compression", ACMMM 2021 (Oral) Figure: Our framework

96 Nov 30, 2022
Hand gesture recognition model that can be used as a remote control for a smart tv.

Gesture_recognition The training data consists of a few hundred videos categorised into one of the five classes. Each video (typically 2-3 seconds lon

Pratyush Negi 1 Aug 11, 2022
Code for KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs

KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs Check out the paper on arXiv: https://arxiv.org/abs/2103.13744 This repo cont

Christian Reiser 373 Dec 20, 2022