Code for CVPR2019 Towards Natural and Accurate Future Motion Prediction of Humans and Animals

Overview

Motion prediction with Hierarchical Motion Recurrent Network

Introduction

This work concerns motion prediction of articulate objects such as human, fish and mice. Given a sequence of historical skeletal joints locations, we model the dynamics of the trajectory as kinematic chains of SE(3) group actions, parametrized by se(3) Lie algebra parameters. A sequence to sequence model employing our novel Hierarchical Motion Recurrent (HMR) Network as the decoder is employed to learn the temporal context of input pose sequences so as to predict future motion.

Instead of adopting the conventional Euclidean L2 loss function for the 3D coordinates, we propose a geodesic regression loss layer on the SE(3) manifold which provides the following advantages.

  • The SE(3) representation respects the anatomical and kinematic constraints of the skeletal model, i.e. bone lengths and physical degrees of freedom at the joints.
  • Spatial relations underlying the joints are fully captured.
  • Subtleties of temporal dynamics are better modelled in the manifold space than Euclidean space due to the absence of redundancy and constraints in the Lie algebra parameterization.

Train and Predict

The main file is found in motion_prediction.py.
To train and predict on default setting, execute the following with python 3.

python motion_prediction.py
FLAGS Default value Possible values Remarks
dataset --dataset Human Human, Fish, Mouse
datatype --datatype lie lie, xyz
action --action all all, actions listed below
training --training=1 0, 1
visualize --visualize=1 0, 1
longterm --longterm=0 0, 1 Super long-term prediction exceeding 60s.
dataset: Human
action: walking, eating or smoking.

To train and predict for different settings, simply set different value for the flags. An example of long term prediction for walking on the Human dataset is given below.

python motion_prediction.py --action walking --longterm=1

Possible actions for Human 3.6m

["directions", "discussion", "eating", "greeting", "phoning",
 "posing", "purchases", "sitting", "sittingdown", "smoking",
 "takingphoto", "waiting", "walking", "walkingdog", "walkingtogether"]

The configuration file is found in training_config.py. There are choices of different LSTM architectures as well as different loss functions that can be chosen in the configuration.

Checkpoint and Output

checkpoints are saved in:

./checkpoint/dataset[Human, Fish, Mouse]/datatype[lie, xyz]_model(_recurrent-steps_context-window_hidden-size)_loss/action/inputWindow_outputWindow

outputs are saved in:

./output/dataset[Human, Fish, Mouse]/datatype[lie, xyz]_model_(_recurrent-steps_context-window_hidden-size)_loss/action/inputWindow_outputWindow

*[ ] denotes possible arguments and ( ) is specific for our HMR model

Discovering and Achieving Goals via World Models

Discovering and Achieving Goals via World Models [Project Website] [Benchmark Code] [Video (2min)] [Oral Talk (13min)] [Paper] Russell Mendonca*1, Ole

Oleg Rybkin 71 Dec 22, 2022
The story of Chicken for Club Bing

Chicken Story tl;dr: The time when Microsoft banned my entire country for cheating at Club Bing. (A lot of the details are from memory so I've recreat

Eyal 142 May 16, 2022
Funnels: Exact maximum likelihood with dimensionality reduction.

Funnels This repository contains the code needed to reproduce the experiments from the paper: Funnels: Exact maximum likelihood with dimensionality re

2 Apr 21, 2022
Implementation of the master's thesis "Temporal copying and local hallucination for video inpainting".

Temporal copying and local hallucination for video inpainting This repository contains the implementation of my master's thesis "Temporal copying and

David Álvarez de la Torre 1 Dec 02, 2022
CPF: Learning a Contact Potential Field to Model the Hand-object Interaction

Contact Potential Field This repo contains model, demo, and test codes of our paper: CPF: Learning a Contact Potential Field to Model the Hand-object

Lixin YANG 99 Dec 26, 2022
FairyTailor: Multimodal Generative Framework for Storytelling

FairyTailor: Multimodal Generative Framework for Storytelling

Eden Bens 172 Dec 30, 2022
CoINN: Correlated-informed neural networks: a new machine learning framework to predict pressure drop in micro-channels

CoINN: Correlated-informed neural networks: a new machine learning framework to predict pressure drop in micro-channels Accurate pressure drop estimat

Alejandro Montanez 0 Jan 21, 2022
Diffusion Probabilistic Models for 3D Point Cloud Generation (CVPR 2021)

Diffusion Probabilistic Models for 3D Point Cloud Generation [Paper] [Code] The official code repository for our CVPR 2021 paper "Diffusion Probabilis

Shitong Luo 323 Jan 05, 2023
GAN-STEM-Conv2MultiSlice - Exploring Generative Adversarial Networks for Image-to-Image Translation in STEM Simulation

GAN-STEM-Conv2MultiSlice GAN method to help covert lower resolution STEM images generated by convolution methods to higher resolution STEM images gene

UW-Madison Computational Materials Group 2 Feb 10, 2021
Predict halo masses from simulations via graph neural networks

HaloGraphNet Predict halo masses from simulations via Graph Neural Networks. Given a dark matter halo and its galaxies, creates a graph with informati

Pablo Villanueva Domingo 20 Nov 15, 2022
joint detection and semantic segmentation, based on ultralytics/yolov5,

Multi YOLO V5——Detection and Semantic Segmentation Overeview This is my undergraduate graduation project which based on ultralytics YOLO V5 tag v5.0.

477 Jan 06, 2023
CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images

CFC-Net This project hosts the official implementation for the paper: CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Dete

ming71 55 Dec 12, 2022
MAGMA - a GPT-style multimodal model that can understand any combination of images and language

MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning Authors repo (alphabetical) Constantin (CoEich), Mayukh (Mayukh

Aleph Alpha GmbH 331 Jan 03, 2023
CLIP (Contrastive Language–Image Pre-training) for Italian

Italian CLIP CLIP (Radford et al., 2021) is a multimodal model that can learn to represent images and text jointly in the same space. In this project,

Italian CLIP 114 Dec 29, 2022
Cooperative Driving Dataset: a dataset for multi-agent driving scenarios

Cooperative Driving Dataset (CODD) The Cooperative Driving dataset is a synthetic dataset generated using CARLA that contains lidar data from multiple

Eduardo Henrique Arnold 124 Dec 28, 2022
Official implement of "CAT: Cross Attention in Vision Transformer".

CAT: Cross Attention in Vision Transformer This is official implement of "CAT: Cross Attention in Vision Transformer". Abstract Since Transformer has

100 Dec 15, 2022
Official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models.

GLIDE This is the official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing w

OpenAI 2.9k Jan 04, 2023
Accuracy Aligned. Concise Implementation of Swin Transformer

Accuracy Aligned. Concise Implementation of Swin Transformer This repository contains the implementation of Swin Transformer, and the training codes o

FengWang 77 Dec 16, 2022
Just-Now - This Is Just Now Login Friendlist Cloner Tools

JUST NOW LOGIN FRIENDLIST CLONER TOOLS Install $ apt update $ apt upgrade $ apt

MAHADI HASAN AFRIDI 21 Mar 09, 2022
Open-source python package for the extraction of Radiomics features from 2D and 3D images and binary masks.

pyradiomics v3.0.1 Build Status Linux macOS Windows Radiomics feature extraction in Python This is an open-source python package for the extraction of

Artificial Intelligence in Medicine (AIM) Program 842 Dec 28, 2022