[CVPR 2021] "Multimodal Motion Prediction with Stacked Transformers": official code implementation and project page.

Overview

mmTransformer

Introduction

  • This repo is official implementation for mmTransformer in pytorch. Currently, the core code of mmTransformer is implemented in the commercial project, we provide inference code of model with six trajectory propopals for your reference.

  • For other information, please refer to our paper Multimodal Motion Prediction with Stacked Transformers. (CVPR 2021) [Paper] [Webpage]

img

Set up your virtual environment

  • Initialize virtual environment:

    conda create -n mmTrans python=3.7
    
  • Install agoverse api. Please refer to this page.

  • Install the pytorch. The latest codes are tested on Ubuntu 16.04, CUDA11.1, PyTorch 1.8 and Python 3.7: (Note that we require the version of torch >= 1.5.0 for testing with pretrained model)

    pip install torch==1.8.0+cu111\
          torchvision==0.9.0+cu111\
          torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
    
  • For other requirement, please install with following command:

    pip install -r requirement.txt
    

Preparation

Download the code, model and data

  1. Clone this repo from the GitHub.

     git clone https://github.com/decisionforce/mmTransformer.git
    
  2. Download the pretrained model and data [here] (map.pkl for Python 3.7 is available [here]) and save it to ./models and ./interm_data.

     cd mmTransformer
     mkdir models
     mkdir interm_data
    
  3. Finally, your directory structure should look something like this:

     mmTransformer
     └── models
         └── demo.pt
     └── interm_data
         └── argoverse_info_val.pkl
         └── map.pkl
    

Preprocess the dataset

Alternatively, you can process the data from scratch using following commands.

  1. Download Argoverse dataset and create a symbolic link to ./data folder or use following commands.

     cd path/to/mmtransformer/root
     mkdir data
     cd data
     wget https://s3.amazonaws.com/argoai-argoverse/forecasting_val_v1.1.tar.gz 
     tar -zxvf  forecasting_val_v1.1.tar.gz
    
  2. Then extract the agent and map information from raw data via Argoverse API:

     python -m lib.dataset.argoverse_convertor ./config/demo.py
    
  3. Finally, your directory structure should look something like above illustrated.

Format of processed data in ‘argoverse_info_val.pkl’:

img

Format of map information in ‘map.pkl’:

img

Run the mmTransformer

For testing:

python Evaluation.py ./config/demo.py --model-name demo

Results

Here we showcase the expected results on validation set:

Model Expected results Results in paper
minADE 0.709 0.713
minFDE 1.081 1.153
MR (K=6) 10.2 10.6

TODO

  • We are going to open source our visualization tools and a demo result. (TBD)

Contact us

If you have any issues with the code, please contact to this email: [email protected]

Citation

If you find our work useful for your research, please consider citing the paper

@article{liu2021multimodal,
  title={Multimodal Motion Prediction with Stacked Transformers},
  author={Liu, Yicheng and Zhang, Jinghuai and Fang, Liangji and Jiang, Qinhong and Zhou, Bolei},
  journal={Computer Vision and Pattern Recognition},
  year={2021}
}
Owner
DeciForce: Crossroads of Machine Perception and Autonomy
Research on Unifying Machine Perception and Autonomy in Zhou Group
DeciForce: Crossroads of Machine Perception and Autonomy
PyTorch experiments with the Zalando fashion-mnist dataset

zalando-pytorch PyTorch experiments with the Zalando fashion-mnist dataset Project Organization ├── LICENSE ├── Makefile - Makefile with co

Federico Baldassarre 31 Sep 25, 2021
HINet: Half Instance Normalization Network for Image Restoration

HINet: Half Instance Normalization Network for Image Restoration Liangyu Chen, Xin Lu, Jie Zhang, Xiaojie Chu, Chengpeng Chen Paper: https://arxiv.org

303 Dec 31, 2022
Post-training Quantization for Neural Networks with Provable Guarantees

Post-training Quantization for Neural Networks with Provable Guarantees Authors: Jinjie Zhang ( Yixuan Zhou 2 Nov 29, 2022

My solution for the 7th place / 245 in the Umoja Hack 2022 challenge

Umoja Hack 2022 : Insurance Claim Challenge My solution for the 7th place / 245 in the Umoja Hack 2022 challenge Umoja Hack Africa is a yearly hackath

Souames Annis 17 Jun 03, 2022
Combinatorially Hard Games where the levels are procedurally generated

puzzlegen Implementation of two procedurally simulated environments with gym interfaces. IceSlider: the agent needs to reach and stop on the pink squa

Autonomous Learning Group 3 Jun 26, 2022
Official Codes for Graph Modularity:Towards Understanding the Cross-Layer Transition of Feature Representations in Deep Neural Networks.

Dynamic-Graphs-Construction Official Codes for Graph Modularity:Towards Understanding the Cross-Layer Transition of Feature Representations in Deep Ne

11 Dec 14, 2022
TorchX is a library containing standard DSLs for authoring and running PyTorch related components for an E2E production ML pipeline.

TorchX is a library containing standard DSLs for authoring and running PyTorch related components for an E2E production ML pipeline

193 Dec 22, 2022
Building a real-time environment using webcam frame division in OpenCV and classify cropped images using a fine-tuned vision transformers on hybryd datasets samples for facial emotion recognition.

Visual Transformer for Facial Emotion Recognition (FER) This project has the aim to build an efficient Visual Transformer for the Facial Emotion Recog

Mario Sessa 8 Dec 12, 2022
Optimizes image files by converting them to webp while also updating all references.

About Optimizes images by (re-)saving them as webp. For every file it replaced it automatically updates all references. Works on single files as well

Watermelon Wolverine 18 Dec 23, 2022
An end-to-end implementation of intent prediction with Metaflow and other cool tools

You Don't Need a Bigger Boat An end-to-end (Metaflow-based) implementation of an intent prediction flow for kids who can't MLOps good and wanna learn

Jacopo Tagliabue 614 Dec 31, 2022
ICCV2021 Oral SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks

Sign-Agnostic Convolutional Occupancy Networks Paper | Supplementary | Video | Teaser Video | Project Page This repository contains the implementation

64 Jan 05, 2023
A Robust Unsupervised Ensemble of Feature-Based Explanations using Restricted Boltzmann Machines

A Robust Unsupervised Ensemble of Feature-Based Explanations using Restricted Boltzmann Machines Understanding the results of deep neural networks is

Johan van den Heuvel 2 Dec 13, 2021
NeRF Meta-Learning with PyTorch

NeRF Meta Learning With PyTorch nerf-meta is a PyTorch re-implementation of NeRF experiments from the paper "Learned Initializations for Optimizing Co

Sanowar Raihan 78 Dec 18, 2022
なりすまし検出(anti-spoof-mn3)のWebカメラ向けデモ

FaceDetection-Anti-Spoof-Demo なりすまし検出(anti-spoof-mn3)のWebカメラ向けデモです。 モデルはPINTO_model_zoo/191_anti-spoof-mn3からONNX形式のモデルを使用しています。 Requirement mediapipe

KazuhitoTakahashi 8 Nov 18, 2022
Exploring Simple Siamese Representation Learning

G-SimSiam A PyTorch implementation which refers to repo for the paper Exploring Simple Siamese Representation Learning by Xinlei Chen & Kaiming He Add

zhuyun 1 Dec 19, 2021
A mini-course offered to Undergrad chemistry students

The best way to use this material is by forking it by click the Fork button at the top, right corner. Then you will get your own copy to play with! Th

Raghu 19 Dec 19, 2022
Agile SVG maker for python

Agile SVG Maker Need to draw hundreds of frames for a GIF? Need to change the style of all pictures in a PPT? Need to draw similar images with differe

SemiWaker 4 Sep 25, 2022
Disagreement-Regularized Imitation Learning

Due to a normalization bug the expert trajectories have lower performance than the rl_baseline_zoo reported experts. Please see the following link in

Kianté Brantley 25 Apr 28, 2022
[CVPR 2022 Oral] Versatile Multi-Modal Pre-Training for Human-Centric Perception

Versatile Multi-Modal Pre-Training for Human-Centric Perception Fangzhou Hong1  Liang Pan1  Zhongang Cai1,2,3  Ziwei Liu1* 1S-Lab, Nanyang Technologic

Fangzhou Hong 96 Jan 03, 2023
The codes and models in 'Gaze Estimation using Transformer'.

GazeTR We provide the code of GazeTR-Hybrid in "Gaze Estimation using Transformer". We recommend you to use data processing codes provided in GazeHub.

65 Dec 27, 2022