[CVPR 2021] "Multimodal Motion Prediction with Stacked Transformers": official code implementation and project page.

Overview

mmTransformer

Introduction

  • This repo is the official PyTorch implementation of mmTransformer. The core code of mmTransformer is currently used in a commercial project, so we provide inference code for the model with six trajectory proposals for your reference.

  • For other information, please refer to our paper Multimodal Motion Prediction with Stacked Transformers. (CVPR 2021) [Paper] [Webpage]

Set up your virtual environment

  • Initialize virtual environment:

    conda create -n mmTrans python=3.7
    
  • Install the Argoverse API. Please refer to this page.

  • Install PyTorch. The latest code is tested on Ubuntu 16.04, CUDA 11.1, PyTorch 1.8 and Python 3.7. (Note that torch >= 1.5.0 is required for testing with the pretrained model.)

    pip install torch==1.8.0+cu111 \
          torchvision==0.9.0+cu111 \
          torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
    
  • Install the other requirements with the following command:

    pip install -r requirement.txt
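
The note above asks for torch >= 1.5.0 when testing with the pretrained model. Below is a minimal sketch of how that condition could be checked from Python. `meets_minimum` is a hypothetical helper, not part of this repo, and it compares only the leading numeric part of a version string, ignoring suffixes such as `+cu111`.

```python
import re


def meets_minimum(version, minimum=(1, 5, 0)):
    """Return True if a version string like '1.8.0+cu111' is >= minimum.

    Hypothetical helper, not part of mmTransformer. Only the leading
    numeric components are compared; local suffixes are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. '2.0') is treated as 0.
    parts = tuple(int(p) if p is not None else 0 for p in match.groups())
    return parts >= minimum
```

With PyTorch installed, this could be called as `meets_minimum(torch.__version__)` after `import torch`.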
    

Preparation

Download the code, model and data

  1. Clone this repo from GitHub.

     git clone https://github.com/decisionforce/mmTransformer.git
    
  2. Download the pretrained model and data [here] (map.pkl for Python 3.7 is available [here]) and save them to ./models and ./interm_data, respectively.

     cd mmTransformer
     mkdir models
     mkdir interm_data
    
  3. Finally, your directory structure should look something like this:

     mmTransformer
     ├── models
     │   └── demo.pt
     └── interm_data
         ├── argoverse_info_val.pkl
         └── map.pkl
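
Before running the model, it can help to confirm that the files listed above are in place. The following is a minimal sketch; `missing_files` and `EXPECTED_FILES` are hypothetical names introduced for illustration, and the paths are taken from the directory listing above.

```python
from pathlib import Path

# Relative paths taken from the directory listing above.
EXPECTED_FILES = (
    "models/demo.pt",
    "interm_data/argoverse_info_val.pkl",
    "interm_data/map.pkl",
)


def missing_files(root, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under root.

    Hypothetical helper, not part of mmTransformer.
    """
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]
```

Calling `missing_files(".")` from the repository root returns an empty list when everything is present.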
    

Preprocess the dataset

Alternatively, you can process the data from scratch using the following commands.

  1. Download the Argoverse dataset and create a symbolic link to it named ./data, or download it directly with the following commands.

     cd path/to/mmtransformer/root
     mkdir data
     cd data
     wget https://s3.amazonaws.com/argoai-argoverse/forecasting_val_v1.1.tar.gz
     tar -zxvf forecasting_val_v1.1.tar.gz
    
  2. Then extract the agent and map information from the raw data via the Argoverse API:

     python -m lib.dataset.argoverse_convertor ./config/demo.py
    
  3. Finally, your directory structure should look like the one illustrated above.
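
Step 1 mentions a symbolic link as an alternative to downloading into ./data, but shows no command for it. A minimal sketch of that option, assuming the dataset has already been extracted elsewhere on disk; `link_dataset` is a hypothetical helper, not part of this repo.

```python
import os


def link_dataset(dataset_dir, link_path="data"):
    """Create link_path as a symbolic link to an existing dataset_dir.

    Hypothetical helper, not part of mmTransformer. It refuses to
    overwrite anything that already exists at link_path.
    """
    dataset_dir = os.path.abspath(dataset_dir)
    if not os.path.isdir(dataset_dir):
        raise FileNotFoundError(f"dataset directory not found: {dataset_dir}")
    if os.path.lexists(link_path):
        raise FileExistsError(f"{link_path} already exists")
    os.symlink(dataset_dir, link_path, target_is_directory=True)
    return link_path
```

Run from the repository root, `link_dataset("/path/to/argoverse")` would create ./data pointing at that directory; the path shown is a placeholder.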

Format of processed data in ‘argoverse_info_val.pkl’:

[Figure: structure of the processed data in argoverse_info_val.pkl]

Format of map information in ‘map.pkl’:

[Figure: structure of the map information in map.pkl]
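
The layout of the two pickle files can also be confirmed by inspecting them directly. The sketch below loads a pickle and summarizes the types and sizes of its top levels. `load_pickle` and `describe` are hypothetical helpers, and the code makes no assumption about the actual keys or nesting used by mmTransformer. Only unpickle files from a source you trust.

```python
import pickle


def load_pickle(path):
    """Load a pickle file such as interm_data/map.pkl."""
    with open(path, "rb") as f:
        return pickle.load(f)


def describe(obj, depth=2):
    """Return a short nested description of obj's types and sizes.

    Hypothetical helper; it assumes nothing about the real layout of
    the mmTransformer pickle files.
    """
    if isinstance(obj, dict):
        if depth == 0:
            return f"dict[{len(obj)}]"
        return {key: describe(value, depth - 1) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return f"{type(obj).__name__}[{len(obj)}]"
    shape = getattr(obj, "shape", None)  # e.g. NumPy arrays
    if shape is not None:
        return f"{type(obj).__name__}{tuple(shape)}"
    return type(obj).__name__
```

For a large file, a small `depth` keeps the summary readable, for example `describe(load_pickle("interm_data/map.pkl"), depth=1)`.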

Run the mmTransformer

For testing:

python Evaluation.py ./config/demo.py --model-name demo

Results

Here we showcase the expected results on the validation set:

Metric      Expected results   Results in paper
minADE      0.709              0.713
minFDE      1.081              1.153
MR (K=6)    10.2               10.6
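
For readers unfamiliar with the metrics, here is a minimal sketch of how minADE, minFDE and miss rate are commonly defined for K candidate trajectories. It is an illustration, not the evaluation code of this repo. The 2.0 m miss threshold is an assumption that this README does not state, the minimum is taken independently for each metric (an official evaluation may select the best trajectory differently), and the MR values in the table appear to be percentages.

```python
import math


def best_of_k_errors(predictions, ground_truth, miss_threshold=2.0):
    """Return (minADE, minFDE, missed) for one agent.

    predictions: K candidate trajectories, each a list of (x, y) points.
    ground_truth: list of (x, y) points of the same length.
    miss_threshold: assumed to be 2.0 m; not stated in this README.
    """
    ades, fdes = [], []
    for trajectory in predictions:
        dists = [math.hypot(px - gx, py - gy)
                 for (px, py), (gx, gy) in zip(trajectory, ground_truth)]
        ades.append(sum(dists) / len(dists))  # average displacement error
        fdes.append(dists[-1])                # error at the final time step
    min_fde = min(fdes)
    return min(ades), min_fde, min_fde > miss_threshold


def miss_rate_percent(missed_flags):
    """Percentage of agents whose best final error exceeds the threshold."""
    return 100.0 * sum(missed_flags) / len(missed_flags)
```

Dataset-level minADE and minFDE would then be the averages of the per-agent values.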

TODO

  • We are going to open source our visualization tools and a demo result. (TBD)

Contact us

If you have any issues with the code, please contact us at this email: [email protected]

Citation

If you find our work useful for your research, please consider citing the paper:

@article{liu2021multimodal,
  title={Multimodal Motion Prediction with Stacked Transformers},
  author={Liu, Yicheng and Zhang, Jinghuai and Fang, Liangji and Jiang, Qinhong and Zhou, Bolei},
  journal={Computer Vision and Pattern Recognition},
  year={2021}
}
Owner
DeciForce: Crossroads of Machine Perception and Autonomy
Research on Unifying Machine Perception and Autonomy in Zhou Group