Released code for Objects are Different: Flexible Monocular 3D Object Detection, CVPR21

Last update: Dec 06, 2022

Overview

MonoFlex

Work in progress.

Installation

This repo has been tested with Ubuntu 20.04, python==3.7, pytorch==1.4.0, and cuda==10.1.

conda create -n monoflex python=3.7

conda activate monoflex

Install PyTorch and other dependencies:

conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch

pip install -r requirements.txt
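Before building the extension, it may help to confirm that the active environment matches the tested versions. The snippet below is a minimal sketch, not part of the repo: `report_environment` is a hypothetical helper that only reads version attributes PyTorch exposes, and it returns `None` for `torch` if PyTorch cannot be imported.

```python
import sys


def report_environment():
    """Collect the Python, PyTorch and CUDA versions visible to this interpreter."""
    info = {"python": "%d.%d" % sys.version_info[:2]}
    try:
        import torch  # imported lazily so the check still runs without PyTorch
    except ImportError:
        info["torch"] = None
        return info
    info["torch"] = torch.__version__              # tested version: 1.4.0
    info["cuda"] = torch.version.cuda              # CUDA version PyTorch was built with
    info["cuda_available"] = torch.cuda.is_available()
    return info


for name, value in report_environment().items():
    print(name, "=", value)
```

If the printed versions differ from the ones listed above, the DCNv2 build may behave differently from what the authors tested.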

Build DCNv2 and the project

cd models/backbone/DCNv2

. make.sh

cd ../../..

python setup.py develop

Data Preparation

Please download the KITTI dataset and organize the data as follows:

#ROOT		
  |training/
    |calib/
    |image_2/
    |label/
    |ImageSets/
  |testing/
    |calib/
    |image_2/
    |ImageSets/

Then modify the paths in config/paths_catalog.py according to your data path.
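A quick way to catch layout mistakes before training is to check that every folder from the tree above exists. This is a minimal sketch under stated assumptions: `missing_kitti_dirs` and `EXPECTED_DIRS` are hypothetical names, not part of the repo, and the folder list is copied from the layout above rather than from the dataset loader.

```python
import os

# Sub-directories from the layout above, relative to the dataset root.
EXPECTED_DIRS = [
    "training/calib",
    "training/image_2",
    "training/label",
    "training/ImageSets",
    "testing/calib",
    "testing/image_2",
    "testing/ImageSets",
]


def missing_kitti_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    return [d for d in EXPECTED_DIRS if not os.path.isdir(os.path.join(root, d))]
```

Calling `missing_kitti_dirs("/path/to/ROOT")` with your own dataset root returns an empty list when the layout is complete.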

Training & Evaluation

Train with one GPU (TODO: multi-GPU training still needs further testing):

CUDA_VISIBLE_DEVICES=0 python tools/plain_train_net.py --batch_size 8 --config runs/monoflex.yaml --output output/exp

The model is evaluated periodically during training (the interval can be adjusted in the config). You can also evaluate a checkpoint with:

CUDA_VISIBLE_DEVICES=0 python tools/plain_train_net.py --config runs/monoflex.yaml --ckpt YOUR_CKPT  --eval

You can also pass --vis during evaluation to visualize the predicted heatmap and 3D bounding boxes. The pretrained model for the train/val split and the training logs are here.

Note: we observe a noticeable variation in performance across runs. We are still investigating ways to stabilize the results, though the variation may be inevitable given the uncertainties used by the method.
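Given this run-to-run variation, one cautious way to compare results is to train several times and report the mean and spread of a metric instead of a single number. The sketch below is an assumption about how one might do that, not something the repo provides: `summarize_runs` is a hypothetical helper, and the scores must come from your own evaluation logs.

```python
import statistics


def summarize_runs(scores):
    """Return (mean, sample standard deviation) of one metric over several runs."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return statistics.mean(scores), statistics.stdev(scores)
```

For example, `summarize_runs(scores)` with one score per training run gives a pair that can be reported as mean ± standard deviation.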

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{MonoFlex,
    author    = {Zhang, Yunpeng and Lu, Jiwen and Zhou, Jie},
    title     = {Objects Are Different: Flexible Monocular 3D Object Detection},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {3289-3298}
}

Acknowledgment

The code borrows heavily from SMOKE; we thank the authors for their contribution.

Owner

Yunpeng
