Official PyTorch Code of GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection (CVPR 2021)

Overview

GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection

GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection, CVPR 2021

Abhinav Kumar, Garrick Brazil, Xiaoming Liu

[project], [supp], [slides], [1min_talk], demo, arxiv

This code is based on Kinematic-3D, such that the setup/organization is very similar. A few of the implementations, such as classical NMS, are based on Caffe.

References

Please cite the following paper if you find this repository useful:

@inproceedings{kumar2021groomed,
  title={{GrooMeD-NMS}: Grouped Mathematically Differentiable NMS for Monocular {$3$D} Object Detection},
  author={Kumar, Abhinav and Brazil, Garrick and Liu, Xiaoming},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Setup

  • Requirements

    1. Python 3.6
    2. Pytorch 0.4.1
    3. Torchvision 0.2.1
    4. Cuda 8.0
    5. Ubuntu 18.04/Debian 8.9

    This is tested with NVIDIA 1080 Ti GPU. Other platforms have not been tested. Unless otherwise stated, the below scripts and instructions assume the working directory is the project root.

    Clone the repo first:

    git clone https://github.com/abhi1kumar/groomed_nms.git
  • Cuda & Python

    Install some basic packages:

    sudo apt-get install libopenblas-dev libboost-dev libboost-all-dev git
    sudo apt install gfortran
    
    # We need to compile with older version of gcc and g++
    sudo apt install gcc-5 g++-5
    sudo ln -f /usr/bin/gcc-5 /usr/local/cuda-8.0/bin/gcc
    sudo ln -s /usr/bin/g++-5 /usr/local/cuda-8.0/bin/g++

    Next, install conda and then install the required packages:

    wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh
    bash Anaconda3-2020.02-Linux-x86_64.sh
    source ~/.bashrc
    conda list
    conda create --name py36 --file dependencies/conda.txt
    conda activate py36
  • KITTI Data

    Download the following images of the full KITTI 3D Object detection dataset:

    Then place a soft-link (or the actual data) in data/kitti:

     ln -s /path/to/kitti data/kitti

    The directory structure should look like this:

    ./groomed_nms
    |--- cuda_env
    |--- data
    |      |---kitti
    |            |---training
    |            |        |---calib
    |            |        |---image_2
    |            |        |---label_2
    |            |
    |            |---testing
    |                     |---calib
    |                     |---image_2
    |
    |--- dependencies
    |--- lib
    |--- models
    |--- scripts

    Then, use the following scripts to extract the data splits, which use soft-links to the above directory for efficient storage:

    python data/kitti_split1/setup_split.py
    python data/kitti_split2/setup_split.py

    Next, build the KITTI devkit eval:

     sh data/kitti_split1/devkit/cpp/build.sh
  • Classical NMS

    Lastly, build the classical NMS modules:

    cd lib/nms
    make
    cd ../..

Training

Training is carried out in two stages - a warmup and a full. Review the configurations in scripts/config for details.

chmod +x scripts_training.sh
./scripts_training.sh

If your training is accidentally stopped, you can resume at a checkpoint based on the snapshot with the restore flag. For example, to resume training starting at iteration 10k, use the following command:

source dependencies/cuda_8.0_env
CUDA_VISIBLE_DEVICES=0 python -u scripts/train_rpn_3d.py --config=groumd_nms --restore=10000

Testing

We provide logs/models/predictions for the main experiments on KITTI Val 1/Val 2/Test data splits available to download here.

Make an output folder in the project directory:

mkdir output

Place different models in the output folder as follows:

./groomed_nms
|--- output
|      |---groumd_nms
|      |
|      |---groumd_nms_split2
|      |
|      |---groumd_nms_full_train_2
|
| ...

To test, run the file as below:

chmod +x scripts_evaluation.sh
./scripts_evaluation.sh

Contact

For questions, feel free to post here or drop an email to this address- [email protected]

You might also like...
Released code for Objects are Different: Flexible Monocular 3D Object Detection, CVPR21

MonoFlex Released code for Objects are Different: Flexible Monocular 3D Object Detection, CVPR21. Work in progress. Installation This repo is tested w

[CVPR 2022 Oral] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
[CVPR 2022 Oral] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

EPro-PnP EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation In CVPR 2022 (Oral). [paper] Hanshen

Code for
Code for "NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video", CVPR 2021 oral

NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video Project Page | Paper NeuralRecon: Real-Time Coherent 3D Reconstruction from Mon

Code for
Code for "LASR: Learning Articulated Shape Reconstruction from a Monocular Video". CVPR 2021.

LASR Installation Build with conda conda env create -f lasr.yml conda activate lasr # install softras cd third_party/softras; python setup.py install;

The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection .

GCoNet The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection . Trained model Download final_gconet.pth

Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection
Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

Adaptive Class Suppression Loss for Long-Tail Object Detection This repo is the official implementation for CVPR 2021 paper: Adaptive Class Suppressio

Categorical Depth Distribution Network for Monocular 3D Object Detection
Categorical Depth Distribution Network for Monocular 3D Object Detection

CaDDN CaDDN is a monocular-based 3D object detection method. This repository is based off of [OpenPCDet]. Categorical Depth Distribution Network for M

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection This repository contains implementation of the

Progressive Coordinate Transforms for Monocular 3D Object Detection
Progressive Coordinate Transforms for Monocular 3D Object Detection

Progressive Coordinate Transforms for Monocular 3D Object Detection This repository is the official implementation of PCT. Introduction In this paper,

Comments
  • Is there any difference between groom-nms and penalize highest-confidence proposal using gt directly?

    Is there any difference between groom-nms and penalize highest-confidence proposal using gt directly?

    Hi~thanks for your great work. However, I have some confusion in understanding the motivation of this algorithm. If we want to achieve the consistency of training and test, we can simply penalize the highest-confidence proposal in the training pipeline, which seems to achieve similar result.So, is there any difference between groom-nms and penalize highest-confidence proposal using gt directly?

    opened by kaixinbear 3
  • Problem in test

    Problem in test

    Hi, this is an exciting work.And i have a question when I try to test with the pre-train model. I can't find "Kinematic3D-Release/val1_kinematic/model_final".

    opened by chenH20000109 1
Releases(v0.1)
Owner
Abhinav Kumar
PhD Student, Computer Vision and Deep Learning, MSU
Abhinav Kumar
Official implementation for TTT++: When Does Self-supervised Test-time Training Fail or Thrive

TTT++ This is an official implementation for TTT++: When Does Self-supervised Test-time Training Fail or Thrive? TL;DR: Online Feature Alignment + Str

VITA lab at EPFL 39 Dec 25, 2022
Mesh TensorFlow: Model Parallelism Made Easier

Mesh TensorFlow - Model Parallelism Made Easier Introduction Mesh TensorFlow (mtf) is a language for distributed deep learning, capable of specifying

1.3k Dec 26, 2022
Price-Prediction-For-a-Dream-Home - A machine learning based linear regression trained model for house price prediction.

Price-Prediction-For-a-Dream-Home ROADMAP TO THIS LINEAR REGRESSION BASED HOUSE PRICE PREDICTION PREDICTION MODEL Import all the dependencies of the p

DIKSHA DESWAL 1 Dec 29, 2021
💃 VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena

💃 VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena.

Heidelberg-NLP 17 Nov 07, 2022
This repo contains the source code and a benchmark for predicting user's utilities with Machine Learning techniques for Computational Persuasion

Machine Learning for Argument-Based Computational Persuasion This repo contains the source code and a benchmark for predicting user's utilities with M

Ivan Donadello 4 Nov 07, 2022
A video scene detection algorithm is designed to detect a variety of different scenes within a video

Scene-Change-Detection - A video scene detection algorithm is designed to detect a variety of different scenes within a video. There is a very simple definition for a scene: It is a series of logical

1 Jan 04, 2022
Code for LIGA-Stereo Detector, ICCV'21

LIGA-Stereo Introduction This is the official implementation of the paper LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based

Xiaoyang Guo 75 Dec 09, 2022
Domain Generalization for Mammography Detection via Multi-style and Multi-view Contrastive Learning

MSVCL_MICCAI2021 Installation Please follow the instruction in pytorch-CycleGAN-and-pix2pix to install. Example Usage An example of vendor-styles tran

Jaron Lee 11 Oct 19, 2022
Official codes: Self-Supervised Learning by Estimating Twin Class Distribution

TWIST: Self-Supervised Learning by Estimating Twin Class Distributions Codes and pretrained models for TWIST: @article{wang2021self, title={Self-Sup

Bytedance Inc. 85 Dec 15, 2022
Pre-trained model, code, and materials from the paper "Impact of Adversarial Examples on Deep Learning Models for Biomedical Image Segmentation" (MICCAI 2019).

Adaptive Segmentation Mask Attack This repository contains the implementation of the Adaptive Segmentation Mask Attack (ASMA), a targeted adversarial

Utku Ozbulak 53 Jul 04, 2022
CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation

CoTr: Efficient 3D Medical Image Segmentation by bridging CNN and Transformer This is the official pytorch implementation of the CoTr: Paper: CoTr: Ef

218 Dec 25, 2022
A python script to dump all the challenges locally of a CTFd-based Capture the Flag.

A python script to dump all the challenges locally of a CTFd-based Capture the Flag. Features Connects and logins to a remote CTFd instance. Dumps all

Podalirius 77 Dec 07, 2022
Build tensorflow keras model pipelines in a single line of code. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.

deep_autoviml Build keras pipelines and models in a single line of code! Table of Contents Motivation How it works Technology Install Usage API Image

AutoViz and Auto_ViML 102 Dec 17, 2022
Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving

Visual 3D Detection Package: This repo aims to provide flexible and reproducible visual 3D detection on KITTI dataset. We expect scripts starting from

Yuxuan Liu 305 Dec 19, 2022
Python package to generate image embeddings with CLIP without PyTorch/TensorFlow

imgbeddings A Python package to generate embedding vectors from images, using OpenAI's robust CLIP model via Hugging Face transformers. These image em

Max Woolf 81 Jan 04, 2023
Novel Instances Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection

Novel Instances Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection (NimPme) The official implementation of Novel Instances Mining with

12 Sep 08, 2022
Official implementation of "StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation" (SIGGRAPH 2021)

StyleCariGAN in PyTorch Official implementation of StyleCariGAN:Caricature Generation via StyleGAN Feature Map Modulation in PyTorch Requirements PyTo

PeterZhouSZ 49 Oct 31, 2022
Resources for our AAAI 2022 paper: "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification".

LOREN Resources for our AAAI 2022 paper (pre-print): "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification". DEMO System Check out o

Jiangjie Chen 37 Dec 27, 2022
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

bottom-up-attention This code implements a bottom-up attention model, based on multi-gpu training of Faster R-CNN with ResNet-101, using object and at

Peter Anderson 1.3k Jan 09, 2023
Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai

Coursera-deep-learning-specialization - Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai: (i) Neural Networks an

Aman Chadha 1.7k Jan 08, 2023