Object DGCNN & DETR3D

This repo contains the implementations of Object DGCNN (https://arxiv.org/abs/2110.06923) and DETR3D (https://arxiv.org/abs/2110.06922). Both implementations are built on top of MMDetection3D.

Prerequisites

mmcv (https://github.com/open-mmlab/mmcv)
mmdet (https://github.com/open-mmlab/mmdetection)
mmseg (https://github.com/open-mmlab/mmsegmentation)
mmdet3d (https://github.com/open-mmlab/mmdetection3d)
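Before training, it can help to confirm that all four packages are visible to the Python interpreter you plan to use. The following is a minimal sketch that relies only on the standard library; the helper name `missing_packages` is hypothetical and not part of this repo, and the check says nothing about whether the installed versions are compatible with each other.

```python
import importlib.util

# Import names of the four prerequisites listed above.
REQUIRED = ("mmcv", "mmdet", "mmseg", "mmdet3d")


def missing_packages(names=REQUIRED):
    """Return the package names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All prerequisites are importable.")
```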

Data

Follow the mmdet3d data preparation instructions to process the data.

Train

Download the pretrained backbone weights to pretrained/.
For example, to train Object DGCNN with the pillar backbone on 8 GPUs, run

tools/dist_train.sh projects/configs/obj_dgcnn/pillar.py 8
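Since training expects the backbone weights under pretrained/, a quick check before launching can save a failed run. This is a small sketch, not part of the repo: the function name `find_weights` is hypothetical, and the `.pth` / `.pt` suffixes are an assumption about how the downloaded files are named.

```python
from pathlib import Path


def find_weights(directory="pretrained", suffixes=(".pth", ".pt")):
    """List checkpoint-like files in `directory` (empty list if it is missing).

    The suffixes are an assumption; adjust them to match the downloaded files.
    """
    path = Path(directory)
    if not path.is_dir():
        return []
    return sorted(p for p in path.iterdir() if p.is_file() and p.suffix in suffixes)


if __name__ == "__main__":
    weights = find_weights()
    if weights:
        for w in weights:
            print("found", w)
    else:
        print("No weights found under pretrained/; download them before training.")
```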

Evaluation using pretrained models

Download the weights for the model you want to evaluate from the tables below.

Backbone	mAP	NDS	Download
DETR3D, ResNet101 w/ DCN	34.7	42.2	model | log
DETR3D, ResNet101 w/ DCN + CBGS	34.9	43.4	model | log
DETR3D, VoVNet on trainval, evaluation on test set	41.2	47.9	model | log

Backbone	mAP	NDS	Download
Object DGCNN, pillar	53.2	62.8	model | log
Object DGCNN, voxel	58.6	66.0	model | log

To test, use
tools/dist_test.sh projects/configs/obj_dgcnn/pillar_cosine.py /path/to/ckpt 8 --eval=bbox
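The test command takes a config, a checkpoint path, a GPU count, and the evaluation metric, in that order. When evaluating several checkpoints, it may be convenient to assemble the command programmatically. The sketch below only builds and prints the command line; `dist_test_command` is a hypothetical helper, and the only arguments it emits are the ones shown in the command above.

```python
import shlex


def dist_test_command(config, checkpoint, gpus, eval_metric="bbox"):
    """Build the argument list for tools/dist_test.sh as used above."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return [
        "tools/dist_test.sh",
        config,
        checkpoint,
        str(gpus),
        f"--eval={eval_metric}",
    ]


if __name__ == "__main__":
    cmd = dist_test_command(
        "projects/configs/obj_dgcnn/pillar_cosine.py", "/path/to/ckpt", 8
    )
    # Print a shell-safe command line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting issues with unusual checkpoint paths.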

If you find this repo useful for your research, please consider citing the papers

@inproceedings{
   obj-dgcnn,
   title={Object DGCNN: 3D Object Detection using Dynamic Graphs},
   author={Wang, Yue and Solomon, Justin M.},
   booktitle={2021 Conference on Neural Information Processing Systems ({NeurIPS})},
   year={2021}
}

@inproceedings{
   detr3d,
   title={DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries},
   author={Wang, Yue and Guizilini, Vitor and Zhang, Tianyuan and Wang, Yilun and Zhao, Hang and Solomon, Justin M.},
   booktitle={The Conference on Robot Learning ({CoRL})},
   year={2021}
}
