[CVPR 2022] Official Pytorch code for OW-DETR: Open-world Detection Transformer

Overview

OW-DETR: Open-world Detection Transformer (CVPR 2022)

[Paper]

Akshita Gupta*, Sanath Narayan*, K J Joseph, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah

( 🌟 denotes equal contribution)

Introduction

Open-world object detection (OWOD) is a challenging computer vision problem, where the task is to detect a known set of object categories while simultaneously identifying unknown objects. Additionally, the model must incrementally learn new classes that become known in the next training episodes. Distinct from standard object detection, the OWOD setting poses significant challenges for generating quality candidate proposals on potentially unknown objects, separating the unknown objects from the background and detecting diverse unknown objects. Here, we introduce a novel end-to-end transformer-based framework, OW-DETR, for open-world object detection. The proposed OW-DETR comprises three dedicated components namely, attention-driven pseudo-labeling, novelty classification and objectness scoring to explicitly address the aforementioned OWOD challenges. Our OW-DETR explicitly encodes multi-scale contextual information, possesses less inductive bias, enables knowledge transfer from known classes to the unknown class and can better discriminate between unknown objects and background. Comprehensive experiments are performed on two benchmarks: MS-COCO and PASCAL VOC. The extensive ablations reveal the merits of our proposed contributions. Further, our model outperforms the recently introduced OWOD approach, ORE, with absolute gains ranging from $1.8%$ to $3.3%$ in terms of unknown recall on MS-COCO. In the case of incremental object detection, OW-DETR outperforms the state-of-the-art for all settings on PASCAL VOC.


Installation

Requirements

We have trained and tested our models on Ubuntu 16.0, CUDA 10.2, GCC 5.4, Python 3.7

conda create -n owdetr python=3.7 pip
conda activate owdetr
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt

Compiling CUDA operators

cd ./models/ops
sh ./make.sh
# unit test (should see all checking is True)
python test.py

Dataset & Results

OWOD proposed splits



The splits are present inside data/VOC2007/OWOD/ImageSets/ folder. The remaining dataset can be downloaded using this link

The files should be organized in the following structure:

OW-DETR/
└── data/
    └── VOC2007/
        └── OWOD/
        	β”œβ”€β”€ JPEGImages
        	β”œβ”€β”€ ImageSets
        	└── Annotations

Results

Task1 Task2 Task3 Task4
Method U-Recall mAP U-Recall mAP U-Recall mAP mAP
ORE-EBUI 4.9 56.0 2.9 39.4 3.9 29.7 25.3
OW-DETR 7.5 59.2 6.2 42.9 5.7 30.8 27.8

Our proposed splits



The splits are present inside data/VOC2007/OWDETR/ImageSets/ folder. The remaining dataset can be downloaded using this link

The files should be organized in the following structure:

OW-DETR/
└── data/
    └── VOC2007/
        └── OWDETR/
        	β”œβ”€β”€ JPEGImages
        	β”œβ”€β”€ ImageSets
        	└── Annotations

Currently, Dataloader and Evaluator followed for OW-DETR is in VOC format.

Results

Task1 Task2 Task3 Task4
Method U-Recall mAP U-Recall mAP U-Recall mAP mAP
ORE-EBUI 1.5 61.4 3.9 40.6 3.6 33.7 31.8
OW-DETR 5.7 71.5 6.2 43.8 6.9 38.5 33.1

Training

Training on single node

To train OW-DETR on a single node with 8 GPUS, run

./run.sh

Training on slurm cluster

To train OW-DETR on a slurm cluster having 2 nodes with 8 GPUS each, run

sbatch run_slurm.sh

Evaluation

For reproducing any of the above mentioned results please run the run_eval.sh file and add pretrained weights accordingly.

Note: For more training and evaluation details please check the Deformable DETR reposistory.

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Citation

If you use OW-DETR, please consider citing:

@inproceedings{gupta2021ow,
    title={OW-DETR: Open-world Detection Transformer}, 
    author={Gupta, Akshita and Narayan, Sanath and Joseph, KJ and 
    Khan, Salman and Khan, Fahad Shahbaz and Shah, Mubarak},
    booktitle={CVPR},
    year={2022}
}

Contact

Should you have any question, please contact πŸ“§ [email protected]

Acknowledgments:

OW-DETR builds on previous works code base such as Deformable DETR, Detreg, and OWOD. If you found OW-DETR useful please consider citing these works as well.

Owner
Akshita Gupta
Sem @IITR | Outreachy @mozilla | Research Engineer @IIAI
Akshita Gupta
This repo is the code release of EMNLP 2021 conference paper "Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories".

Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories This repo is the code release of EMNLP 2021 con

12 Nov 22, 2022
Everything you want about DP-Based Federated Learning, including Papers and Code. (Mechanism: Laplace or Gaussian, Dataset: femnist, shakespeare, mnist, cifar-10 and fashion-mnist. )

Differential Privacy (DP) Based Federated Learning (FL) Everything about DP-based FL you need is here. οΌˆζ‰€ζœ‰δ½ ιœ€θ¦ηš„DP-based FLηš„δΏ‘ζ―ιƒ½εœ¨θΏ™ι‡ŒοΌ‰ Code Tip: the code o

wenzhu 83 Dec 24, 2022
The Official Implementation of the ICCV-2021 Paper: Semantically Coherent Out-of-Distribution Detection.

SCOOD-UDG (ICCV 2021) This repository is the official implementation of the paper: Semantically Coherent Out-of-Distribution Detection Jingkang Yang,

Jake YANG 62 Nov 21, 2022
Facilitates implementing deep neural-network backbones, data augmentations

Introduction Nowadays, the training of Deep Learning models is fragmented and unified. When AI engineers face up with one specific task, the common wa

40 Dec 29, 2022
L-Verse: Bidirectional Generation Between Image and Text

Far beyond learning long-range interactions of natural language, transformers are becoming the de-facto standard for many vision tasks with their power and scalabilty

Kim, Taehoon 102 Dec 21, 2022
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.

Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m

Xin Li 15 Nov 26, 2022
NBEATSx: Neural basis expansion analysis with exogenous variables

NBEATSx: Neural basis expansion analysis with exogenous variables We extend the NBEATS model to incorporate exogenous factors. The resulting method, c

Cristian Challu 100 Dec 31, 2022
a project for 3D multi-object tracking

a project for 3D multi-object tracking

155 Jan 04, 2023
Madanalysis5 - A package for event file analysis and recasting of LHC results

Welcome to MadAnalysis 5 Outline What is MadAnalysis 5? Requirements Downloading

MadAnalysis 15 Jan 01, 2023
Scalable implementation of Lee / Mykland (2012) and Ait-Sahalia / Jacod (2012) Jump tests for noisy high frequency data

JumpDetectR Name of QuantLet : JumpDetectR Published in : 'To be published as "Jump dynamics in high frequency crypto markets"' Description : 'Scala

LvB 12 Jan 01, 2023
SMD-Nets: Stereo Mixture Density Networks

SMD-Nets: Stereo Mixture Density Networks This repository contains a Pytorch implementation of "SMD-Nets: Stereo Mixture Density Networks" (CVPR 2021)

Fabio Tosi 115 Dec 26, 2022
Code release for "BoxeR: Box-Attention for 2D and 3D Transformers"

BoxeR By Duy-Kien Nguyen, Jihong Ju, Olaf Booij, Martin R. Oswald, Cees Snoek. This repository is an official implementation of the paper BoxeR: Box-A

Nguyen Duy Kien 111 Dec 07, 2022
Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu,

GEMS Lab: Graph Exploration & Mining at Scale, University of Michigan 70 Dec 18, 2022
Count the MACs / FLOPs of your PyTorch model.

THOP: PyTorch-OpCounter How to install pip install thop (now continously intergrated on Github actions) OR pip install --upgrade git+https://github.co

Ligeng Zhu 3.9k Dec 29, 2022
Pretrained models for Jax/Haiku; MobileNet, ResNet, VGG, Xception.

Pre-trained image classification models for Jax/Haiku Jax/Haiku Applications are deep learning models that are made available alongside pre-trained we

Alper Baris CELIK 14 Dec 20, 2022
Code and models for "Pano3D: A Holistic Benchmark and a Solid Baseline for 360 Depth Estimation", OmniCV Workshop @ CVPR21.

Pano3D A Holistic Benchmark and a Solid Baseline for 360o Depth Estimation Pano3D is a new benchmark for depth estimation from spherical panoramas. We

Visual Computing Lab, Information Technologies Institute, Centre for Reseach and Technology Hellas 50 Dec 29, 2022
AI Toolkit for Healthcare Imaging

Medical Open Network for AI MONAI is a PyTorch-based, open-source framework for deep learning in healthcare imaging, part of PyTorch Ecosystem. Its am

Project MONAI 3.7k Jan 07, 2023
Official Implementation of "Learning Disentangled Behavior Embeddings"

DBE: Disentangled-Behavior-Embedding Official implementation of Learning Disentangled Behavior Embeddings (NeurIPS 2021). Environment requirement The

Mishne Lab 12 Sep 28, 2022
Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images

SASSnet Code for paper: Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images(MICCAI 2020) Our code is origin from UA-MT You can fin

klein 125 Jan 03, 2023
RepVGG: Making VGG-style ConvNets Great Again

RepVGG: Making VGG-style ConvNets Great Again (PyTorch) This is a super simple ConvNet architecture that achieves over 80% top-1 accuracy on ImageNet

2.8k Jan 04, 2023