Pytorch implementation of the unsupervised object discovery method LOST.

Related tags

Deep LearningLOST
Overview

LOST

Pytorch implementation of the unsupervised object discovery method LOST. More details can be found in the paper:

Localizing Objects with Self-Supervised Transformers and no Labels [arXiv]
by Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Renaud Marlet and Jean Ponce

LOST visualizations LOST visualizations


If you use the LOST code or framework in your research, please consider citing:

@article{LOST,
   title = {Localizing Objects with Self-Supervised Transformers and no Labels},
   author = {Oriane Sim\'eoni and Gilles Puy and Huy V. Vo and Simon Roburin and Spyros Gidaris and Andrei Bursuc and Patrick P\'erez and Renaud Marlet and Jean Ponce},
   journal = {arXiv preprint arXiv:2109.14279},
   month = {09},
   year = {2021}
}

Installation

Dependencies

This code was implemented with python 3.7, PyTorch 1.7.1 and CUDA 10.2. Please install PyTorch. In order to install the additionnal dependencies, please launch the following command:

pip install -r requirements.txt

Install DINO

This method is based on DINO paper. The framework can be installed using the following commands:

> __init__.py; cd ../; ">
git clone https://github.com/facebookresearch/dino.git
cd dino; 
touch __init__.py
echo -e "import sys\nfrom os.path import dirname, join\nsys.path.insert(0, join(dirname(__file__), '.'))" >> __init__.py; cd ../;

The code was made using the commit ba9edd1 of DINO repo (please rebase if breakage).

Apply LOST to one image

Following are scripts to apply LOST to an image defined via the image_path parameter and visualize the predictions (pred), the maps of the Figure 2 in the paper (fms) and the visulization of the seed expansion (seed_expansion). Box predictions are also stored in the output directory given by parameter output_dir.

python main_lost.py --image_path examples/VOC07_000236.jpg --visualize pred
python main_lost.py --image_path examples/VOC07_000236.jpg --visualize fms
python main_lost.py --image_path examples/VOC07_000236.jpg --visualize seed_expansion

Launching on datasets

Following are the different steps to reproduce the results of LOST presented in the paper.

PASCAL-VOC

Please download the PASCAL VOC07 and PASCAL VOC12 datasets (link) and put the data in the folder datasets. There should be the two subfolders: datasets/VOC2007 and datasets/VOC2012. In order to apply lost and compute corloc results (VOC07 61.9, VOC12 64.0), please launch:

python main_lost.py --dataset VOC07 --set trainval
python main_lost.py --dataset VOC12 --set trainval

COCO

Please download the COCO dataset and put the data in datasets/COCO. Results are provided given the 2014 annotations following previous works. The following command line allows you to get results on the subset of 20k images of the COCO dataset (corloc 50.7), following previous litterature. To be noted that the 20k images are a subset of the train set.

python main_lost.py --dataset COCO20k --set train

Different models

We have tested the method on different setups of the VIT model, corloc results are presented in the following table (more can be found in the paper).

arch pre-training dataset
VOC07 VOC12 COCO20k
ViT-S/16 DINO 61.9 64.0 50.7
ViT-S/8 DINO 55.5 57.0 49.5
ViT-B/16 DINO 60.1 63.3 50.0
ResNet50 DINO 36.8 42.7 26.5
ResNet50 Imagenet 33.5 39.1 25.5


Previous results on the dataset VOC07 can be obtained by launching:

python main_lost.py --dataset VOC07 --set trainval #VIT-S/16
python main_lost.py --dataset VOC07 --set trainval --patch_size 8 #VIT-S/8
python main_lost.py --dataset VOC07 --set trainval --arch vit_base #VIT-B/16
python main_lost.py --dataset VOC07 --set trainval --arch resnet50 #Resnet50/DINO
python main_lost.py --dataset VOC07 --set trainval --arch resnet50_imagenet #Resnet50/imagenet
Comments
  • Is LOST designed to perform well with DINO features specifically?

    Is LOST designed to perform well with DINO features specifically?

    I've replaced LOST's backbone (basically the dino weights) with the ones in CLIP, and it did not work well. But when switching back to dino weights, both ViT and ResNet50 backbone could generate good feature maps. Why would this happen?

    question 
    opened by zengyuy 3
  • Error in evaluation with Detectron2

    Error in evaluation with Detectron2

    Hi @osimeoni,

    Thank you for making the code available!

    When evaluating Detectron2 on VOC12 with the obtained pseudolables. I obtain the following error: AttributeError: "int object has no attribute 'value'. It seems that the coco_style_file is not registered by 'register_coco_instances' (see image underneath). Any idea how this can be fixed? Thanks.

    image

    opened by MarcVisions 2
  • Class-aware detection

    Class-aware detection

    Do you plan on releasing code for class-aware detection (i.e., to produce the results in Table 3 of https://arxiv.org/pdf/2109.14279.pdf)? I don't believe I see any of the necessary code for assigning object categories to boxes, but please correct me if I'm wrong.

    opened by gholste 2
  • Multi-object discovery

    Multi-object discovery

    HI, I have a confusion about the interesting work. How to perform multi-target discovery in the figure 1 (middle) of your paper? Any advice is greatly appreciated.

    question 
    opened by rgbd-zml 1
  • Lost not performing well using DINO with fine-tuning

    Lost not performing well using DINO with fine-tuning

    I’ve trained DINO’s model with my own Dataset, doing a finetuning on the ViT’s pre trained models of DINO. After a feel experiments I noticed that, every time that a epoch of the DINO’s finetune ran, the loss of the training reduce, however the IoU (the validation metric that we are using) of the bounding boxes generated by the LOST algorithm gets worse. Can anyone explain me why this is happening and how can I fix it?

    opened by ericyoshida 1
  • corLoc evaluation

    corLoc evaluation

    Hi @osimeoni . I am suspicious about the corLoc evaluation part in the code. The corLoc for each image is true whenever one of the ground truth objects is hit! https://github.com/valeoai/LOST/blob/2b678aca89c18aa79c56ec3f6d4a0b979a91608d/main_lost.py#L311 What about other objects? Is it right?

    opened by Mirsadeghi 1
Owner
Valeo.ai
The GitHub account of Valeo.ai
Valeo.ai
nn_builder lets you build neural networks with less boilerplate code

nn_builder lets you build neural networks with less boilerplate code. You specify the type of network you want and it builds it. Install pip install n

Petros Christodoulou 157 Nov 20, 2022
Implementation of experiments in the paper Clockwork Variational Autoencoders (project website) using JAX and Flax

Clockwork VAEs in JAX/Flax Implementation of experiments in the paper Clockwork Variational Autoencoders (project website) using JAX and Flax, ported

Julius Kunze 26 Oct 05, 2022
Classification Modeling: Probability of Default

Credit Risk Modeling in Python Introduction: If you've ever applied for a credit card or loan, you know that financial firms process your information

Aktham Momani 2 Nov 07, 2022
This repository includes different versions of the prescribed-time controller as Simulink blocks and MATLAB script codes for engineering applications.

Prescribed-time Control Prescribed-time control (PTC) blocks in Simulink environment, MATLAB R2020b. For more theoretical details, refer to the papers

Amir Shakouri 1 Mar 11, 2022
A curated list of the latest breakthroughs in AI (in 2021) by release date with a clear video explanation, link to a more in-depth article, and code.

2021: A Year Full of Amazing AI papers- A Review 📌 A curated list of the latest breakthroughs in AI by release date with a clear video explanation, l

Louis-François Bouchard 2.9k Dec 31, 2022
Implementation of CSRL from the AAAI2022 paper: Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning

CSRL Implementation of CSRL from the AAAI2022 paper: Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning Python: 3

4 Apr 14, 2022
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.

Swin Transformer for Object Detection This repo contains the supported code and configuration files to reproduce object detection results of Swin Tran

Swin Transformer 1.4k Dec 30, 2022
RaceBERT -- A transformer based model to predict race and ethnicty from names

RaceBERT -- A transformer based model to predict race and ethnicty from names Installation pip install racebert Using a virtual environment is highly

Prasanna Parasurama 3 Nov 02, 2022
My published benchmark for a Kaggle Simulations Competition

Lux AI Working Title Bot Please refer to the Kaggle notebook for the comment section. The comment section contains my explanation on my code structure

Tong Hui Kang 29 Aug 22, 2022
This repository contains several image-to-image translation models, whcih were tested for RGB to NIR image generation. The models are Pix2Pix, Pix2PixHD, CycleGAN and PointWise.

RGB2NIR_Experimental This repository contains several image-to-image translation models, whcih were tested for RGB to NIR image generation. The models

5 Jan 04, 2023
🔀 Visual Room Rearrangement

AI2-THOR Rearrangement Challenge Welcome to the 2021 AI2-THOR Rearrangement Challenge hosted at the CVPR'21 Embodied-AI Workshop. The goal of this cha

AI2 55 Dec 22, 2022
Python codes for Lite Audio-Visual Speech Enhancement.

Lite Audio-Visual Speech Enhancement (Interspeech 2020) Introduction This is the PyTorch implementation of Lite Audio-Visual Speech Enhancement (LAVSE

Shang-Yi Chuang 85 Dec 01, 2022
A video scene detection algorithm is designed to detect a variety of different scenes within a video

Scene-Change-Detection - A video scene detection algorithm is designed to detect a variety of different scenes within a video. There is a very simple definition for a scene: It is a series of logical

1 Jan 04, 2022
Parameterising Simulated Annealing for the Travelling Salesman Problem

Parameterising Simulated Annealing for the Travelling Salesman Problem

Gary Sun 55 Jun 15, 2022
QR2Pass-project - A proof of concept for an alternative (passwordless) authentication system to a web server

QR2Pass This is a proof of concept for an alternative (passwordless) authenticat

4 Dec 09, 2022
Github Traffic Insights as Prometheus metrics.

github-traffic Github Traffic collects your repository's traffic data and exposes it as Prometheus metrics. Grafana dashboard that displays the metric

Grafana Labs 34 Oct 27, 2022
GndNet: Fast ground plane estimation and point cloud segmentation for autonomous vehicles using deep neural networks.

GndNet: Fast Ground plane Estimation and Point Cloud Segmentation for Autonomous Vehicles. Authors: Anshul Paigwar, Ozgur Erkent, David Sierra Gonzale

Anshul Paigwar 114 Dec 29, 2022
Prototype-based Incremental Few-Shot Semantic Segmentation

Prototype-based Incremental Few-Shot Semantic Segmentation Fabio Cermelli, Massimiliano Mancini, Yongqin Xian, Zeynep Akata, Barbara Caputo -- BMVC 20

Fabio Cermelli 21 Dec 29, 2022
PyTorch implementation of SQN based on CloserLook3D's encoder

SQN_pytorch This repo is an implementation of Semantic Query Network (SQN) using CloserLook3D's encoder in Pytorch. For TensorFlow implementation, che

PointCloudYC 1 Oct 21, 2021