Pytorch implementation of the unsupervised object discovery method LOST.

Related tags

Deep LearningLOST
Overview

LOST

Pytorch implementation of the unsupervised object discovery method LOST. More details can be found in the paper:

Localizing Objects with Self-Supervised Transformers and no Labels [arXiv]
by Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Renaud Marlet and Jean Ponce

LOST visualizations LOST visualizations


If you use the LOST code or framework in your research, please consider citing:

@article{LOST,
   title = {Localizing Objects with Self-Supervised Transformers and no Labels},
   author = {Oriane Sim\'eoni and Gilles Puy and Huy V. Vo and Simon Roburin and Spyros Gidaris and Andrei Bursuc and Patrick P\'erez and Renaud Marlet and Jean Ponce},
   journal = {arXiv preprint arXiv:2109.14279},
   month = {09},
   year = {2021}
}

Installation

Dependencies

This code was implemented with python 3.7, PyTorch 1.7.1 and CUDA 10.2. Please install PyTorch. In order to install the additionnal dependencies, please launch the following command:

pip install -r requirements.txt

Install DINO

This method is based on DINO paper. The framework can be installed using the following commands:

> __init__.py; cd ../; ">
git clone https://github.com/facebookresearch/dino.git
cd dino; 
touch __init__.py
echo -e "import sys\nfrom os.path import dirname, join\nsys.path.insert(0, join(dirname(__file__), '.'))" >> __init__.py; cd ../;

The code was made using the commit ba9edd1 of DINO repo (please rebase if breakage).

Apply LOST to one image

Following are scripts to apply LOST to an image defined via the image_path parameter and visualize the predictions (pred), the maps of the Figure 2 in the paper (fms) and the visulization of the seed expansion (seed_expansion). Box predictions are also stored in the output directory given by parameter output_dir.

python main_lost.py --image_path examples/VOC07_000236.jpg --visualize pred
python main_lost.py --image_path examples/VOC07_000236.jpg --visualize fms
python main_lost.py --image_path examples/VOC07_000236.jpg --visualize seed_expansion

Launching on datasets

Following are the different steps to reproduce the results of LOST presented in the paper.

PASCAL-VOC

Please download the PASCAL VOC07 and PASCAL VOC12 datasets (link) and put the data in the folder datasets. There should be the two subfolders: datasets/VOC2007 and datasets/VOC2012. In order to apply lost and compute corloc results (VOC07 61.9, VOC12 64.0), please launch:

python main_lost.py --dataset VOC07 --set trainval
python main_lost.py --dataset VOC12 --set trainval

COCO

Please download the COCO dataset and put the data in datasets/COCO. Results are provided given the 2014 annotations following previous works. The following command line allows you to get results on the subset of 20k images of the COCO dataset (corloc 50.7), following previous litterature. To be noted that the 20k images are a subset of the train set.

python main_lost.py --dataset COCO20k --set train

Different models

We have tested the method on different setups of the VIT model, corloc results are presented in the following table (more can be found in the paper).

arch pre-training dataset
VOC07 VOC12 COCO20k
ViT-S/16 DINO 61.9 64.0 50.7
ViT-S/8 DINO 55.5 57.0 49.5
ViT-B/16 DINO 60.1 63.3 50.0
ResNet50 DINO 36.8 42.7 26.5
ResNet50 Imagenet 33.5 39.1 25.5


Previous results on the dataset VOC07 can be obtained by launching:

python main_lost.py --dataset VOC07 --set trainval #VIT-S/16
python main_lost.py --dataset VOC07 --set trainval --patch_size 8 #VIT-S/8
python main_lost.py --dataset VOC07 --set trainval --arch vit_base #VIT-B/16
python main_lost.py --dataset VOC07 --set trainval --arch resnet50 #Resnet50/DINO
python main_lost.py --dataset VOC07 --set trainval --arch resnet50_imagenet #Resnet50/imagenet
Comments
  • Is LOST designed to perform well with DINO features specifically?

    Is LOST designed to perform well with DINO features specifically?

    I've replaced LOST's backbone (basically the dino weights) with the ones in CLIP, and it did not work well. But when switching back to dino weights, both ViT and ResNet50 backbone could generate good feature maps. Why would this happen?

    question 
    opened by zengyuy 3
  • Error in evaluation with Detectron2

    Error in evaluation with Detectron2

    Hi @osimeoni,

    Thank you for making the code available!

    When evaluating Detectron2 on VOC12 with the obtained pseudolables. I obtain the following error: AttributeError: "int object has no attribute 'value'. It seems that the coco_style_file is not registered by 'register_coco_instances' (see image underneath). Any idea how this can be fixed? Thanks.

    image

    opened by MarcVisions 2
  • Class-aware detection

    Class-aware detection

    Do you plan on releasing code for class-aware detection (i.e., to produce the results in Table 3 of https://arxiv.org/pdf/2109.14279.pdf)? I don't believe I see any of the necessary code for assigning object categories to boxes, but please correct me if I'm wrong.

    opened by gholste 2
  • Multi-object discovery

    Multi-object discovery

    HI, I have a confusion about the interesting work. How to perform multi-target discovery in the figure 1 (middle) of your paper? Any advice is greatly appreciated.

    question 
    opened by rgbd-zml 1
  • Lost not performing well using DINO with fine-tuning

    Lost not performing well using DINO with fine-tuning

    I’ve trained DINO’s model with my own Dataset, doing a finetuning on the ViT’s pre trained models of DINO. After a feel experiments I noticed that, every time that a epoch of the DINO’s finetune ran, the loss of the training reduce, however the IoU (the validation metric that we are using) of the bounding boxes generated by the LOST algorithm gets worse. Can anyone explain me why this is happening and how can I fix it?

    opened by ericyoshida 1
  • corLoc evaluation

    corLoc evaluation

    Hi @osimeoni . I am suspicious about the corLoc evaluation part in the code. The corLoc for each image is true whenever one of the ground truth objects is hit! https://github.com/valeoai/LOST/blob/2b678aca89c18aa79c56ec3f6d4a0b979a91608d/main_lost.py#L311 What about other objects? Is it right?

    opened by Mirsadeghi 1
Owner
Valeo.ai
The GitHub account of Valeo.ai
Valeo.ai
Readings for "A Unified View of Relational Deep Learning for Polypharmacy Side Effect, Combination Therapy, and Drug-Drug Interaction Prediction."

Polypharmacy - DDI - Synergy Survey The Survey Paper This repository accompanies our survey paper A Unified View of Relational Deep Learning for Polyp

AstraZeneca 79 Jan 05, 2023
Implementation of U-Net and SegNet for building segmentation

Specialized project Created by Katrine Nguyen and Martin Wangen-Eriksen as a part of our specialized project at Norwegian University of Science and Te

Martin.w-e 3 Dec 07, 2022
Official Pytorch implementation of Meta Internal Learning

Official Pytorch implementation of Meta Internal Learning

10 Aug 24, 2022
SuMa++: Efficient LiDAR-based Semantic SLAM (Chen et al IROS 2019)

SuMa++: Efficient LiDAR-based Semantic SLAM This repository contains the implementation of SuMa++, which generates semantic maps only using three-dime

Photogrammetry & Robotics Bonn 701 Dec 30, 2022
Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection

Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection Main requirements torch = 1.0 torchvision = 0.2.0 Python 3 Environm

15 Apr 04, 2022
ChebLieNet, a spectral graph neural network turned equivariant by Riemannian geometry on Lie groups.

ChebLieNet: Invariant spectral graph NNs turned equivariant by Riemannian geometry on Lie groups Hugo Aguettaz, Erik J. Bekkers, Michaël Defferrard We

haguettaz 12 Dec 10, 2022
Source code for CVPR 2021 paper "Riggable 3D Face Reconstruction via In-Network Optimization"

Riggable 3D Face Reconstruction via In-Network Optimization Source code for CVPR 2021 paper "Riggable 3D Face Reconstruction via In-Network Optimizati

130 Jan 02, 2023
I tried to apply the CAM algorithm to YOLOv4 and it worked.

YOLOV4:You Only Look Once目标检测模型在pytorch当中的实现 2021年2月7日更新: 加入letterbox_image的选项,关闭letterbox_image后网络的map得到大幅度提升。 目录 性能情况 Performance 实现的内容 Achievement

55 Dec 05, 2022
DecoupledNet is semantic segmentation system which using heterogeneous annotations

DecoupledNet: Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation Created by Seunghoon Hong, Hyeonwoo Noh and Bohyung Han at POSTE

Hyeonwoo Noh 74 Sep 22, 2021
Official Pytorch implementation of RePOSE (ICCV2021)

RePOSE: Iterative Rendering and Refinement for 6D Object Detection (ICCV2021) [Link] Abstract We present RePOSE, a fast iterative refinement method fo

Shun Iwase 68 Nov 15, 2022
Audio2Face - Audio To Face With Python

Audio2Face Discription We create a project that transforms audio to blendshape w

FACEGOOD 724 Dec 26, 2022
《Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement》(ECCV 2020) GitHub: [fig9]

Unsupervised 3D Human Pose Representation [Paper] The implementation of our paper Unsupervised 3D Human Pose Representation with Viewpoint and Pose Di

42 Nov 24, 2022
Free like Freedom

This is all very much a work in progress! More to come! ( We're working on it though! Stay tuned!) Installation Open an Anaconda Prompt (in Windows, o

2.3k Jan 04, 2023
MGFN: Multi-Graph Fusion Networks for Urban Region Embedding was accepted by IJCAI-2022.

Multi-Graph Fusion Networks for Urban Region Embedding (IJCAI-22) This is the implementation of Multi-Graph Fusion Networks for Urban Region Embedding

202 Nov 18, 2022
TinyML Cookbook, published by Packt

TinyML Cookbook This is the code repository for TinyML Cookbook, published by Packt. Author: Gian Marco Iodice Publisher: Packt About the book This bo

Packt 93 Dec 29, 2022
A series of Jupyter notebooks with Chinese comment that walk you through the fundamentals of Machine Learning and Deep Learning in python using Scikit-Learn and TensorFlow.

Hands-on-Machine-Learning 目的 这份笔记旨在帮助中文学习者以一种较快较系统的方式入门机器学习, 是在学习Hands-on Machine Learning with Scikit-Learn and TensorFlow这本书的 时候做的个人笔记: 此项目的可取之处 原书的

Baymax 1.5k Dec 21, 2022
Official repo for AutoInt: Automatic Integration for Fast Neural Volume Rendering in CVPR 2021

AutoInt: Automatic Integration for Fast Neural Volume Rendering CVPR 2021 Project Page | Video | Paper PyTorch implementation of automatic integration

Stanford Computational Imaging Lab 149 Dec 22, 2022
Fast algorithms to compute an approximation of the minimal volume oriented bounding box of a point cloud in 3D.

ApproxMVBB Status Build UnitTests Homepage Fast algorithms to compute an approximation of the minimal volume oriented bounding box of a point cloud in

Gabriel Nützi 390 Dec 31, 2022
A lightweight library designed to accelerate the process of training PyTorch models by providing a minimal

A lightweight library designed to accelerate the process of training PyTorch models by providing a minimal, but extensible training loop which is flexible enough to handle the majority of use cases,

Chris Hughes 110 Dec 23, 2022
[CVPR 2016] Unsupervised Feature Learning by Image Inpainting using GANs

Context Encoders: Feature Learning by Inpainting CVPR 2016 [Project Website] [Imagenet Results] Sample results on held-out images: This is the trainin

Deepak Pathak 829 Dec 31, 2022