This repository contains all code and data for the Inside Out Visual Place Recognition task

Related tags

Deep LearningIOVPR
Overview

Inside Out Visual Place Recognition

This repository contains code and instructions to reproduce the results for the Inside Out Visual Place Recognition task and to retrieve the dataset Amsterdam-XXXL. Details are described in our [paper] and [supplementary material]

Dataset

Our dataset Amsterdam-XXXL consists of 3 partitions:

  • Outdoor-Ams: A set of 6.4M GPS annotated street-view images, meant for evaluation purposes but can be used for training as well.
  • Indoor-Ams: 2 sets of 500 indoor images each, that are used as queries during evaluation
  • Ams30k: A small set of GPS annotated street-view images, modelled after Pitts30k, that can be used for training purposes.

Contact [email protected] to get access to the dataset.

Code

This code is based on the code of 'Self-supervising Fine-grained Region Similarities for Large-scale Image Localization (SFRS)' [paper] from https://github.com/yxgeee/OpenIBL.

Main Modifications

  • It is able to process the dataset files for IOVPR.
  • It is able to evaluate on the large scale dataset Outdoor-Ams.
  • It uses Faiss for faster evaluation.

Requirements

  • Follow the installation instructions on https://github.com/yxgeee/OpenIBL/blob/master/docs/INSTALL.md
  • You can use the conda environment iovpr.yml as provided in this repo.
  • Training on Ams30k requires 4 GPUs. Evaluation on Ams30k can be done on 1 GPU. For evaluating on the full Outdoor-Ams, we used a node with 8 GeForce GTX 1080 Ti GPUs. A node with 4 GPUs is not sufficient and will cause memory issues.

Inside Out Data Augmentation

Data processing

In our pipeline we use real and gray layouts to train our models. To create real and gray lay outs we use the ADE20k dataset that can be obtained from http://sceneparsing.csail.mit.edu. This dataset is meant for semantic segmentation and therefore annotated on pixel level, with 150 semantic categories. We select indoor images from the train and validation set. Since 1 of the 150 semantic categories is 'window', we create binary masks of window and non-window pixels of each image. This binary mask is used to create real and gray layouts, as described in our paper. We create three sets of at least 10%, 20% and 30% window pixels.

Inference

During inference with gray layouts, we need a semantic segmentation network. For this, we use the code from https://github.com/CSAILVision/semantic-segmentation-pytorch. We use the pretrained UperNet50 model and finetune the model with the help of the ADE20k dataset on two output classes, window and non-window. The code in this link need some small modifications to finetune it on two classes.

Training and evaluating our models

Details on how to train the models can be found here: https://github.com/yxgeee/OpenIBL/blob/master/docs/REPRODUCTION.md. Only adapt the dataset(=Ams) and scale(=30k).

For evaluation, we use test_faiss.sh.

Ams30k:

./scripts/test_faiss.sh <PATH TO MODEL> ams 30k <PATH TO STORE FEATURES> <FEATURE_FILE_NAME>

Outdoor-Ams:

./scripts/test_faiss.sh <PATH TO MODEL> ams outdoor <PATH TO STORE FEATURES> <FEATURE_FILE_NAME>

Note that this uses faiss_evaluators.py instead of the original evaluators.py.

License

'IOVPR' is released under the MIT license.

Citation

If you work on the Inside Out Visual Place Recognition or use our large scale dataset for regular Visual Place Recognition, please cite our paper.

@inproceedings{iovpr2021,
    title={Inside Out Visual Place Recognition},
    author={Sarah Ibrahimi and Nanne van Noord and Tim Alpherts and Marcel Worring},
    booktitle={BMVC}
    year={2021},
}

Acknowledgements

This repo is an extension of SFRS, which is inspired by open-reid, and part of the code is inspired by pytorch-NetVlad.

Adversarial Autoencoders

Adversarial Autoencoders (with Pytorch) Dependencies argparse time torch torchvision numpy itertools matplotlib Create Datasets python create_datasets

Felipe Ducau 188 Jan 01, 2023
RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

RIFE RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation Ported from https://github.com/hzwer/arXiv2020-RIFE Dependencies NumPy

49 Jan 07, 2023
A list of multi-task learning papers and projects.

This page contains a list of papers on multi-task learning for computer vision. Please create a pull request if you wish to add anything. If you are interested, consider reading our recent survey pap

svandenh 297 Dec 17, 2022
The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding, by Chuhan Zhang, Ankush Gupta and Andrew Zisserman.

Temporal Query Networks for Fine-grained Video Understanding 📋 This repository contains the implementation of CVPR2021 paper Temporal_Query_Networks

55 Dec 21, 2022
SciFive: a text-text transformer model for biomedical literature

SciFive SciFive provided a Text-Text framework for biomedical language and natural language in NLP. Under the T5's framework and desrbibed in the pape

Long Phan 54 Dec 24, 2022
Development kit for MIT Scene Parsing Benchmark

Development Kit for MIT Scene Parsing Benchmark [NEW!] Our PyTorch implementation is released in the following repository: https://github.com/hangzhao

MIT CSAIL Computer Vision 424 Dec 01, 2022
Gas detection for Raspberry Pi using ADS1x15 and MQ-2 sensors

Gas detection Gas detection for Raspberry Pi using ADS1x15 and MQ-2 sensors. Description The MQ-2 sensor can detect multiple gases (CO, H2, CH4, LPG,

Filip Š 15 Sep 30, 2022
The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation This repository is the official implementation of CVPR 2021 paper:

9 Nov 14, 2022
Vanilla and Prototypical Networks with Random Weights for image classification on Omniglot and mini-ImageNet. Made with Python3.

vanilla-rw-protonets-project Vanilla Prototypical Networks and PNs with Random Weights for image classification on Omniglot and mini-ImageNet. Made wi

Giovani Candido 8 Aug 31, 2022
Implementation of [Time in a Box: Advancing Knowledge Graph Completion with Temporal Scopes].

Time2box Implementation of [Time in a Box: Advancing Knowledge Graph Completion with Temporal Scopes].

LingCai 4 Aug 23, 2022
This repository contains the reference implementation for our proposed Convolutional CRFs.

ConvCRF This repository contains the reference implementation for our proposed Convolutional CRFs in PyTorch (Tensorflow planned). The two main entry-

Marvin Teichmann 553 Dec 07, 2022
Code release for Hu et al. Segmentation from Natural Language Expressions. in ECCV, 2016

Segmentation from Natural Language Expressions This repository contains the code for the following paper: R. Hu, M. Rohrbach, T. Darrell, Segmentation

Ronghang Hu 88 May 24, 2022
使用yolov5训练自己数据集(详细过程)并通过flask部署

使用yolov5训练自己的数据集(详细过程)并通过flask部署 依赖库 torch torchvision numpy opencv-python lxml tqdm flask pillow tensorboard matplotlib pycocotools Windows,请使用 pycoc

HB.com 19 Dec 28, 2022
PyTorch implementation of the TTC algorithm

Trust-the-Critics This repository is a PyTorch implementation of the TTC algorithm and the WGAN misalignment experiments presented in Trust the Critic

0 Nov 29, 2021
Easily pull telemetry data and create beautiful visualizations for analysis.

This repository is a work in progress. Anything and everything is subject to change. Porpo Table of Contents Porpo Table of Contents General Informati

Ryan Dawes 33 Nov 30, 2022
Entity-Based Knowledge Conflicts in Question Answering.

Entity-Based Knowledge Conflicts in Question Answering Run Instructions | Paper | Citation | License This repository provides the Substitution Framewo

Apple 35 Oct 19, 2022
Official implementation for paper: Feature-Style Encoder for Style-Based GAN Inversion

Feature-Style Encoder for Style-Based GAN Inversion Official implementation for paper: Feature-Style Encoder for Style-Based GAN Inversion. Code will

InterDigital 63 Jan 03, 2023
SMD-Nets: Stereo Mixture Density Networks

SMD-Nets: Stereo Mixture Density Networks This repository contains a Pytorch implementation of "SMD-Nets: Stereo Mixture Density Networks" (CVPR 2021)

Fabio Tosi 115 Dec 26, 2022
pq is a jq-like Pickle file viewer

pq PQ is a jq-like viewer/processing tool for pickle files. howto # pq '' file.pkl {'other': 456, 'test': 123} # pq 'table' file.pkl |other|test| | 45

3 Mar 15, 2022
Code for the paper "Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks"

ON-LSTM This repository contains the code used for word-level language model and unsupervised parsing experiments in Ordered Neurons: Integrating Tree

Yikang Shen 572 Nov 21, 2022