ScaleNet: A Shallow Architecture for Scale Estimation

Related tags

Deep LearningScaleNet
Overview

ScaleNet: A Shallow Architecture for Scale Estimation

Repository for the code of ScaleNet paper:

"ScaleNet: A Shallow Architecture for Scale Estimation".
Axel Barroso-Laguna, Yurun Tian, and Krystian Mikolajczyk. arxiv 2021.

[Paper on arxiv]

Prerequisite

Python 3.7 is required for running and training ScaleNet code. Use Conda to install the dependencies:

conda create --name scalenet_env
conda activate scalenet_env 
conda install pytorch==1.2.0 -c pytorch
conda install -c conda-forge tensorboardx opencv tqdm 
conda install -c anaconda pandas 
conda install -c pytorch torchvision 

Scale estimation

run_scalenet.py can be used to estimate the scale factor between two input images. We provide as an example two images, im1.jpg and im2.jpg, within the assets/im_test folder as an example. For a quick test, please run:

python run_scalenet.py --im1_path assets/im_test/im1.jpg --im2_path assets/im_test/im2.jpg

Arguments:

  • im1_path: Path to image A.
  • im2_path: Path to image B.

It returns the scale factor A->B.

Training ScaleNet

We provide a list of Megadepth image pairs and scale factors in the assets folder. We use the undistorted images, corresponding camera intrinsics, and extrinsics preprocessed by D2-Net. You can download them directly from their main repository. If you desire to use the default configuration for training, just run the following line:

python train_ScaleNet.py --image_data_path /path/to/megadepth_d2net

There are though some important arguments to take into account when training ScaleNet.

Arguments:

  • image_data_path: Path to the undistorted Megadepth images from D2-Net.
  • save_processed_im: ScaleNet processes the images so that they are center-cropped and resized to a default resolution. We give the option to store the processed images and load them during training, which results in a much faster training. However, the size of the files can be big, and hence, we suggest storing them in a large storage disk. Default: True.
  • root_precomputed_files: Path to save the processed image pairs.

If you desire to modify ScaleNet training or architecture, look for all the arguments in the train_ScaleNet.py script.

Test ScaleNet - camera pose

In addition to the training, we also provide a template for testing ScaleNet in the camera pose task. In assets/data/test.csv, you can find the test Megadepth pairs, along with their scale change as well as their camera poses.

Run the following command to test ScaleNet + SIFT in our custom camera pose split:

python test_camera_pose.py --image_data_path /path/to/megadepth_d2net

camera_pose.py script is intended to provide a structure of our camera pose experiment. You can change either the local feature extractor or the scale estimator and obtain your camera pose results.

BibTeX

If you use this code or the provided training/testing pairs in your research, please cite our paper:

@InProceedings{Barroso-Laguna2021_scale,
    author = {Barroso-Laguna, Axel and Tian, Yurun and Mikolajczyk, Krystian},
    title = {{ScaleNet: A Shallow Architecture for Scale Estimation}},
    booktitle = {Arxiv: },
    year = {2021},
}
Owner
Axel Barroso
Computer Vision PhD Student
Axel Barroso
Projects for AI/ML and IoT integration for games and other presented at re:Invent 2021.

Playground4AWS Projects for AI/ML and IoT integration for games and other presented at re:Invent 2021. Architecture Minecraft and Lamps This project i

Vinicius Senger 5 Nov 30, 2022
The Generic Manipulation Driver Package - Implements a ROS Interface over the robotics toolbox for Python

Armer Driver Armer aims to provide an interface layer between the hardware drivers of a robotic arm giving the user control in several ways: Joint vel

QUT Centre for Robotics (QCR) 13 Nov 26, 2022
DeepSTD: Mining Spatio-temporal Disturbances of Multiple Context Factors for Citywide Traffic Flow Prediction

DeepSTD: Mining Spatio-temporal Disturbances of Multiple Context Factors for Citywide Traffic Flow Prediction This is the implementation of DeepSTD in

5 Sep 26, 2022
PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Vision Transformer for Fast and Efficient Scene Text Recognition (ICDAR 2021) ViTSTR is a simple single-stage model that uses a pre-trained Vision Tra

Rowel Atienza 198 Dec 27, 2022
ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction. NeurIPS 2021.

Gengshan Yang 59 Nov 25, 2022
Largest list of models for Core ML (for iOS 11+)

Since iOS 11, Apple released Core ML framework to help developers integrate machine learning models into applications. The official documentation We'v

Kedan Li 5.6k Jan 08, 2023
code for paper "Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning" by Zhongzheng Ren*, Raymond A. Yeh*, Alexander G. Schwing.

Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning Overview This code is for paper: Not All Unlabeled Data are Equa

Jason Ren 22 Nov 23, 2022
Benchmarking Pipeline for Prediction of Protein-Protein Interactions

B4PPI Benchmarking Pipeline for the Prediction of Protein-Protein Interactions How this benchmarking pipeline has been built, and how to use it, is de

Loïc Lannelongue 4 Jun 27, 2022
A pyparsing-based library for parsing SOQL statements

CONTRIBUTORS WANTED!! Installation pip install python-soql-parser or, with poetry poetry add python-soql-parser Usage from python_soql_parser import p

Kicksaw 0 Jun 07, 2022
PyTorch implementation of the paper Dynamic Token Normalization Improves Vision Transfromers.

Dynamic Token Normalization Improves Vision Transformers This is the PyTorch implementation of the paper Dynamic Token Normalization Improves Vision T

Wenqi Shao 20 Oct 09, 2022
Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order of magnitude using coresets and data selection.

COResets and Data Subset selection Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order

decile-team 244 Jan 09, 2023
Discovering and Achieving Goals via World Models

Discovering and Achieving Goals via World Models [Project Website] [Benchmark Code] [Video (2min)] [Oral Talk (13min)] [Paper] Russell Mendonca*1, Ole

Oleg Rybkin 71 Dec 22, 2022
A toolkit for developing and comparing reinforcement learning algorithms.

Status: Maintenance (expect bug fixes and minor updates) OpenAI Gym OpenAI Gym is a toolkit for developing and comparing reinforcement learning algori

OpenAI 29.6k Jan 08, 2023
Julia package for multiway (inverse) covariance estimation.

TensorGraphicalModels TensorGraphicalModels.jl is a suite of Julia tools for estimating high-dimensional multiway (tensor-variate) covariance and inve

Wayne Wang 3 Sep 23, 2022
African language Speech Recognition - Speech-to-Text

Swahili-Speech-To-Text Table of Contents Swahili-Speech-To-Text Overview Scenario Approach Project Structure data: models: notebooks: scripts tests: l

2 Jan 05, 2023
A general, feasible, and extensible framework for classification tasks.

Pytorch Classification A general, feasible and extensible framework for 2D image classification. Features Easy to configure (model, hyperparameters) T

Eugene 26 Nov 22, 2022
Fine-Tune EleutherAI GPT-Neo to Generate Netflix Movie Descriptions in Only 47 Lines of Code Using Hugginface And DeepSpeed

GPT-Neo-2.7B Fine-Tuning Example Using HuggingFace & DeepSpeed Installation cd venv/bin ./pip install -r ../../requirements.txt ./pip install deepspe

Nikita 180 Jan 05, 2023
For IBM Quantum Challenge Africa 2021, 9 September (07:00 UTC) - 20 September (23:00 UTC).

IBM Quantum Challenge Africa 2021 To ensure Africa is able to apply quantum computing to solve problems relevant to the continent, the IBM Research La

Qiskit Community 48 Dec 25, 2022
The official PyTorch code for 'DER: Dynamically Expandable Representation for Class Incremental Learning' accepted by CVPR2021

DER.ClassIL.Pytorch This repo is the official implementation of DER: Dynamically Expandable Representation for Class Incremental Learning (CVPR 2021)

rhyssiyan 108 Jan 01, 2023
SHRIMP: Sparser Random Feature Models via Iterative Magnitude Pruning

SHRIMP: Sparser Random Feature Models via Iterative Magnitude Pruning This repository is the official implementation of "SHRIMP: Sparser Random Featur

Bobby Shi 0 Dec 16, 2021