Joint deep network for feature line detection and description

Related tags

Deep LearningSOLD2
Overview

SOLD² - Self-supervised Occlusion-aware Line Description and Detection

This repository contains the implementation of the paper: SOLD² : Self-supervised Occlusion-aware Line Description and Detection, J-T. Lin*, R. Pautrat*, V. Larsson, M. Oswald and M. Pollefeys (Oral at CVPR 2021).

SOLD² is a deep line segment detector and descriptor that can be trained without hand-labelled line segments and that can robustly match lines even in the presence of occlusion.

Demos

Matching in the presence of occlusion: demo_occlusion

Matching with a moving camera: demo_moving_camera

Usage

Installation

We recommend using this code in a Python environment (e.g. venv or conda). The following script installs the necessary requirements with pip:

pip install -r requirements.txt

Set your dataset and experiment paths (where you will store your datasets and checkpoints of your experiments) by modifying the file config/project_config.py. Both variables DATASET_ROOT and EXP_PATH have to be set.

You can download the version of the Wireframe dataset that we used during our training and testing here. This repository also includes some files to train on the Holicity dataset to add more outdoor images, but note that we did not extensively test this dataset and the original paper was based on the Wireframe dataset only.

Training your own model

All training parameters are located in configuration files in the folder config. Training SOLD² from scratch requires several steps, some of which taking several days, depending on the size of your dataset.

Step 1: Train on a synthetic dataset

The following command will create the synthetic dataset and start training the model on it:

python experiment.py --mode train --dataset_config config/synthetic_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_synth
Step 2: Export the raw pseudo ground truth on the Wireframe dataset with homography adaptation

Note that this step can take one to several days depending on your machine and on the size of the dataset. You can set the batch size to the maximum capacity that your GPU can handle.

python experiment.py --exp_name wireframe_train --mode export --resume_path <path to your previously trained sold2_synth> --model_config config/train_detector.yaml --dataset_config config/wireframe_dataset.yaml --checkpoint_name <name of the best checkpoint> --export_dataset_mode train --export_batch_size 4

You can similarly perform the same for the test set:

python experiment.py --exp_name wireframe_test --mode export --resume_path <path to your previously trained sold2_synth> --model_config config/train_detector.yaml --dataset_config config/wireframe_dataset.yaml --checkpoint_name <name of the best checkpoint> --export_dataset_mode test --export_batch_size 4
Step3: Compute the ground truth line segments from the raw data
cd postprocess
python convert_homography_results.py <name of the previously exported file (e.g. "wireframe_train.h5")> <name of the new data with extracted line segments (e.g. "wireframe_train_gt.h5")> ../config/export_line_features.yaml
cd ..

We recommend testing the results on a few samples of your dataset to check the quality of the output, and modifying the hyperparameters if need be. Using a detect_thresh=0.5 and inlier_thresh=0.99 proved to be successful for the Wireframe dataset in our case for example.

Step 4: Train the detector on the Wireframe dataset

We found it easier to pretrain the detector alone first, before fine-tuning it with the descriptor part. Uncomment the lines 'gt_source_train' and 'gt_source_test' in config/wireframe_dataset.yaml and fill them with the path to the h5 file generated in the previous step.

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_wireframe

Alternatively, you can also fine-tune the already trained synthetic model:

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_wireframe --pretrained --pretrained_path <path ot the pre-trained sold2_synth> --checkpoint_name <name of the best checkpoint>

Lastly, you can resume a training that was stopped:

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_wireframe --resume --resume_path <path to the  model to resume> --checkpoint_name <name of the last checkpoint>
Step 5: Train the full pipeline on the Wireframe dataset

You first need to modify the field 'return_type' in config/wireframe_dataset.yaml to 'paired_desc'. The following command will then train the full model (detector + descriptor) on the Wireframe dataset:

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_full_pipeline.yaml --exp_name sold2_full_wireframe --pretrained --pretrained_path <path ot the pre-trained sold2_wireframe> --checkpoint_name <name of the best checkpoint>

Pretrained models

We provide the checkpoints of two pretrained models:

How to use it

We provide a notebook showing how to use the trained model of SOLD². Additionally, you can use the model to export line features (segments and descriptor maps) as follows:

python export_line_features.py --img_list <list to a txt file containing the path to all the images> --output_folder <path to the output folder> --checkpoint_path <path to your best checkpoint,>

You can tune some of the line detection parameters in config/export_line_features.yaml, in particular the 'detect_thresh' and 'inlier_thresh' to adapt them to your trained model and type of images.

Results

Comparison of repeatability and localization error to the state of the art on the Wireframe dataset for an error threshold of 5 pixels in structural and orthogonal distances:

Structural distance Orthogonal distance
Rep-5 Loc-5 Rep-5 Loc-5
LCNN 0.434 2.589 0.570 1.725
HAWP 0.451 2.625 0.537 1.725
DeepHough 0.419 2.576 0.618 1.720
TP-LSD TP512 0.563 2.467 0.746 1.450
LSD 0.358 2.079 0.707 0.825
Ours with NMS 0.557 1.995 0.801 1.119
Ours 0.616 2.019 0.914 0.816

Matching precision-recall curves on the Wireframe and ETH3D datasets: pred_lines_pr_curve

Bibtex

If you use this code in your project, please consider citing the following paper:

@InProceedings{Pautrat_Lin_2021_CVPR,
    author = {Pautrat, Rémi* and Juan-Ting, Lin* and Larsson, Viktor and Oswald, Martin R. and Pollefeys, Marc},
    title = {SOLD²: Self-supervised Occlusion-aware Line Description and Detection},
    booktitle = {Computer Vision and Pattern Recognition (CVPR)},
    year = {2021},
}
Owner
Computer Vision and Geometry Lab
Computer Vision and Geometry Lab
SmoothGrad implementation in PyTorch

SmoothGrad implementation in PyTorch PyTorch implementation of SmoothGrad: removing noise by adding noise. Vanilla Gradients SmoothGrad Guided backpro

SSKH 143 Jan 05, 2023
HyDiff: Hybrid Differential Software Analysis

HyDiff: Hybrid Differential Software Analysis This repository provides the tool and the evaluation subjects for the paper HyDiff: Hybrid Differential

Yannic Noller 22 Oct 20, 2022
Example scripts for the detection of lanes using the ultra fast lane detection model in ONNX.

Example scripts for the detection of lanes using the ultra fast lane detection model in ONNX.

Ibai Gorordo 35 Sep 07, 2022
CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning

CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning This repository contains the code and relevant instructions

XiaoMing 5 Aug 19, 2022
Codeflare - Scale complex AI/ML pipelines anywhere

Scale complex AI/ML pipelines anywhere CodeFlare is a framework to simplify the integration, scaling and acceleration of complex multi-step analytics

CodeFlare 169 Nov 29, 2022
(ICCV 2021) Official code of "Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing."

Dressing in Order (DiOr) 👚 [Paper] 👖 [Webpage] 👗 [Running this code] The official implementation of "Dressing in Order: Recurrent Person Image Gene

Aiyu Cui 277 Dec 28, 2022
Real-time Neural Representation Fusion for Robust Volumetric Mapping

NeuralBlox: Real-Time Neural Representation Fusion for Robust Volumetric Mapping Paper | Supplementary This repository contains the implementation of

ETHZ ASL 106 Dec 24, 2022
Official implementation of EfficientPose

EfficientPose This is the official implementation of EfficientPose. We based our work on the Keras EfficientDet implementation xuannianz/EfficientDet

2 May 17, 2022
Subnet Replacement Attack: Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks

Subnet Replacement Attack: Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks Official implementation of paper Towards Practic

Xiangyu Qi 8 Dec 30, 2022
Low Complexity Channel estimation with Neural Network Solutions

Interpolation-ResNet Invited paper for WSA 2021, called 'Low Complexity Channel estimation with Neural Network Solutions'. Low complexity residual con

Dianxin 10 Dec 10, 2022
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

Adversarial Robustness Toolbox (ART) is a Python library for Machine Learning Security. ART provides tools that enable developers and researchers to defend and evaluate Machine Learning models and ap

3.4k Jan 04, 2023
Face Recognition Attendance Project

Face-Recognition-Attendance-Project In This Project You will learn how to mark attendance using face recognition, Hello Guys This is Gautam Kumar, Thi

Gautam Kumar 1 Dec 03, 2022
Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling

TGraM Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling, Qibin He, Xian Sun, Zhiyuan Yan, Beibei Li, Kun Fu Abstract Rece

Qibin He 6 Nov 25, 2022
AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty

AugMix Introduction We propose AugMix, a data processing technique that mixes augmented images and enforces consistent embeddings of the augmented ima

Google Research 876 Dec 17, 2022
VIL-100: A New Dataset and A Baseline Model for Video Instance Lane Detection (ICCV 2021)

Preparation Please see dataset/README.md to get more details about our datasets-VIL100 Please see INSTALL.md to install environment and evaluation too

82 Dec 15, 2022
Image classification for projects and researches

This is a tool to help you quickly solve classification problems including: data analysis, training, report results and model explanation.

Nguyễn Trường Lâu 2 Dec 27, 2021
PyTorch implementation of Tacotron speech synthesis model.

tacotron_pytorch PyTorch implementation of Tacotron speech synthesis model. Inspired from keithito/tacotron. Currently not as much good speech quality

Ryuichi Yamamoto 279 Dec 09, 2022
SmartSim Infrastructure Library.

Home Install Documentation Slack Invite Cray Labs SmartSim SmartSim makes it easier to use common Machine Learning (ML) libraries like PyTorch and Ten

Cray Labs 139 Jan 01, 2023
Garbage Detection system which will detect objects based on whether it is plastic waste or plastics or just garbage.

Garbage Detection using Yolov5 on Jetson Nano 2gb Developer Kit. Garbage detection system which will detect objects based on whether it is plastic was

Rishikesh A. Bondade 2 May 13, 2022
A Python module for parallel optimization of expensive black-box functions

blackbox: A Python module for parallel optimization of expensive black-box functions What is this? A minimalistic and easy-to-use Python module that e

Paul Knysh 426 Dec 08, 2022