Line as a Visual Sentence: Context-aware Line Descriptor for Visual Localization

Last update: Dec 27, 2022

Line as a Visual Sentence with LineTR

This repository contains the inference code, pretrained model, and demo scripts for the following paper. It supports both point features (SuperPoint) and line features (LSD+LineTR).

@article{syoon-2021-linetr,
  author    = {Sungho Yoon and Ayoung Kim},
  title     = {{Line as a Visual Sentence}: Context-aware Line Descriptor for Visual Localization},
  journal   = {IEEE Robotics and Automation Letters},
  year      = {2021}
}

Abstract

Along with feature points for image matching, line features provide additional constraints to solve visual geometric problems in robotics and computer vision (CV). Although recent convolutional neural network (CNN)-based line descriptors are promising for viewpoint changes or dynamic environments, we claim that the CNN architecture has innate disadvantages to abstract variable line length into the fixed-dimensional descriptor. In this paper, we effectively introduce the Line-Transformer dealing with variable lines. Inspired by natural language processing (NLP) tasks where sentences can be understood and abstracted well in neural nets, we view a line segment as a sentence that contains points (words). By attending to well-describable points on a line dynamically, our descriptor performs excellently on variable line length. We also propose line signature networks sharing the line's geometric attributes to neighborhoods. Performing as group descriptors, the networks enhance line descriptors by understanding lines' relative geometries. Finally, we present the proposed line descriptor and matching in a Point and Line Localization (PL-Loc). We show that the visual localization with feature points can be improved using our line features. We validate the proposed method for homography estimation and visual localization.
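The "line as a sentence" idea can be illustrated with a small sketch that turns a line segment into a variable number of point tokens ("words") and pads them to a fixed length with a validity mask, so that an attention layer could ignore the padding. This is only an illustration of the concept, not the repository's implementation: the function name `line_to_tokens`, the sampling interval, and the token cap are all assumptions chosen for the example.

```python
import numpy as np

def line_to_tokens(p0, p1, interval=8.0, max_tokens=16):
    """Sample points along a segment; interval and max_tokens are hypothetical."""
    p0 = np.asarray(p0, dtype=np.float64)
    p1 = np.asarray(p1, dtype=np.float64)
    length = np.linalg.norm(p1 - p0)
    # Longer lines yield more point tokens, bounded to [2, max_tokens].
    n = int(np.clip(np.floor(length / interval) + 1, 2, max_tokens))
    t = np.linspace(0.0, 1.0, n)[:, None]
    points = (1.0 - t) * p0 + t * p1           # (n, 2), ordered from p0 to p1
    # Pad to a fixed size; the mask marks which rows are real tokens.
    tokens = np.zeros((max_tokens, 2))
    mask = np.zeros(max_tokens, dtype=bool)
    tokens[:n] = points
    mask[:n] = True
    return tokens, mask
```

In a real descriptor, each sampled position would index into a dense feature map to fetch a point embedding; the mask is what lets one network handle short and long lines alike.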

Getting Started

This code was tested with Python 3.6 and PyTorch 1.8 on Ubuntu 18.04.

# create and activate a new conda environment
conda create -y --name linetr
conda activate linetr

# install the dependencies
conda install -y python=3.6
pip install -r requirements.txt

Command

There are two demo scripts:

demo_LineTR.py : run a live demo on a camera or video file
match_line_pairs.py : find line correspondences for the image pairs listed in input_pairs.txt
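The layout of input_pairs.txt is not described here. As a rough sketch, assuming each non-empty line starts with two whitespace-separated image paths (any further fields are ignored), a pairs file could be read as follows. The helper name `read_image_pairs` and the comment-line handling are hypothetical and not taken from the repository.

```python
def read_image_pairs(text):
    """Parse pair lines into (image0, image1) tuples; the format is an assumption."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank, malformed, or '#'-commented lines.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        pairs.append((fields[0], fields[1]))
    return pairs
```

For a file on disk, the text could be obtained with `pathlib.Path("input_pairs.txt").read_text()` before calling the helper.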

Keyboard control:

n: select the current frame as the anchor
e/r: increase/decrease the keypoint confidence threshold
d/f: increase/decrease the nearest neighbor matching threshold for keypoints
c/v: increase/decrease the nearest neighbor matching threshold for keylines
k: toggle the visualization of keypoints
q: quit
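To give a sense of what a nearest neighbor matching threshold controls, here is a minimal sketch of mutual nearest neighbor matching between two non-empty descriptor sets with a maximum distance. It is one common scheme and an assumption on our part; the demo scripts may match differently. The name `mutual_nn_match` and the default `max_dist` are hypothetical.

```python
import numpy as np

def mutual_nn_match(desc0, desc1, max_dist=0.8):
    """Return (i, j) pairs that are mutual nearest neighbors within max_dist."""
    desc0 = np.asarray(desc0, dtype=np.float64)
    desc1 = np.asarray(desc1, dtype=np.float64)
    # Pairwise Euclidean distances, shape (N0, N1).
    dists = np.linalg.norm(desc0[:, None, :] - desc1[None, :, :], axis=2)
    nn01 = dists.argmin(axis=1)  # nearest row of desc1 for each row of desc0
    nn10 = dists.argmin(axis=0)  # nearest row of desc0 for each row of desc1
    matches = []
    for i, j in enumerate(nn01):
        # Keep the pair only if the choice is mutual and close enough.
        if nn10[j] == i and dists[i, j] <= max_dist:
            matches.append((i, int(j)))
    return matches
```

Under this scheme, lowering the threshold keeps fewer but more reliable matches, and raising it admits more candidates at the cost of more outliers.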

The scripts partially reuse code from SuperGluePretrainedNetwork.

BibTeX Citation

@ARTICLE{syoon-2021-linetr,
  author    = {Sungho Yoon and Ayoung Kim},
  title     = {{Line as a Visual Sentence}: Context-aware Line Descriptor for Visual Localization},
  journal   = {IEEE Robotics and Automation Letters},
  year      = {2021},
  url       = {https://arxiv.org/abs/2109.04753}
}

Acknowledgment

This work was fully supported by the [Localization in changing city] project funded by NAVER LABS Corporation.

Owner

SungHo Yoon
