Contains code for Deep Kernelized Dense Geometric Matching

Related tags

Deep LearningDKM
Overview

DKM - Deep Kernelized Dense Geometric Matching

Contains code for Deep Kernelized Dense Geometric Matching

We provide pretrained models and code for evaluation and running on your own images. We do not curently provide code for training models, but you can basically copy paste the model code into your own training framework and run it.

Note that the performance of current models is greater than in the pre-print. This is due to continued development since submission.

Install

Run pip install -e .

Using a (Pretrained) Model

Models can be imported by:

from dkm import dkm_base
model = dkm_base(pretrained=True, version="v11")

This creates a model, and loads pretrained weights.

Running on your own images

from dkm import dkm_base
from PIL import Image
model = dkm_base(pretrained=True, version="v11")
im1, im2 = Image.open("im1.jpg"), Image.open("im2.jpg")
# Note that matches are produced in the normalized grid [-1, 1] x [-1, 1] 
dense_matches, dense_certainty = model.match(im1, im2)
# You may want to process these, e.g. we found dense_certainty = dense_certainty.sqrt() to work quite well in some cases.
# Sample 10000 sparse matches
sparse_matches, sparse_certainty = model.sample(dense_matches, dense_certainty, 10000)

Downloading Benchmarks

HPatches

First, make sure that the "data/hpatches" path exists. I usually prefer to do this by:

ln -s place/where/your/datasets/are/stored/hpatches data/hpatches

Then run (if you don't already have hpatches downloaded) bash scripts/download_hpatches.sh

Yfcc100m (OANet Split)

We use the split introduced by OANet, this split can be found from e.g. https://github.com/PruneTruong/DenseMatching

Megadepth (LoFTR Split)

Currently we do not support the LoFTR split, as we trained on one of the scenes used there. Future releases may support this split, stay tuned.

Scannet (SuperGlue Split)

We use the same split of scannet as superglue. LoFTR provides the split here: https://drive.google.com/drive/folders/1nTkK1485FuwqA0DbZrK2Cl0WnXadUZdc

Evaluation

Here we provide approximate performance numbers for DKM using this codebase. Note that the randomness involved in geometry estimation means that the numbers are not exact. (+- 0.5 typically)

HPatches

To evaluate on HPatches Homography Estimation, run:

from dkm import dkm_base
from dkm.benchmarks import HpatchesHomogBenchmark

model = dkm_base(pretrained=True, version="v11")
homog_benchmark = HpatchesHomogBenchmark("data/hpatches")
homog_benchmark.benchmark_hpatches(model)

Results

HPatches Homography Estimation

AUC
@3px @5px @10px
LoFTR (CVPR'21) 65.9 75.6 84.6
DKM (Ours) 71.2 80.6 88.7

Scannet Pose Estimation

Here we compare the performance on Scannet of models not trained on Scannet. (For reference we also include the version LoFTR specifically trained on Scannet)

AUC mAP
@5 @10 @20 @5 @10 @20
SuperGlue (CVPR'20) Trained on Megadepth 16.16 33.81 51.84 - - -
LoFTR (CVPR'21) Trained on Megadepth 16.88 33.62 50.62 - - -
LoFTR (CVPR'21) Trained on Scannet 22.06 40.8 57.62 - - -
PDCNet (CVPR'21) Trained on Megadepth 17.70 35.02 51.75 39.93 50.17 60.87
PDCNet+ (Arxiv) Trained on Megadepth 19.02 36.90 54.25 42.93 53.13 63.95
DKM (Ours) Trained on Megadepth 22.3 42.0 60.2 48.4 59.5 70.3
DKM (Ours) Trained on Megadepth Square root Confidence Sampling 22.9 43.6 61.4 51.2 62.1 72.0

Yfcc100m Pose Estimation

Here we compare to recent methods using a single forward pass. PDC-Net+ using multiple passes comes closer to our method, reaching AUC-5 of 37.51. However, comparing to that method is somewhat unfair as their inference is much slower.

AUC mAP
@5 @10 @20 @5 @10 @20
PDCNet (CVPR'21) 32.21 52.61 70.13 60.52 70.91 80.30
PDCNet+ (Arxiv) 34.76 55.37 72.55 63.93 73.81 82.74
DKM (Ours) 40.0 60.2 76.2 69.8 78.5 86.1

TODO

  • Add Model Code
  • Upload Pretrained Models
  • Add HPatches Homography Benchmark
  • Add More Benchmarks

Acknowledgement

We have used code and been inspired by (among others) https://github.com/PruneTruong/DenseMatching , https://github.com/zju3dv/LoFTR , and https://github.com/GrumpyZhou/patch2pix

BibTeX

If you find our models useful, please consider citing our paper!

@article{edstedt2022deep,
  title={Deep Kernelized Dense Geometric Matching},
  author={Edstedt, Johan and Wadenb{\"a}ck, M{\aa}rten and Felsberg, Michael},
  journal={arXiv preprint arXiv:2202.00667},
  year={2022}
}
Owner
Johan Edstedt
PhD Student at CVL LiU.
Johan Edstedt
Using Language Model to Bootstrap Human Activity Recognition Ambient Sensors Based in Smart Homes

Using Language Model to Bootstrap Human Activity Recognition Ambient Sensors Based in Smart Homes This repository is the official implementation of Us

Damien Bouchabou 0 Oct 18, 2021
A unified 3D Transformer Pipeline for visual synthesis

Overview This is the official repo for the paper: "NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion". NÜWA is a unified multimodal

Microsoft 2.6k Jan 03, 2023
The original implementation of TNDM used in the NeurIPS 2021 paper (no longer being updated)

TNDM - Targeted Neural Dynamical Modeling Note: This code is no longer being updated. The official re-implementation can be found at: https://github.c

1 Jul 21, 2022
This YoloV5 based model is fit to detect people and different types of land vehicles, and displaying their density on a fitted map, according to their coordinates and detected labels.

This YoloV5 based model is fit to detect people and different types of land vehicles, and displaying their density on a fitted map, according to their

Liron Bdolah 8 May 22, 2022
OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)

OCTIS : Optimizing and Comparing Topic Models is Simple! OCTIS (Optimizing and Comparing Topic models Is Simple) aims at training, analyzing and compa

MIND 478 Jan 01, 2023
ML model to classify between cats and dogs

Cats-and-dogs-classifier This is my first ML model which can classify between cats and dogs. Here the accuracy is around 75%, however , the accuracy c

Sharath V 4 Aug 20, 2021
Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Features"

EDM-subgenre-classifier This repository contains the code for "Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Fea

11 Dec 20, 2022
Framework for joint representation learning, evaluation through multimodal registration and comparison with image translation based approaches

CoMIR: Contrastive Multimodal Image Representation for Registration Framework 🖼 Registration of images in different modalities with Deep Learning 🤖

Methods for Image Data Analysis - MIDA 55 Dec 09, 2022
Low-dose Digital Mammography with Deep Learning

Impact of loss functions on the performance of a deep neural network designed to restore low-dose digital mammography ====== This repository contains

WANG-AXIS 6 Dec 13, 2022
BridgeGAN - Tensorflow implementation of Bridging the Gap between Label- and Reference-based Synthesis in Multi-attribute Image-to-Image Translation.

Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021) Tensorflow implementation of Bridging the Gap between Label- and Reference-ba

huangqiusheng 8 Jul 13, 2022
Diffgram - Supervised Learning Data Platform

Data Annotation, Data Labeling, Annotation Tooling, Training Data for Machine Learning

Diffgram 1.6k Jan 07, 2023
The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory

This repository contains the software implementation of most algorithms used or developed in my research. The LaTeX and Python code for generating the

João Fonseca 3 Jan 03, 2023
Small utility to demangle Nim symbols in callgrind files

nim_callgrind A small utility to demangle Nim symbols from callgrind files. Usage Run your (Nim) program with something like this: valgrind --tool=cal

kraptor 3 Feb 15, 2022
Learning Saliency Propagation for Semi-supervised Instance Segmentation

Learning Saliency Propagation for Semi-supervised Instance Segmentation PyTorch Implementation This repository contains: the PyTorch implementation of

Berkeley DeepDrive 68 Oct 18, 2022
This is a clean and robust Pytorch implementation of DQN and Double DQN.

DQN/DDQN-Pytorch This is a clean and robust Pytorch implementation of DQN and Double DQN. Here is the training curve: All the experiments are trained

XinJingHao 15 Dec 27, 2022
CS550 Machine Learning course project on CNN Detection.

CNN Detection (CS550 Machine Learning Project) Team Members (Tensor) : Yadava Kishore Chodipilli (11940310) Thashmitha BS (11941250) This is a work do

yaadava_kishore 2 Jan 30, 2022
A trusty face recognition research platform developed by Tencent Youtu Lab

Introduction TFace: A trusty face recognition research platform developed by Tencent Youtu Lab. It provides a high-performance distributed training fr

Tencent 956 Jan 01, 2023
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers Authors: Jaemin Cho, Abhay Zala, and Mohit Bansal (

Jaemin Cho 98 Dec 15, 2022
SelfRemaster: SSL Speech Restoration

SelfRemaster: Self-Supervised Speech Restoration Official implementation of SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesi

Takaaki Saeki 46 Jan 07, 2023
Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs

Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs ArXiv Abstract Convolutional Neural Networks (CNNs) have become the de f

Philipp Benz 12 Oct 24, 2022