PyTorch implementation of the paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery"

Last update: Dec 10, 2022

Overview

SegSwap


If our project is helpful for your research, please consider citing:

@article{shen2021learning,
  title={Learning Co-segmentation by Segment Swapping for Retrieval and Discovery},
  author={Shen, Xi and Efros, Alexei A and Joulin, Armand and Aubry, Mathieu},
  journal={arXiv},
  year={2021}
}

1. Installation

1.1. Dependencies

Our model can be trained on a single GPU (Tesla V100, 16GB). The code has been tested with PyTorch 1.7.1 and CUDA 10.2.

Other dependencies (tqdm, kornia, opencv-python, scipy) can be installed via:

bash requirement.sh
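
After installation, it can be useful to confirm that the dependencies listed above are importable. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the only assumption beyond the list above is that opencv-python is imported under the name `cv2`.

```python
import importlib.util

# Import names for the dependencies listed above.
# Note: the opencv-python package is imported as "cv2".
REQUIRED_MODULES = ["torch", "tqdm", "kornia", "cv2", "scipy"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

This only checks that the modules can be located; it does not verify the versions mentioned above.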

1.2. Pre-trained MocoV2-resnet50 + cross-transformer (~300M)

Quick download :

cd model/pretrained
bash download_model.sh

2. Training Data Generation

2.1. Download COCO (~20G)

This command will download the COCO 2017 training set and annotations (~20G).

cd data/COCO2017
bash download_coco.sh
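
Since the download and the generated pairs are large, a quick free-space check before starting may save time. This is a rough sketch using only the approximate sizes quoted in this README; the function names are hypothetical and the check is not part of the repository.

```python
import shutil

# Approximate sizes quoted in this README, in GB.
COCO_GB = 20        # COCO 2017 training set + annotations
PAIRS_1OBJ_GB = 18  # 100k pairs with one repeated object
PAIRS_2OBJ_GB = 18  # 100k pairs with two repeated objects


def free_space_gb(path="."):
    """Free space on the filesystem containing `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_room(required_gb, path="."):
    """True if at least `required_gb` GB are free on the filesystem of `path`."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    needed = COCO_GB + PAIRS_1OBJ_GB + PAIRS_2OBJ_GB
    print(f"Need roughly {needed} GB, have {free_space_gb('.'):.1f} GB free")
    print("Enough space:", has_room(needed))
```

The total is only an estimate; intermediate files, if any, are not accounted for.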

2.2. Image Pairs with One Repeated Object

2.2.1 Generating 100k pairs (~18G)

This command will generate 100k image pairs with one repeated object.

cd data/
python generate_1obj.py --out-dir pairs_1obj_100k

2.2.2 Examples of image pairs

Source	Blended Obj + Background	Stylised Source	Stylised Background

2.2.3 Visualizing correspondences and masks of the generated pairs

This command will generate 10 pairs and visualize correspondences and masks of the pairs.

cd data/
bash vis_pair.sh

The pairs can be viewed by opening vis10_1obj/vis.html in a browser.

2.3. Image Pairs with Two Repeated Objects

2.3.1 Generating 100k pairs (~18G)

This command will generate 100k image pairs with two repeated objects.

cd data/
python generate_2obj.py --out-dir pairs_2obj_100k

2.3.2 Examples of image pairs

Source	Blended Obj + Background	Stylised Source	Stylised Background

2.3.3 Visualizing correspondences and masks of the generated pairs

This command will generate 10 pairs and visualize correspondences and masks of the pairs.

cd data/
bash vis_pair.sh

The pairs can be viewed by opening vis10_2obj/vis.html in a browser.

3. Evaluation

3.1 One-shot Art Detail Detection on Brueghel Dataset

3.1.1 Visual results: top-3 retrieved images

3.1.2 Data

The Brueghel dataset is included in this repo.

3.1.3 Quantitative results

The following command runs the evaluation on Brueghel with the pre-trained cross-transformer:

cd evalBrueghel
python evalBrueghel.py --out-coarse out_brueghel.json --resume-pth ../model/hard_mining_neg5.pth --label-pth ../data/Brueghel/brueghelTest.json

Note that this command will save the Brueghel features (~10G).

3.2 Place Recognition on Tokyo247 Dataset

3.2.1 Visual results: top-3 retrieved images

3.2.2 Data

Download Tokyo247 from its project page.

Download the top-100 results used by Patch-NetVLAD (~1G).

The data needs to be organised as follows:

./SegSwap/data/Tokyo247
                    ├── query/
                        ├── 247query_subset_v2/
                    ├── database/
...

./SegSwap/evalTokyo
                    ├── top100_patchVlad.npy
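
Before launching the evaluation, the layout above can be checked programmatically. The sketch below is illustrative only: `missing_paths` is a hypothetical helper, and the listed paths are the ones shown in the tree above, taken relative to the SegSwap root.

```python
from pathlib import Path

# Paths from the layout above, relative to the SegSwap root directory.
TOKYO_LAYOUT = [
    "data/Tokyo247/query/247query_subset_v2",
    "data/Tokyo247/database",
    "evalTokyo/top100_patchVlad.npy",
]


def missing_paths(root, relative_paths):
    """Return the entries of `relative_paths` that do not exist under `root`."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the SegSwap root.
    missing = missing_paths(".", TOKYO_LAYOUT)
    for p in missing:
        print("missing:", p)
    if not missing:
        print("Tokyo247 layout looks complete.")
```

The same helper can be given a different list of paths for the other datasets described in this section.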

3.2.3 Quantitative results

The following command runs the evaluation on Tokyo247 with the pre-trained cross-transformer:

cd evalTokyo
python evalTokyo.py --qry-dir ../data/Tokyo247/query/247query_subset_v2 --db-dir ../data/Tokyo247/database --resume-pth ../model/hard_mining_neg5.pth

3.3 Place Recognition on Pitts30K Dataset

3.3.1 Visual results: top-3 retrieved images

3.3.2 Data

Download the Pittsburgh dataset from its project page.

Download the top-100 results used by Patch-NetVLAD (~4G).

The data needs to be organised as follows:

./SegSwap/data/Pitts
                ├── queries_real/
...

./SegSwap/evalPitts
                    ├── top100_patchVlad.npy

3.3.3 Quantitative results

The following command runs the evaluation on Pittsburgh30K with the pre-trained cross-transformer:

cd evalPitts
python evalPitts.py --qry-dir ../data/Pitts/queries_real --db-dir ../data/Pitts --resume-pth ../model/hard_mining_neg5.pth

3.4 Discovery on Internet Dataset

3.4.1 Visual results

3.4.2 Data

Download the Internet dataset from its project page.

We provide a script to quickly download and preprocess the data (~400M):

cd data/Internet
bash download_int.sh

The data needs to be organised as follows:

./SegSwap/data/Internet
                ├── Airplane100
                    ├── GroundTruth                
                ├── Horse100
                    ├── GroundTruth                
                ├── Car100
                    ├── GroundTruth

3.4.3 Quantitative results

The following commands run the evaluation on Internet with the pre-trained cross-transformer:

cd evalInt
bash run_pair_480p.sh
bash run_best_only_cycle.sh

4. Training

Stage 1: standard training

We assume that the generated pairs are saved in ./SegSwap/data/pairs_1obj_100k and ./SegSwap/data/pairs_2obj_100k.

The training command can be found in ./SegSwap/train/run.sh.

Note that this command should run on a single GPU with 16G of memory.

cd train
bash run.sh

Stage 2: hard mining

In train/run_hardmining.sh, set --resume-pth to the model trained in the first stage, then run:

cd train
bash run_hardmining.sh
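
The edit to --resume-pth can also be scripted. The sketch below assumes the script passes the checkpoint as `--resume-pth <path>` (or `--resume-pth=<path>`); the actual contents of run_hardmining.sh are not shown here, so the example command line and the helper name `set_resume_pth` are hypothetical.

```python
import re


def set_resume_pth(script_text, checkpoint_path):
    """Replace the value following every `--resume-pth` flag in the given text."""
    pattern = re.compile(r"(--resume-pth[ =]+)(\S+)")
    # A function is used as the replacement so that backslashes in the
    # checkpoint path are inserted literally.
    return pattern.sub(lambda m: m.group(1) + checkpoint_path, script_text)


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    example = "python train.py --resume-pth old_model.pth --other-flag 1"
    print(set_resume_pth(example, "stage1/best_model.pth"))
```

To apply it to the real script, read the file contents, pass them through `set_resume_pth`, and write the result back; keeping a backup of the original script is advisable.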

5. Acknowledgement

We appreciate help from:

the authors of Patch-NetVLAD, who shared their top-100 lists on Tokyo247 and Pitts30K with us.
Dr. Relja Arandjelović, for providing the Tokyo247 and Pitts30K datasets.
public code such as Kornia.

Part of the code is borrowed from our previous projects: ArtMiner and Watermark.

6. ChangeLog

21/10/21, model, evaluation + training released

7. License

This code is distributed under an MIT LICENSE.

Note that our code depends on other libraries, including Kornia and PyTorch, and uses datasets that each have their own licenses, which must also be followed.
