D2LV: A Data-Driven and Local-Verification Approach for Image Copy Detection

Related tags

Deep LearningISC2021
Overview

Facebook AI Image Similarity Challenge: Matching Track —— Team: imgFp

This is the source code of our 3rd place solution to matching track of Image Similarity Challenge (ISC) 2021 organized by Facebook AI. This repo will tell you how to get our result step by step.

Method Overview

For the Matching Track task, we use a global and local dual retrieval method. The global recall model is EsViT, the same as task Descriptor Track. The local recall used SIFT point features. As shown in the figure, our pipeline is divided into four modules. When using an image for query, it is first put into the preprocessing module for overlay detection. Then the global and local features are extracted and retrieved in parallel. There are three recall branches: global recall, original local recall and cropped local recall. The last module will compute the matching score of three branches and merge them into the final result.

method_overview

Installation

Please install python 3.7, Pytorch 1.8 (or higher version) and some packages according to requirements.txt.

gcc version 7.3.1

We run on a 8GPUs (Tesla V100-SXM2-32GB, 32510.5MB), 48CPUs and 300G Memory machine.

Get Result Demo

Now we will describe how to get our result, we use a query image Q24789.jpg as input for demo.

step1: query images preprocess

We train a yolov5 to detect the crop augment in query images. The detils are in README.md of Team: AITechnology in task Descriptor Track. Due to different parameters, we need to preprocess the local recall and global recall respectively.

python preprocessing.py $origin_image_path $save_image_result_path

e.g.
______
cd preprocess
python preprocessing_global.py ../data/queryimages/ ../data/queryimages_crop_global/
python preprocessing_local.py ../data/queryimages/ ../data/queryimages_crop_local/

*note: If Arial.ttf download fails, please copy the local yolov5/Arial.ttf to the specified directory following the command line prompt. cp yolov5/Arial.ttf /root/.config/Ultralytics/Arial.ttf

step2: get original image's local feature

First export the path.

cd local_fea/feature_extract
export LD_LIBRARY_PATH=./extLib/ 

Run the executable program localfea_extract_sift to get the SIFT local point feature, and out to a txt file.

Usage: ./localfea_extract_sift 
    
     
     
      

e.g.
./localfea_extract_sift Q24789 ../../data/queryimages/Q24789.jpg ../feature_out/Q24789.txt

     
    
   

Or you can extract all query images by a list.

python multi_extract_sift.py ../../data/querylist_demo.txt ../../data/queryimages/ ../feature_out/

For example, two point features in a image result txt file are:

Q24789_0_3.1348_65.589_1.76567_-1.09404||0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,16,13,0,0,0,0,0,0,16,28,7,5,0,0,0,0,0,0,0,0,20,12,0,0,23,5,0,0,29,29,7,12,56,29,5,0,0,11,7,20,38,45,10,0,0,0,0,14,0,0,0,0,39,56,36,8,39,14,0,0,46,56,21,24,56,22,0,0,5,8,8,39,38,11,0,0,0,0,19,47,0,0,0,0,8,56,56,7,37,0,0,0,10,52,56,56,52,0,0,0,0,0,35,56,11,0,0,0,0,0,54,45
Q24789_1_8.26344_431.038_1.75921_1.22328||42,27,0,4,11,12,9,14,49,28,0,6,17,25,18,14,45,37,4,0,12,45,8,9,8,17,9,0,27,50,6,0,41,24,0,0,10,14,19,20,50,34,0,6,20,22,17,21,36,22,4,4,43,50,15,12,26,32,8,0,17,50,17,6,28,12,0,0,0,21,31,21,50,14,0,0,17,31,23,38,19,10,9,17,50,50,14,15,17,23,13,10,19,45,26,8,11,11,0,0,0,6,6,0,28,13,0,0,8,20,12,15,11,9,0,0,24,47,12,9,18,38,22,6,13,28,10,8
...

step3: retrieval use original image local feature

We use the GPU Faiss to retrieval, because there are about 600 million SIFT point features in reference images. They need about 165G GPU Memory for Float16 compute.

Firstly, you need extract all local features of reference images by multi_extract_sift.py and store them in uint8 type to save space. (ref_sift_fea_300.pkl (68G) and ref_sift_name_300.pkl (25G))

Then get original image local recall result:

cd local_fea/faiss_search
python db_search.py ../feature_out/ ../faiss_out/local_pair_result.txt

For example, the result txt file ../faiss_out/local_pair_result.txt:

Q24789.jpg,R540735.jpg

step4: get crop image's local feature (only for part images which have crop result)

Same as step2, but only use the croped image in ../../preprocess/local_crop_list.txt.

cd local_fea/feature_extract
python multi_extract_sift.py ../../preprocess/local_crop_list.txt ../../data/queryimages_crop_local/ ../crop_feature_out/

step5: retrieval use crop image local feature (only for part images which have crop result)

Same as step3:

cd local_fea/faiss_search
python db_search.py ../crop_feature_out/ ../crop_faiss_out/crop_local_pair_result.txt

step6: get image's global feature

We train a EsViT model (follow the rules closely) to extract 256 dims global features, the detils are in README.md of Team: AITechnology in task Descriptor Track.

*note: for global feature, if the image have croped image, we will extract feature use the croped image, else use the origin image.

Generate h5 descriptors for all query images and reference images as submission style:

cd global_fea/feature_extract
python predict_FB_model.py --model checkpoints/EsViT_SwinB_finetune_bs8_lr0.0001_adjustlr_0_margin1.0_dataFB_epoch200.pth  --save_h5_name fb_descriptors_demo.h5  --model_type EsViT_SwinB_W14 --query ./query_list_demo.txt --total ./ref_list_demo.txt

*note: The --query and --total parameters are specified as query list and reference list, respectively.

The h5 file will be saved in ./h5_descriptors/fb_descriptors.h5

step7: retrieval use image's global feature

We have already added our h5 file in phase 1. Use faiss to get top1 pairs.

cd global_fea/faiss_search
python faiss_topk.py ../feature_extract/h5_descriptors/fb_descriptors.h5 ./global_pair_result.txt

step8: compute match score and final result

We use the SIFT feature + KNN-matching (K=2) to compute match point as score. We have already compiled it into an executable program.

Usage: ./match_score 
    
     
      
      

      
     
    
   

For example, to get original image local pairs score:

cd match_score
export LD_LIBRARY_PATH=../local_fea/feature_extract/extLib/
./match_score ../local_fea/faiss_out/local_pair_result.txt ../data/queryimages ../data/referenceimages/ ./local_pair_score.txt

The other two recall pairs are the same:

global: 
./match_score ../global_fea/faiss_search/global_pair_result.txt ../data/queryimages_crop_global ../data/referenceimages/ ./global_pair_score.txt

crop local:
./match_score ../local_fea/crop_faiss_out/crop_local_pair_result.txt ../data/queryimages_crop_local ../data/referenceimages/ ./crop_local_pair_score.txt

Finally, the three recall pairs are merged by:

python merge_score.py ./final_result.txt

Others

If you have any problem or error during running code, please email to us.

Annotate datasets with a semi-trained or fully trained YOLOv5 model

YOLOv5 Auto Annotator Annotate datasets with a semi-trained or fully trained YOLOv5 model Prerequisites Ubuntu =20.04 Python =3.7 System dependencie

Akash James 3 May 14, 2022
Pynomial - a lightweight python library for implementing the many confidence intervals for the risk parameter of a binomial model

Pynomial - a lightweight python library for implementing the many confidence intervals for the risk parameter of a binomial model

Demetri Pananos 9 Oct 04, 2022
Taichi Course Homework Template

太极图形课S1-标题部分 这个作业未来或将是你的开源项目,标题的内容可以来自作业中的核心关键词,让读者一眼看出你所完成的工作/做出的好玩demo 如果暂时未想好,起名时可以参考“太极图形课S1-xxx作业” 如下是作业(项目)展开说明的方法,可以帮大家理清思路,并且也对读者非常友好,请小伙伴们多多参

TaichiCourse 30 Nov 19, 2022
Non-Attentive-Tacotron - This is Pytorch Implementation of Google's Non-attentive Tacotron.

Non-attentive Tacotron - PyTorch Implementation This is Pytorch Implementation of Google's Non-attentive Tacotron, text-to-speech system. There is som

Jounghee Kim 46 Dec 19, 2022
MoCoPnet - Deformable 3D Convolution for Video Super-Resolution

Deformable 3D Convolution for Video Super-Resolution Pytorch implementation of l

Xinyi Ying 28 Dec 15, 2022
PyTorch implementation for 3D human pose estimation

Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach This repository is the PyTorch implementation for the network presented in:

Xingyi Zhou 579 Dec 22, 2022
MCMC samplers for Bayesian estimation in Python, including Metropolis-Hastings, NUTS, and Slice

Sampyl May 29, 2018: version 0.3 Sampyl is a package for sampling from probability distributions using MCMC methods. Similar to PyMC3 using theano to

Mat Leonard 304 Dec 25, 2022
Implementing yolov4 target detection and tracking based on nao robot

Implementing yolov4 target detection and tracking based on nao robot

6 Apr 19, 2022
Histocartography is a framework bringing together AI and Digital Pathology

Documentation | Paper Welcome to the histocartography repository! histocartography is a python-based library designed to facilitate the development of

155 Nov 23, 2022
VIsually-Pivoted Audio and(N) Text

VIP-ANT: VIsually-Pivoted Audio and(N) Text Code for the paper Connecting the Dots between Audio and Text without Parallel Data through Visual Knowled

Yän.PnG 16 Nov 04, 2022
i-SpaSP: Structured Neural Pruning via Sparse Signal Recovery

i-SpaSP: Structured Neural Pruning via Sparse Signal Recovery This is a public code repository for the publication: i-SpaSP: Structured Neural Pruning

Cameron Ronald Wolfe 5 Nov 04, 2022
git《Tangent Space Backpropogation for 3D Transformation Groups》(CVPR 2021) GitHub:1]

LieTorch: Tangent Space Backpropagation Introduction The LieTorch library generalizes PyTorch to 3D transformation groups. Just as torch.Tensor is a m

Princeton Vision & Learning Lab 482 Jan 06, 2023
Autonomous Ground Vehicle Navigation and Control Simulation Examples in Python

Autonomous Ground Vehicle Navigation and Control Simulation Examples in Python THIS PROJECT IS CURRENTLY A WORK IN PROGRESS AND THUS THIS REPOSITORY I

Joshua Marshall 14 Dec 31, 2022
Fiddle is a Python-first configuration library particularly well suited to ML applications.

Fiddle Fiddle is a Python-first configuration library particularly well suited to ML applications. Fiddle enables deep configurability of parameters i

Google 227 Dec 26, 2022
wgan, wgan2(improved, gp), infogan, and dcgan implementation in lasagne, keras, pytorch

Generative Adversarial Notebooks Collection of my Generative Adversarial Network implementations Most codes are for python3, most notebooks works on C

tjwei 1.5k Dec 16, 2022
AniGAN: Style-Guided Generative Adversarial Networks for Unsupervised Anime Face Generation

AniGAN: Style-Guided Generative Adversarial Networks for Unsupervised Anime Face Generation AniGAN: Style-Guided Generative Adversarial Networks for U

Bing Li 81 Dec 14, 2022
😮The official implementation of "CoNeRF: Controllable Neural Radiance Fields" 😮

CoNeRF: Controllable Neural Radiance Fields This is the official implementation for "CoNeRF: Controllable Neural Radiance Fields" Project Page Paper V

Kacper Kania 61 Dec 24, 2022
Encode and decode text application

Text Encoder and Decoder Encode and decode text in many ways using this application! Encode in: ASCII85 Base85 Base64 Base32 Base16 Url MD5 Hash SHA-1

Alice 1 Feb 12, 2022
CyTran: Cycle-Consistent Transformers for Non-Contrast to Contrast CT Translation

CyTran: Cycle-Consistent Transformers for Non-Contrast to Contrast CT Translation We propose a novel approach to translate unpaired contrast computed

Nicolae Catalin Ristea 13 Jan 02, 2023
Unified MultiWOZ evaluation scripts for the context-to-response task.

MultiWOZ Context-to-Response Evaluation Standardized and easy to use Inform, Success, BLEU ~ See the paper ~ Easy-to-use scripts for standardized eval

Tomáš Nekvinda 38 Dec 13, 2022