The codes and related files to reproduce the results for Image Similarity Challenge Track 1.

Overview

ISC-Track1-Submission

The codes and related files to reproduce the results for Image Similarity Challenge Track 1.

Required dependencies

To begin with, you should install the following packages with the specified versions in Python, Anaconda. Other versions may work but please do NOT try. For instance, cuda 11.0 has some bugs which bring very bad results. The hardware chosen is Nvidia Tesla V100 and Intel CPU. Other hardware, such as A100, may work but please do NOT try. The stability is not guaranteed, for instance, the Ampere architecture is not suitable and some instability is observed. Please do NOT use AMD CPU, such as EPYC, we observe some instability on DGX server.

  • python 3.7.10
  • pytorch 1.7.1 with cuda 10.1
  • faiss-gpu 1.7.1 with cuda 10.1
  • h5py 3.4.0
  • pandas 1.3.3
  • sklearn 1.0
  • skimage 0.18.3
  • PIL 8.3.2
  • cv2 4.5.3.56
  • numpy 1.16.0
  • torchvision 0.8.2 with cuda 10.1
  • augly 0.1.4
  • selectivesearch 0.4
  • face-recognition 1.3.0 (with dlib of gpu-version)
  • tqdm 4.62.3
  • requests 2.26.0
  • seaborn 0.11.2
  • mkl 2.4.0
  • loguru 0.5.3

Note: Some unimportant packages may be missing, please install them using pip directly when an error occurs.

Pre-trained models

We use three pre-trained models. They are all pre-trained on ImageNet unsupervisedly. To be convenient, we first directly give the pre-trained models as follows, then also the training codes are given.

The first backbone: ResNet-50; The second backbone: ResNet-152; The third backbone: ResNet-50-IBN.

For ResNet-50, we do not pre-train it by ourselves. It is directly downloaded from here. It is supplied by Facebook Research, and the project is Barlow Twins. You should rename it to resnet50_bar.pth.

For ResNet-152 and ResNet-50-IBN, we use the official codes of Momentum2-teacher. We only change the backbone to ResNet-152 and ResNet-50-IBN. It takes about 2 weeks to pre-train the ResNet-152, and 1 week to pre-train the ResNet-50-IBN on 8 V100 GPUs. To be convenient, we supply the whole pre-training codes in the Pretrain folder. The related readme file is also given in that folder.

It should be noted that pre-training processing plays a very important role in our algorithm. Therefore, if you want to reproduce the pre-trained results, please do NOT change the number of GPUs, the batch size, and other related hyper-parameters.

Training

For training, we generate 11 datasets. For each dataset, 3 models with different backbones are trained. Each training takes about/less than 1 day on 4 V100 GPUs (bigger backbone takes longer and smaller backbone takes shorter). The whole training codes, including how to generate training datasets and the link to the generated datasets, are given in the Training folder. For more details, please refer to the readme file in that folder.

Test

To test the performance of the trained model, we perform multi-scale, multi-model, and multi-part testing and ensemble all the scores to get the final score. To be efficient, 33 V100 GPUs are suggested to use. The time for extracting all query images' features using 33 V100 GPUs is about 3 hours. Also extracting and storing training and reference images' features take a lot of time. Please be patient and prepare enough storage to reproduce the testing process. We give all the information to generate our final results in the Test folder. Please reproduce the results according to the readme file in that folder.

Owner
Wenhao Wang
I am a student from Beihang University. My research interests include person re-identification, unsupervised domain adaptation, and domain generalization.
Wenhao Wang
Submodular Subset Selection for Active Domain Adaptation (ICCV 2021)

S3VAADA: Submodular Subset Selection for Virtual Adversarial Active Domain Adaptation ICCV 2021 Harsh Rangwani, Arihant Jain*, Sumukh K Aithal*, R. Ve

Video Analytics Lab -- IISc 13 Dec 28, 2022
A curated list of resources for Image and Video Deblurring

A curated list of resources for Image and Video Deblurring

Subeesh Vasu 1.7k Jan 01, 2023
A Graph Neural Network Tool for Recovering Dense Sub-graphs in Random Dense Graphs.

PYGON A Graph Neural Network Tool for Recovering Dense Sub-graphs in Random Dense Graphs. Installation This code requires to install and run the graph

Yoram Louzoun's Lab 0 Jun 25, 2021
A resource for learning about deep learning techniques from regression to LSTM and Reinforcement Learning using financial data and the fitness functions of algorithmic trading

A tour through tensorflow with financial data I present several models ranging in complexity from simple regression to LSTM and policy networks. The s

195 Dec 07, 2022
Pytorch implementation of AREL

Status: Archive (code is provided as-is, no updates expected) Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement

8 Nov 25, 2022
Simple API for UCI Machine Learning Dataset Repository (search, download, analyze)

A simple API for working with University of California, Irvine (UCI) Machine Learning (ML) repository Table of Contents Introduction About Page of the

Tirthajyoti Sarkar 223 Dec 05, 2022
Aesara is a Python library that allows one to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.

Aesara is a Python library that allows one to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.

Aesara 898 Jan 07, 2023
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data

FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data. Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well i

0 Sep 06, 2022
MAVE: : A Product Dataset for Multi-source Attribute Value Extraction

The dataset contains 3 million attribute-value annotations across 1257 unique categories on 2.2 million cleaned Amazon product profiles. It is a large, multi-sourced, diverse dataset for product attr

Google Research Datasets 89 Jan 08, 2023
PyTorch implementation of our ICCV 2021 paper Intrinsic-Extrinsic Preserved GANs for Unsupervised 3D Pose Transfer.

Unsupervised_IEPGAN This is the PyTorch implementation of our ICCV 2021 paper Intrinsic-Extrinsic Preserved GANs for Unsupervised 3D Pose Transfer. Ha

25 Oct 26, 2022
Use stochastic processes to generate samples and use them to train a fully-connected neural network based on Keras

Use stochastic processes to generate samples and use them to train a fully-connected neural network based on Keras which will then be used to generate residuals

Federico Lopez 2 Jan 14, 2022
A PyTorch implementation of DenseNet.

A PyTorch Implementation of DenseNet This is a PyTorch implementation of the DenseNet-BC architecture as described in the paper Densely Connected Conv

Brandon Amos 771 Dec 15, 2022
Code for "PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation" CVPR 2019 oral

Good news! We release a clean version of PVNet: clean-pvnet, including how to train the PVNet on the custom dataset. Use PVNet with a detector. The tr

ZJU3DV 722 Dec 27, 2022
Source code for ZePHyR: Zero-shot Pose Hypothesis Rating @ ICRA 2021

ZePHyR: Zero-shot Pose Hypothesis Rating ZePHyR is a zero-shot 6D object pose estimation pipeline. The core is a learned scoring function that compare

R-Pad - Robots Perceiving and Doing 18 Aug 22, 2022
Transformer - Transformer in PyTorch

Transformer 完成进度 Embeddings and PositionalEncoding with example. MultiHeadAttent

Tianyang Li 1 Jan 06, 2022
The official implementation for "FQ-ViT: Fully Quantized Vision Transformer without Retraining".

FQ-ViT [arXiv] This repo contains the official implementation of "FQ-ViT: Fully Quantized Vision Transformer without Retraining". Table of Contents In

132 Jan 08, 2023
Code for "Typilus: Neural Type Hints" PLDI 2020

Typilus A deep learning algorithm for predicting types in Python. Please find a preprint here. This repository contains its implementation (src/) and

47 Nov 08, 2022
Python implementation of ADD: Frequency Attention and Multi-View based Knowledge Distillation to Detect Low-Quality Compressed Deepfake Images, AAAI2022.

ADD: Frequency Attention and Multi-View based Knowledge Distillation to Detect Low-Quality Compressed Deepfake Images Binh M. Le & Simon S. Woo, "ADD:

2 Oct 24, 2022
Recovering Brain Structure Network Using Functional Connectivity

Recovering-Brain-Structure-Network-Using-Functional-Connectivity Framework: Papers: This repository provides a PyTorch implementation of the models ad

5 Nov 30, 2022
A Python-based development platform for automated trading systems - from backtesting to optimisation to livetrading.

AutoTrader AutoTrader is Python-based platform intended to help in the development, optimisation and deployment of automated trading systems. From sim

Kieran Mackle 485 Jan 09, 2023