The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21) : Descriptor Track.

Overview

ISC21-Descriptor-Track-1st

The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21) : Descriptor Track.

You can check our solution tech report from: Contrastive Learning with Large Memory Bank and Negative Embedding Subtraction for Accurate Copy Detection

setup

OS

Ubuntu 18.04

CUDA Version

11.1

environment

Run this for python env

conda env create -f environment.yml

data download

mkdir -p input/{query,reference,train}_images
aws s3 cp s3://drivendata-competition-fb-isc-data/all/query_images/ input/query_images/ --recursive --no-sign-request
aws s3 cp s3://drivendata-competition-fb-isc-data/all/reference_images/ input/reference_images/ --recursive --no-sign-request
aws s3 cp s3://drivendata-competition-fb-isc-data/all/train_images/ input/train_images/ --recursive --no-sign-request
aws s3 cp s3://drivendata-competition-fb-isc-data/all/query_images_phase2/ input/query_images_phase2/ --recursive --no-sign-request

train

Run below lines step by step.

cd exp

CUDA_VISIBLE_DEVICES=0,1,2,3 python v83.py \
  -a tf_efficientnetv2_m_in21ft1k --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 9 \
  --epochs 5 --lr 0.1 --wd 1e-6 --batch-size 128 --ncrops 2 \
  --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.0 \
  --input-size 256 --sample-size 1000000 --memory-size 20000 \
  ../input/training_images/
CUDA_VISIBLE_DEVICES=0,1,2,3 python v83.py \
  -a tf_efficientnetv2_m_in21ft1k --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 90 \
  --epochs 10 --lr 0.1 --wd 1e-6 --batch-size 128 --ncrops 2 \
  --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.0 \
  --input-size 256 --sample-size 1000000 --memory-size 20000 \
  --resume ./v83/train/checkpoint_0004.pth.tar \
  ../input/training_images/

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python v86.py \
  -a tf_efficientnetv2_m_in21ft1k --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 99 \
  --epochs 7 --lr 0.1 --wd 1e-6 --batch-size 128 --ncrops 2 \
  --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.0 \
  --input-size 384 --sample-size 1000000 --memory-size 20000 --weight ./v83/train/checkpoint_0005.pth.tar \
  ../input/training_images/

python v98.py \
  -a tf_efficientnetv2_m_in21ft1k --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 999 \
  --epochs 3 --lr 0.1 --wd 1e-6 --batch-size 64 --ncrops 2 \
  --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.0 --weight ./v86/train/checkpoint_0005.pth.tar \
  --input-size 512 --sample-size 1000000 --memory-size 20000 \
  ../input/training_images/

python v107.py \
  -a tf_efficientnetv2_m_in21ft1k --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 99999 \
  --epochs 10 --lr 0.5 --wd 1e-6 --batch-size 16 --ncrops 2 \
  --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.1 --weight ./v98/train/checkpoint_0001.pth.tar \
  --input-size 512 --sample-size 1000000 --memory-size 1000 \
  ../input/training_images/

The final model weight can be downloaded from here: https://drive.google.com/file/d/1ySea-NJp_J0aWvma_WmVbc3Hnwf5LHUf/view?usp=sharing You can execute inference code without run training with this model weight. To locate the model weight to suitable location, run following commands after downloaded the model weight.

mkdir -p exp/v107/train
mv checkpoint_009.pth.tar exp/v107/train/

inference

Note that faiss doesn't work with A100, so I used 4x GTX 1080 Ti for post-process.

cd exp

python v107.py -a tf_efficientnetv2_m_in21ft1k --batch-size 128 --mode extract --gem-eval-p 1.0 --weight ./v107/train/checkpoint_0009.pth.tar --input-size 512 --target-set qrt ../input/

# this script generates final prediction result files
python ../scripts/postprocess.py

Submission files are outputted here:

  • exp/v107/extract/v107_iso.h5 # descriptor track
  • exp/v107/extract/v107_iso.csv # matching track

descriptor track local evaluation score:

{
  "average_precision": 0.9479039085717805,
  "recall_p90": 0.9192546583850931
}
Comments
  • Bugs?

    Bugs?

    Congratulations! We really appreciate the work. When I run the

    python v107.py \
      -a tf_efficientnetv2_m_in21ft1k --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 99999 \
      --epochs 10 --lr 0.5 --wd 1e-6 --batch-size 16 --ncrops 2 \
      --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.1 --weight ./v98/train/checkpoint_0001.pth.tar \
      --input-size 512 --sample-size 1000000 --memory-size 1000 \
      ../input/training_images/
    

    I come across

    Traceback (most recent call last):                                              
      File "v107.py", line 774, in <module>
        train(args)
      File "v107.py", line 425, in train
        mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
      File "/home/wangwenhao/anaconda3/envs/ISC/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
        return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
      File "/home/wangwenhao/anaconda3/envs/ISC/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
        while not context.join():
      File "/home/wangwenhao/anaconda3/envs/ISC/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 150, in join
        raise ProcessRaisedException(msg, error_index, failed_process.pid)
    torch.multiprocessing.spawn.ProcessRaisedException: 
    
    -- Process 5 terminated with the following error:
    Traceback (most recent call last):
      File "/home/wangwenhao/anaconda3/envs/ISC/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
        fn(i, *args)
      File "/home/wangwenhao/fbisc-descriptor-1st/exp/v107.py", line 573, in main_worker
        train_one_epoch(train_loader, model, loss_fn, optimizer, scaler, epoch, args)
      File "/home/wangwenhao/fbisc-descriptor-1st/exp/v107.py", line 595, in train_one_epoch
        labels = torch.cat([torch.tile(i, dims=(args.ncrops,)), torch.tensor(j)])
    ValueError: only one element tensors can be converted to Python scalars
    

    Do you know how to fix it? Thanks.

    opened by WangWenhao0716 14
  • data augment is wrong

    data augment is wrong

    train_dataset = ISCDataset(
        train_paths,
        NCropsTransform(
            transforms.Compose(aug_moderate),
            transforms.Compose(aug_hard),
            args.ncrops,
        ),
    )
    

    error log: apply_transform() takes from 2 to 3 positional arguments but 5 were given

    opened by AItechnology 5
  • Cannot load state dict for model

    Cannot load state dict for model

    Thanks for your amazing work. But I encounter a problem, when I use checkpoint_0009.pth.tar checkpoint,

    • When I don't remove model = nn.DataParallel(model), I encouter error:
            size mismatch for module.backbone.bn1.weight: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is 
    torch.Size([64]).
            size mismatch for module.backbone.bn1.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([64]).
            size mismatch for module.backbone.bn1.running_mean: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([64]).
            size mismatch for module.backbone.bn1.running_var: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([64]).
            size mismatch for module.fc.weight: copying a param with shape torch.Size([256, 512]) from checkpoint, the shape in current model is torch.Size([256, 2048])
    
    • Then I remove line model = nn.DataParallel(model), the model seems to load checkpoint successfully, but I feed same input to model, the output feature vector if different for different time I run. I guess the model is not loaded successfully when load state dict, so model will use the weight initialized randomly.
    • Then I change strict=True in model.load_state_dict(state_dict=state_dict, strict=False), I encounter error RuntimeError: Error(s) in loading state_dict for ISCNet: Missing key(s) in state_dict:, I found that the key of state_dict in model and checkpoint totally diffrent even name pattern. Key of model state dict and checkpoint state dict I attached below. checkpoint.txt model.txt How can I solve the this problem?
    opened by NguyenThanhAI 2
  • Unable to reproduce Stage 1 results

    Unable to reproduce Stage 1 results

    Hi, I attempted to reproduce the Stage 1 training using your provided code, but was unable to obtain the reported muAP of 0.5831. I instead obtained this result at epoch 9 (indexed from 0):

    Average Precision: 0.49554
    Recall at P90    : 0.32701
    Threshold at P90 : -0.375733
    Recall at rank 1:  0.62448
    Recall at rank 10: 0.65961
    

    I also saw that you continued training from epoch 5, but these are the results I obtained at epoch 5:

    Average Precision: 0.47977
    Recall at P90    : 0.32501
    Threshold at P90 : -0.376619
    Recall at rank 1:  0.61409
    Recall at rank 10: 0.64903
    

    Both sets of results were obtained on the private ground truth set of Phase 1, using image size 512. Is it possible to provide some insight as to what is happening here? Thank you.

    opened by avrilwongaw 1
  • about the train output feature

    about the train output feature

    sorry to bother you again. I want train the model with a small backbone such as resnet50. Because I only have three GPU and I run with command:

    CUDA_VISIBLE_DEVICES=0,1,2 python v83.py  --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 9 \
      --epochs 5 --lr 0.1 --wd 1e-6 --batch-size 96 --ncrops 2 \
      --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.0 \
      --input-size 256 --sample-size 1000000 --memory-size 20000 \
    /root/zhx3/data/fb_train_data/train
    

    I find a strange problem. I test checkpoint_000{0..4}.pth.tar model. only the checkpoint_0002.pth.tar ouput different when the input is different. I mean other model will output same embedding no matter what different you input. thanks in advance. the loss log output such as:

    epoch 5:   0%|          | 0/15873 [00:00<?, ?it/s]=> loading checkpoint './v83/train/checkpoint_0004.pth.tar'
    => loaded checkpoint './v83/train/checkpoint_0004.pth.tar' (epoch 5)
    epoch 6:   0%|          | 0/15873 [00:00<?, ?it/s]epoch=5, loss=1.0154363534772417
    epoch 7:   0%|          | 0/15873 [00:00<?, ?it/s]epoch=6, loss=1.012835873522891
    
    opened by Usernamezhx 1
  • about the memory size

    about the memory size

    python v107.py \
      -a tf_efficientnetv2_m_in21ft1k --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --seed 99999 \
      --epochs 10 --lr 0.5 --wd 1e-6 \
      --gem-p 1.0 --pos-margin 0.0 --neg-margin 1.1 --weight ./v98/train/checkpoint_0001.pth.tar \
      --input-size 512 --sample-size 1000000 --memory-size 1000 \
      ../input/training_images/
    

    why not set the --memory-size large such as 20000 ? thanks in advance

    opened by Usernamezhx 1
  • will v107 overfit for phase2?

    will v107 overfit for phase2?

    Congratulations and thanks for your sharing.

    i find v107 only use the about 5k query-ref pair (i.e. gt in phase1) as positive. How to know whether it overfits for phase2 ?

    opened by liangzimei 1
  • access denied for dataset on aws

    access denied for dataset on aws

    Thanks for you work! I have problems downloading the dataset from the given aws buckets

    $ aws s3 cp s3://drivendata-competition-fb-isc-data/all/query_images/ input/query_images/ --recursive --no-sign-request
    fatal error: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
    

    Do I need special permissions to download the data?

    opened by sebastianlutter 0
  • Final optimizer state for the model

    Final optimizer state for the model

    Hello @lyakaap

    Thanks a lot for this work. I am trying to take this and finetune over a certain task. Is it possible you can provide the state of final optimizer after 4th stage of training. We want to try an experiment where it will be very useful.

    Thank you.

    opened by shubhamjain0594 11
Owner
lyakaap
Computer Vision, Deep Learning
lyakaap
A computational optimization project towards the goal of gerrymandering the results of a hypothetical election in the UK.

A computational optimization project towards the goal of gerrymandering the results of a hypothetical election in the UK.

Emma 1 Jan 18, 2022
This is a Keras implementation of a CNN for estimating age, gender and mask from a camera.

face-detector-age-gender This is a Keras implementation of a CNN for estimating age, gender and mask from a camera. Before run face detector app, expr

Devdreamsolution 2 Dec 04, 2021
EfficientMPC - Efficient Model Predictive Control Implementation

efficientMPC Efficient Model Predictive Control Implementation The original algo

Vin 8 Dec 04, 2022
Numerical differential equation solvers in JAX. Autodifferentiable and GPU-capable.

Diffrax Numerical differential equation solvers in JAX. Autodifferentiable and GPU-capable. Diffrax is a JAX-based library providing numerical differe

Patrick Kidger 717 Jan 09, 2023
Binary Stochastic Neurons in PyTorch

Binary Stochastic Neurons in PyTorch http://r2rt.com/binary-stochastic-neurons-in-tensorflow.html https://github.com/pytorch/examples/tree/master/mnis

Onur Kaplan 54 Nov 21, 2022
In this project we investigate the performance of the SetCon model on realistic video footage. Therefore, we implemented the model in PyTorch and tested the model on two example videos.

Contrastive Learning of Object Representations Supervisor: Prof. Dr. Gemma Roig Institutions: Goethe University CVAI - Computational Vision & Artifici

Dirk Neuhäuser 6 Dec 08, 2022
Code for ICCV2021 paper PARE: Part Attention Regressor for 3D Human Body Estimation

PARE: Part Attention Regressor for 3D Human Body Estimation [ICCV 2021] PARE: Part Attention Regressor for 3D Human Body Estimation, Muhammed Kocabas,

Muhammed Kocabas 277 Jan 03, 2023
The PyTorch re-implement of a 3D CNN Tracker to extract coronary artery centerlines with state-of-the-art (SOTA) performance. (paper: 'Coronary artery centerline extraction in cardiac CT angiography using a CNN-based orientation classifier')

The PyTorch re-implement of a 3D CNN Tracker to extract coronary artery centerlines with state-of-the-art (SOTA) performance. (paper: 'Coronary artery centerline extraction in cardiac CT angiography

James 135 Dec 23, 2022
Library for machine learning stacking generalization.

stacked_generalization Implemented machine learning *stacking technic[1]* as handy library in Python. Feature weighted linear stacking is also availab

114 Jul 19, 2022
Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification

Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification This repository is the official implementation of [Dealing With Misspeci

0 Oct 25, 2021
Implements pytorch code for the Accelerated SGD algorithm.

AccSGD This is the code associated with Accelerated SGD algorithm used in the paper On the insufficiency of existing momentum schemes for Stochastic O

205 Jan 02, 2023
Bayesian Meta-Learning Through Variational Gaussian Processes

vmgp This is the repository of Vivek Myers and Nikhil Sardana for our CS 330 final project, Bayesian Meta-Learning Through Variational Gaussian Proces

Vivek Myers 2 Nov 17, 2022
EdMIPS: Rethinking Differentiable Search for Mixed-Precision Neural Networks

EdMIPS is an efficient algorithm to search the optimal mixed-precision neural network directly without proxy task on ImageNet given computation budgets. It can be applied to many popular network arch

Zhaowei Cai 47 Dec 30, 2022
Code for the paper SphereRPN: Learning Spheres for High-Quality Region Proposals on 3D Point Clouds Object Detection, ICIP 2021.

SphereRPN Code for the paper SphereRPN: Learning Spheres for High-Quality Region Proposals on 3D Point Clouds Object Detection, ICIP 2021. Authors: Th

Thang Vu 15 Dec 02, 2022
Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery"

SegSwap Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery" [PDF] [Project page] If our project

xshen 41 Dec 10, 2022
A Data Annotation Tool for Semantic Segmentation, Object Detection and Lane Line Detection.(In Development Stage)

Data-Annotation-Tool How to Run this Tool? To run this software, follow the steps: git clone https://github.com/Autonomous-Car-Project/Data-Annotation

TiVRA AI 13 Aug 18, 2022
Source code for Fathony, Sahu, Willmott, & Kolter, "Multiplicative Filter Networks", ICLR 2021.

Multiplicative Filter Networks This repository contains a PyTorch MFN implementation and code to perform & reproduce experiments from the ICLR 2021 pa

Bosch Research 66 Jan 04, 2023
Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch

Omninet - Pytorch Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch. The authors propose that we should be atte

Phil Wang 48 Nov 21, 2022
Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer"

Transformer-vocabulary-transfer Implementation of the paper "Fine-Tuning Transfo

LEYA 13 Nov 30, 2022
REGTR: End-to-end Point Cloud Correspondences with Transformers

REGTR: End-to-end Point Cloud Correspondences with Transformers This repository contains the source code for REGTR. REGTR utilizes multiple transforme

Zi Jian Yew 108 Dec 17, 2022