Google Landmark Recognition and Retrieval 2021 Solutions

Overview

This repository contains the solution and code for the Google Landmark Recognition 2021 and Google Landmark Retrieval 2021 competitions (both finished in the top 100).

Brief Summary

My solution is based on the latest modeling from the previous competition, plus strong post-processing based on re-ranking and side models such as detectors. I trained on a single RTX 3080 with an EfficientNet-B0 backbone, using only the competition data.

Model and loss function

I used the same model and loss as the winning team of the previous competition as a base. Since I had only a single RTX 3080, I didn't have enough time to experiment with it or change it. The only things I managed to test were Sub-center ArcMarginProduct as the last block of the model and the ArcFaceLossAdaptiveMargin loss function, which the 2nd-place team used the previous year. Both gave me a significant score boost (around 4% on CV and 5% on the LB).
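
For reference, here is a minimal sketch of a Sub-center ArcMarginProduct head; the number of sub-centers (k=3) is an illustrative assumption, not necessarily the value used in this solution:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SubcenterArcMarginProduct(nn.Module):
    """Cosine-similarity head with k sub-centers per class (Sub-center ArcFace)."""
    def __init__(self, in_features: int, num_classes: int, k: int = 3):
        super().__init__()
        self.num_classes = num_classes
        self.k = k
        # One weight row per (class, sub-center) pair.
        self.weight = nn.Parameter(torch.empty(num_classes * k, in_features))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between L2-normalized embeddings and sub-center weights.
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        # Keep only the best-matching sub-center for each class.
        cosine = cosine.view(-1, self.num_classes, self.k).max(dim=-1).values
        return cosine  # logits to be fed into an ArcFace-style margin loss
```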

Setting up the training and validation

Optimizing and scheduling

  • Optimizer: Ranger (lr=0.003)
  • Scheduler: CosineAnnealingLR (T_max=12) + 1-epoch warm-up
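
A minimal sketch of this setup, assuming a Ranger port with the standard torch.optim.Optimizer interface (e.g., the pytorch-ranger package) and a simple linear warm-up over the first epoch:

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR, LambdaLR, SequentialLR
from pytorch_ranger import Ranger  # assumption: any Ranger port with the usual API works

model = torch.nn.Linear(512, 81313)  # placeholder model for illustration

optimizer = Ranger(model.parameters(), lr=0.003)
scheduler = SequentialLR(
    optimizer,
    schedulers=[
        LambdaLR(optimizer, lr_lambda=lambda e: 0.1 + 0.9 * e),  # linear warm-up epoch
        CosineAnnealingLR(optimizer, T_max=12),
    ],
    milestones=[1],  # switch to cosine annealing after the first epoch
)

for epoch in range(15):
    ...  # train one epoch
    scheduler.step()
```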

Training stages

I found the best performance training for 15 epochs in 5 stages:

  1. (1-3) - Resize to image size, Horizontal Flip
  2. (4-6) - Resize to bigger image size, Random Crop to image size, Horizontal Flip
  3. (7-9) - Resize to bigger image size, Random Crop to image size, Horizontal Flip, Coarse Dropout with one big square (CutMix)
  4. (10-12) - Resize to bigger image size, Random Crop to image size, Horizontal Flip, FMix, CutMix, MixUp
  5. (13-15) - Resize to bigger image size, Random Crop to image size, Horizontal Flip

I used the default normalization for all epochs.
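
As an illustration, here is a sketch of the stage-2 pipeline using albumentations; the image sizes (512 → 448) are placeholder assumptions, not the exact values from the solution:

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2

stage2_transforms = A.Compose([
    A.Resize(512, 512),        # resize to the bigger image size
    A.RandomCrop(448, 448),    # random-crop back to the training size
    A.HorizontalFlip(p=0.5),
    A.Normalize(),             # default ImageNet mean/std
    ToTensorV2(),
])
```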

Validation scheme

Since I didn't have enough hardware, this was my first competition where I couldn't use K-fold validation. However, CV had been stable and well correlated with the LB in previous competitions, so I used a simple stratified train-test split with a 0.8/0.2 ratio. I also oversampled every class up to 5 samples.
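
A minimal sketch of this scheme, assuming a pandas DataFrame df with a landmark_id column (the column name follows the competition's metadata format):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Stratified 0.8/0.2 split (note: stratification requires >= 2 samples per class).
train_df, valid_df = train_test_split(
    df, test_size=0.2, stratify=df["landmark_id"], random_state=42
)

# Oversample rare classes so every class has at least 5 training samples.
counts = train_df["landmark_id"].value_counts()
extra = []
for landmark_id, n in counts[counts < 5].items():
    rows = train_df[train_df["landmark_id"] == landmark_id]
    extra.append(rows.sample(5 - n, replace=True, random_state=42))
train_df = pd.concat([train_df] + extra, ignore_index=True)
```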

Inference and Post-Processing

  1. Change the predicted class to non-landmark if it was predicted more than 20 times.
  2. Use a pretrained YOLOv5 to detect non-landmark images. All classes are used; boxes with confidence < 0.5 are dropped. If the total area of the boxes is greater than total_image_area / 2.7, the sample is marked as non-landmark (see the sketch after this list). I tried to use YOLOv5 to clean the train dataset as well, but it only decreased the score.
  3. Tuned post-processing from this paper, which re-ranks predictions based on the cosine similarity of train and test images to non-landmark ones.
  4. A higher image size for extracting embeddings at inference time.
  5. Using the public train dataset as external data for extracting embeddings.
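
A minimal sketch of the detector-based filter from step 2, assuming the standard ultralytics/yolov5 torch.hub entry point; the thresholds (0.5 and 2.7) are the ones given above:

```python
import torch
from PIL import Image

# Load a pretrained YOLOv5 model via torch.hub (pretrained on COCO).
model = torch.hub.load("ultralytics/yolov5", "yolov5s")

def is_non_landmark(image_path: str, conf_thr: float = 0.5, area_div: float = 2.7) -> bool:
    w, h = Image.open(image_path).size
    det = model(image_path).xyxy[0]      # (n, 6): x1, y1, x2, y2, conf, cls
    det = det[det[:, 4] >= conf_thr]     # drop low-confidence boxes
    box_area = ((det[:, 2] - det[:, 0]) * (det[:, 3] - det[:, 1])).sum().item()
    # Mark as non-landmark when detected objects cover most of the image.
    return box_area > (w * h) / area_div
```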

Didn't work for me

  • Knowledge Distillation
  • ResNet architectures (on average they were worse than EfficientNets)
  • Adding an external non-landmark class from the 2019 test dataset to training
  • Training a binary non-landmark classifier

Transfer Learning on the full dataset and Label Smoothing would likely be useful here, but I didn't have time to test them.

Owner
Vadim Timakin
17 y.o. Machine Learning Engineer | Kaggle Competitions Expert | ML/DL/CV | PyTorch