Code for CVPR2019 paper《Unequal Training for Deep Face Recognition with Long Tailed Noisy Data》

Overview

Unequal-Training-for-Deep-Face-Recognition-with-Long-Tailed-Noisy-Data.

This is the code of CVPR 2019 paper《Unequal Training for Deep Face Recognition with Long Tailed Noisy Data》.

arch

Usage Instructions

  1. The code is adopted from InsightFace. I sincerely appreciate for their contributions.

  2. Our method need two stage training, therefore the code is also stepwise. I will be happy if my humble code would help you. If there are questions or issues, please let me know.

Note:

  1. Our method is appropriate for the noisy data with long-tailed distribution such as MF2 training dataset. When the training data is good, like MS1M and VGGFace2, InsightFace is more suitable.

  2. We use the last arcface model (best performance) to find the third type noise. Next we drop the fc weight of the last arcface model, then finetune from it using NR loss (adding a reweight term by putting more confidence in the prediction of the training model).

  3. The second stage training process need very careful manual tuning. We provide our training log for reference.

Prepare the code and the data.

  1. Install MXNet with GPU support (Python 2.7).
pip install mxnet-cu90
  1. download the code as unequal_code/
git clone https://github.com/zhongyy/Unequal-Training-for-Deep-Face-Recognition-with-Long-Tailed-Noisy-Data.git
  1. download the MF2 training dataset(password: w9y5) and the evaluation dataset, then place them in unequal_code/MF2_pic9_head/ unequal_code/MF2_pic9_tail/ and unequal_code/eval_dataset/ respectively.

step 1: Pretrain MF2_pic9_head with ArcFace.

End it when the acc of validation dataset (lfw,cfp-fp and agedb-30) does not ascend.

CUDA_VISIBLE_DEVICES='0,1' python -u train_softmax.py --network r50 --loss-type 4  --margin-m 0.5 --data-dir ./MF2_pic9_head/ --end-epoch 40 --per-batch-size 100 --prefix ../models/r50_arc_pic9/model 2>&1|tee r50_arc_pic9.log

step 2: Train the head data with NRA (finetune from step 1).

  1. Once the model_t,0 is saved, end it.
CUDA_VISIBLE_DEVICES='0,1' python -u train_NR_savemodel.py --network r50 --loss-type 4 --margin-m 0.5 --data-dir ./MF2_pic9_head/ --end-epoch 1 --lr 0.01  --per-batch-size 100 --noise-beta 0.9 --prefix ../models/NRA_r50pic9/model_t --bin-dir ./src/ --pretrained ../models/r50_arc_pic9/model,xx 2>&1|tee NRA_r50pic9_savemodel.log
  1. End it when the acc of validation dataset(lfw, cfp-fp and agedb-30) does not ascend.
CUDA_VISIBLE_DEVICES='0,1' python -u train_NR.py --network r50 --loss-type 4 --margin-m 0.5 --data-dir ./MF2_pic9_head/ --lr 0.01 --lr-steps 50000,90000 --per-batch-size 100 --noise-beta 0.9 --prefix ../models/NRA_r50pic9/model --bin-dir ./src/ --pretrained ../models/NRA_r50pic9/model_t,0 2>&1|tee NRA_r50pic9.log

step 3:

  1. Generate the denoised head data using ./MF2_pic9_head/train.lst and 0_noiselist.txt which has been generated in step 2. (We provide our denoised version(password: w9y5)

  2. Using the denoised head data (have removed the third type noise) and the tail data to continue the second stage training. It's noting that the training process need finetune manually by increase the --interweight gradually. When you change the interweight, you also need change the pretrained model by yourself, because we could not know which is the best model in the last training stage unless we test the model on the target dataset (MF2 test). We always finetune from the best model in the last training stage.

CUDA_VISIBLE_DEVICES='0,1,2,3,4,5,6,7' python -u train_debug_soft_gs.py --network r50 --loss-type 4 --data-dir ./MF2_pic9_head_denoise/ --data-dir-interclass ./MF2_pic9_tail/ --end-epoch 100000 --lr 0.001 --interweight 1 --bag-size 3600 --batch-size1 360 --batchsize_id 360 --batch-size2 40  --pretrained /home/zhongyaoyao/insightface/models/NRA_r50pic9/model,xx --prefix ../models/model_all/model 2>&1|tee all_r50.log
CUDA_VISIBLE_DEVICES='0,1,2,3,4,5,6,7' python -u train_debug_soft_gs.py --network r50 --loss-type 4 --data-dir ./MF2_pic9_head_denoise/ --data-dir-interclass ./MF2_pic9_tail/ --end-epoch 100000 --lr 0.001 --interweight 5 --bag-size 3600 --batch-size1 360 --batchsize_id 360 --batch-size2 40  --pretrained ../models/model_all/model,xx --prefix ../models/model_all/model_s2 2>&1|tee all_r50_s2.log
Owner
Zhong Yaoyao
PhD student in BUPT
Zhong Yaoyao
PyTorch code for ICLR 2021 paper Unbiased Teacher for Semi-Supervised Object Detection

Unbiased Teacher for Semi-Supervised Object Detection This is the PyTorch implementation of our paper: Unbiased Teacher for Semi-Supervised Object Detection

Facebook Research 366 Dec 28, 2022
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning

Weakly Supervised Scene Text Detection using Deep Reinforcement Learning This repository contains the setup for all experiments performed in our Paper

Emanuel Metzenthin 3 Dec 16, 2022
Kaggle-titanic - A tutorial for Kaggle's Titanic: Machine Learning from Disaster competition. Demonstrates basic data munging, analysis, and visualization techniques. Shows examples of supervised machine learning techniques.

Kaggle-titanic This is a tutorial in an IPython Notebook for the Kaggle competition, Titanic Machine Learning From Disaster. The goal of this reposito

Andrew Conti 800 Dec 15, 2022
Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity

[ICLR 2022] Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity by Shiwei Liu, Tianlong Chen, Zahra Atashgahi, Xiaohan Chen, Ghada Sokar, Elen

VITA 18 Dec 31, 2022
Python scripts using the Mediapipe models for Halloween.

Mediapipe-Halloween-Examples Python scripts using the Mediapipe models for Halloween. WHY Mainly for fun. But this repository also includes useful exa

Ibai Gorordo 23 Jan 06, 2023
Benchmarks for Object Detection in Aerial Images

Benchmarks for Object Detection in Aerial Images

Jian Ding 691 Dec 30, 2022
Flexible-CLmser: Regularized Feedback Connections for Biomedical Image Segmentation

Flexible-CLmser: Regularized Feedback Connections for Biomedical Image Segmentation The skip connections in U-Net pass features from the levels of enc

Boheng Cao 1 Dec 29, 2021
B2EA: An Evolutionary Algorithm Assisted by Two Bayesian Optimization Modules for Neural Architecture Search

B2EA: An Evolutionary Algorithm Assisted by Two Bayesian Optimization Modules for Neural Architecture Search This is the offical implementation of the

SNU ADSL 0 Feb 07, 2022
Accommodating supervised learning algorithms for the historical prices of the world's favorite cryptocurrency and boosting it through LightGBM.

Accommodating supervised learning algorithms for the historical prices of the world's favorite cryptocurrency and boosting it through LightGBM.

1 Nov 27, 2021
MRQy is a quality assurance and checking tool for quantitative assessment of magnetic resonance imaging (MRI) data.

Front-end View Backend View Table of Contents Description Prerequisites Running Basic Information Measurements User Interface Feedback and usage Descr

Center for Computational Imaging and Personalized Diagnostics 58 Dec 02, 2022
Jingju baseline - A baseline model of our project of Beijing opera script generation

Jingju Baseline It is a baseline of our project about Beijing opera script gener

midon 1 Jan 14, 2022
Unofficial Pytorch Implementation of WaveGrad2

WaveGrad 2 — Unofficial PyTorch Implementation WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis Unofficial PyTorch+Lightning Implementati

MINDs Lab 104 Nov 29, 2022
Decentralized Reinforcment Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)

Decentralized Reinforcement Learning This is the code complementing the paper Decentralized Reinforcment Learning: Global Decision-Making via Local Ec

40 Oct 30, 2022
Phonetic PosteriorGram (PPG)-Based Voice Conversion (VC)

ppg-vc Phonetic PosteriorGram (PPG)-Based Voice Conversion (VC) This repo implements different kinds of PPG-based VC models. Pretrained models. More m

Liu Songxiang 227 Dec 28, 2022
Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation

Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation

Bae, Gwangbin 95 Jan 04, 2023
A repository for benchmarking neural vocoders by their quality and speed.

License The majority of VocBench is licensed under CC-BY-NC, however portions of the project are available under separate license terms: Wavenet, Para

Meta Research 177 Dec 12, 2022
Open source code for Paper "A Co-Interactive Transformer for Joint Slot Filling and Intent Detection"

A Co-Interactive Transformer for Joint Slot Filling and Intent Detection This repository contains the PyTorch implementation of the paper: A Co-Intera

67 Dec 05, 2022
Deep Dual Consecutive Network for Human Pose Estimation (CVPR2021)

Beanie - is an asynchronous ODM for MongoDB, based on Motor and Pydantic. It uses an abstraction over Pydantic models and Motor collections to work wi

295 Dec 29, 2022
The official implementation of Theme Transformer

Theme Transformer This is the official implementation of Theme Transformer. Checkout our demo and paper : Demo | arXiv Environment: using python versi

Ian Shih 85 Dec 08, 2022
PyTorch implementation of neural style transfer algorithm

neural-style-pt This is a PyTorch implementation of the paper A Neural Algorithm of Artistic Style by Leon A. Gatys, Alexander S. Ecker, and Matthias

770 Jan 02, 2023