SphereFace: Deep Hypersphere Embedding for Face Recognition

Overview

SphereFace: Deep Hypersphere Embedding for Face Recognition

By Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj and Le Song

License

SphereFace is released under the MIT License (refer to the LICENSE file for details).

Update

  • 2018.8.14: We recommand an interesting ECCV 2018 paper that comprehensively evaluates SphereFace (A-Softmax) on current widely used face datasets and their proposed noise-controlled IMDb-Face dataset. Interested users can try to train SphereFace on their IMDb-Face dataset. Take a look here.

  • 2018.5.23: A new SphereFace+ that explicitly enhances the inter-class separability has been introduced in our technical report. Check it out here. Code is released here.

  • 2018.2.1: As requested, the prototxt files for SphereFace-64 are released.

  • 2018.1.27: We updated the appendix of our SphereFace paper with useful experiments and analysis. Take a look here. The content contains:

    • The intuition of removing the last ReLU;
    • Why do we want to normalize the weights other than because we need more geometric interpretation?
    • Empirical experiment of zeroing out the biases;
    • More 2D visualization of A-Softmax loss on MNIST;
    • Angular Fisher score for evaluating the angular feature discriminativeness, which is a new and straightforward evluation metric other than the final accuracy.
    • Experiments of SphereFace on MegaFace with different convolutional layers;
    • The annealing optimization strategy for A-Softmax loss;
    • Details of the 3-patch ensemble strategy in MegaFace challenge;
  • 2018.1.20: We updated some resources to summarize the current advances in angular margin learning. Take a look here.

Contents

  1. Introduction
  2. Citation
  3. Requirements
  4. Installation
  5. Usage
  6. Models
  7. Results
  8. Video Demo
  9. Note
  10. Third-party re-implementation
  11. Resources for angular margin learning

Introduction

The repository contains the entire pipeline (including all the preprocessings) for deep face recognition with SphereFace. The recognition pipeline contains three major steps: face detection, face alignment and face recognition.

SphereFace is a recently proposed face recognition method. It was initially described in an arXiv technical report and then published in CVPR 2017. The most up-to-date paper with more experiments can be found at arXiv or here. To facilitate the face recognition research, we give an example of training on CAISA-WebFace and testing on LFW using the 20-layer CNN architecture described in the paper (i.e. SphereFace-20).

In SphereFace, our network architecures use residual units as building blocks, but are quite different from the standrad ResNets (e.g., BatchNorm is not used, the prelu replaces the relu, different initializations, etc). We proposed 4-layer, 20-layer, 36-layer and 64-layer architectures for face recognition (details can be found in the paper and prototxt files). We provided the 20-layer architecure as an example here. If our proposed architectures also help your research, please consider to cite our paper.

SphereFace achieves the state-of-the-art verification performance (previously No.1) in MegaFace Challenge under the small training set protocol.

Citation

If you find SphereFace useful in your research, please consider to cite:

@InProceedings{Liu_2017_CVPR,
  title = {SphereFace: Deep Hypersphere Embedding for Face Recognition},
  author = {Liu, Weiyang and Wen, Yandong and Yu, Zhiding and Li, Ming and Raj, Bhiksha and Song, Le},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2017}
}

Our another closely-related previous work in ICML'16 (more):

@InProceedings{Liu_2016_ICML,
  title = {Large-Margin Softmax Loss for Convolutional Neural Networks},
  author = {Liu, Weiyang and Wen, Yandong and Yu, Zhiding and Yang, Meng},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  year = {2016}
}

Requirements

  1. Requirements for Matlab
  2. Requirements for Caffe and matcaffe (see: Caffe installation instructions)
  3. Requirements for MTCNN (see: MTCNN - face detection & alignment) and Pdollar toolbox (see: Piotr's Image & Video Matlab Toolbox).

Installation

  1. Clone the SphereFace repository. We'll call the directory that you cloned SphereFace as SPHEREFACE_ROOT.

    git clone --recursive https://github.com/wy1iu/sphereface.git
  2. Build Caffe and matcaffe

    cd $SPHEREFACE_ROOT/tools/caffe-sphereface
    # Now follow the Caffe installation instructions here:
    # http://caffe.berkeleyvision.org/installation.html
    make all -j8 && make matcaffe

Usage

After successfully completing the installation, you are ready to run all the following experiments.

Part 1: Preprocessing

Note: In this part, we assume you are in the directory $SPHEREFACE_ROOT/preprocess/

  1. Download the training set (CASIA-WebFace) and test set (LFW) and place them in data/.

    mv /your_path/CASIA_WebFace  data/
    ./code/get_lfw.sh
    tar xvf data/lfw.tgz -C data/

    Please make sure that the directory of data/ contains two datasets.

  2. Detect faces and facial landmarks in CAISA-WebFace and LFW datasets using MTCNN (see: MTCNN - face detection & alignment).

    # In Matlab Command Window
    run code/face_detect_demo.m

    This will create a file dataList.mat in the directory of result/.

  3. Align faces to a canonical pose using similarity transformation.

    # In Matlab Command Window
    run code/face_align_demo.m

    This will create two folders (CASIA-WebFace-112X96/ and lfw-112X96/) in the directory of result/, containing the aligned face images.

Part 2: Train

Note: In this part, we assume you are in the directory $SPHEREFACE_ROOT/train/

  1. Get a list of training images and labels.

    mv ../preprocess/result/CASIA-WebFace-112X96 data/
    # In Matlab Command Window
    run code/get_list.m
    

    The aligned face images in folder CASIA-WebFace-112X96/ are moved from preprocess folder to train folder. A list CASIA-WebFace-112X96.txt is created in the directory of data/ for the subsequent training.

  2. Train the sphereface model.

    ./code/sphereface_train.sh 0,1

    After training, a model sphereface_model_iter_28000.caffemodel and a corresponding log file sphereface_train.log are placed in the directory of result/sphereface/.

Part 3: Test

Note: In this part, we assume you are in the directory $SPHEREFACE_ROOT/test/

  1. Get the pair list of LFW (view 2).

    mv ../preprocess/result/lfw-112X96 data/
    ./code/get_pairs.sh

    Make sure that the LFW dataset andpairs.txt in the directory of data/

  2. Extract deep features and test on LFW.

    # In Matlab Command Window
    run code/evaluation.m

    Finally we have the sphereface_model.caffemodel, extracted features pairs.mat in folder result/, and accuracy on LFW like this:

    fold 1 2 3 4 5 6 7 8 9 10 AVE
    ACC 99.33% 99.17% 98.83% 99.50% 99.17% 99.83% 99.17% 98.83% 99.83% 99.33% 99.30%

Models

  1. Visualizations of network architecture (tools from ethereon):
    • SphereFace-20: link
  2. Model file

Results

  1. Following the instruction, we go through the entire pipeline for 5 times. The accuracies on LFW are shown below. Generally, we report the average but we release the model-3 here.

    Experiment #1 #2 #3 (released) #4 #5
    ACC 99.24% 99.20% 99.30% 99.27% 99.13%
  2. Other intermediate results:

Video Demo

SphereFace Demo

Please click the image to watch the Youtube video. For Youku users, click here.

Details:

  1. It is an open-set face recognition scenario. The video is processed frame by frame, following the same pipeline in this repository.
  2. Gallery set consists of 6 identities. Each main character has only 1 gallery face image. All the detected faces are included in probe set.
  3. There is no overlap between gallery set and training set (CASIA-WebFace).
  4. The scores between each probe face and gallery set are computed by cosine similarity. If the maximal score of a probe face is smaller than a pre-definded threshold, the probe face would be considered as an outlier.
  5. Main characters are labeled by boxes with different colors. ( #ff0000Rachel, #ffff00Monica, #ff80ffPhoebe, #00ffffJoey, #0000ffChandler, #00ff00Ross)

Note

  1. Backward gradient

    • In this implementation, we did not strictly follow the equations in paper. Instead, we normalize the scale of gradient. It can be interpreted as a varying strategy for learning rate to help converge more stably. Similar idea and intuition also appear in normalized gradients and projected gradient descent.
    • More specifically, if the original gradient of f w.r.t x can be written as df/dx = coeff_w * w + coeff_x * x, we use the normalized version [df/dx] = (coeff_w * w + coeff_x * x) / norm_wx to perform backward propragation, where norm_wx is sqrt(coeff_w^2 + coeff_x^2). The same operation is also applied to the gradient of f w.r.t w.
    • In fact, you do not necessarily need to use the original gradient, since the original gradient sometimes is not an optimal design. One important criterion for modifying the backprop gradient is that the new "gradient" (strictly speaking, it is not a gradient anymore) need to make the objective value decrease stably and consistently. (In terms of some failure cases for gradient-based back-prop, I recommand a great talk by Shai Shalev-Shwartz)
    • If you use the original gradient to do the backprop, you could still make it work but may need different lambda settings, iteration number and learning rate decay strategy.
  2. Lambda and Note for training (When the loss becomes 87)

  3. According to recent advances, using feature normalization with a tunable scaling parameter s can significantly improve the performance of SphereFace on MegaFace challenge

  4. Difficulties in convergence - When you encounter difficulties in convergence (it may appear if you use SphereFace in another dataset), usually there are a few easy ways to address it.

    • First, try to use large mini-batch size.
    • Second, try to use PReLU instead of ReLU.
    • Third, increase the width and depth of our network.
    • Fourth, try to use better initialization. For example, use the pretrained model from the original softmax loss (it is also equivalent to finetuning).
    • Last and the most effective thing you could try is to change the hyper-parameters for lambda_min, lambda and its decay speed.

Third-party re-implementation

Resources for angular margin learning

L-Softmax loss and SphereFace present a promising framework for angular representation learning, which is shown very effective in deep face recognition. We are super excited that our works has inspired many well-performing methods (and loss functions). We list a few of them for your potential reference (not very up-to-date):

To evaluate the effectiveness of the angular margin learning method, you may consider to use the angular Fisher score proposed in the Appendix E of our SphereFace Paper.

Disclaimer: Some of these methods may not necessarily be inspired by us, but we still list them due to its relevance and excellence.

Contact

Weiyang Liu and Yandong Wen

Questions can also be left as issues in the repository. We will be happy to answer them.

Owner
Weiyang Liu
Weiyang Liu
TransMorph: Transformer for Medical Image Registration

TransMorph: Transformer for Medical Image Registration keywords: Vision Transformer, Swin Transformer, convolutional neural networks, image registrati

Junyu Chen 180 Jan 07, 2023
CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation

CoTr: Efficient 3D Medical Image Segmentation by bridging CNN and Transformer This is the official pytorch implementation of the CoTr: Paper: CoTr: Ef

218 Dec 25, 2022
Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference

Self-Supervised Document Similarity Ranking (SDR) via Contextualized Language Models and Hierarchical Inference This repo is the implementation for SD

Microsoft 36 Nov 28, 2022
Simple torch.nn.module implementation of Alias-Free-GAN style filter and resample

Alias-Free-Torch Simple torch module implementation of Alias-Free GAN. This repository including Alias-Free GAN style lowpass sinc filter @filter.py A

이준혁(Junhyeok Lee) 64 Dec 22, 2022
Source code for the paper: Variance-Aware Machine Translation Test Sets (NeurIPS 2021 Datasets and Benchmarks Track)

Variance-Aware-MT-Test-Sets Variance-Aware Machine Translation Test Sets License See LICENSE. We follow the data licensing plan as the same as the WMT

NLP2CT Lab, University of Macau 5 Dec 21, 2021
A collection of awesome resources image-to-image translation.

awesome image-to-image translation A collection of resources on image-to-image translation. Contributing If you think I have missed out on something (

876 Dec 28, 2022
"3D Human Texture Estimation from a Single Image with Transformers", ICCV 2021

Texformer: 3D Human Texture Estimation from a Single Image with Transformers This is the official implementation of "3D Human Texture Estimation from

XiangyuXu 193 Dec 05, 2022
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

UnivNet UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation. Training python train.py --c

Rishikesh (ऋषिकेश) 55 Dec 26, 2022
AI assistant built in python.the features are it can display time,say weather,open-google,youtube,instagram.

AI assistant built in python.the features are it can display time,say weather,open-google,youtube,instagram.

AK-Shanmugananthan 1 Nov 29, 2021
Zero-Cost Proxies for Lightweight NAS

Zero-Cost-NAS Companion code for the ICLR2021 paper: Zero-Cost Proxies for Lightweight NAS tl;dr A single minibatch of data is used to score neural ne

SamsungLabs 108 Dec 20, 2022
Code for "SRHEN: Stepwise-Refining Homography Estimation Network via Parsing Geometric Correspondences in Deep Latent Space"

SRHEN This is a better and simpler implementation for "SRHEN: Stepwise-Refining Homography Estimation Network via Parsing Geometric Correspondences in

1 Oct 28, 2022
Understanding Convolution for Semantic Segmentation

TuSimple-DUC by Panqu Wang, Pengfei Chen, Ye Yuan, Ding Liu, Zehua Huang, Xiaodi Hou, and Garrison Cottrell. Introduction This repository is for Under

TuSimple 585 Dec 31, 2022
Monitora la qualità della ricezione dei segnali radio nelle province siciliane.

FMap-server Monitora la qualità della ricezione dei segnali radio nelle province siciliane. Conversion data Frequency - StationName maps are stored in

Triglie 5 May 24, 2021
Semi-Supervised Learning, Object Detection, ICCV2021

End-to-End Semi-Supervised Object Detection with Soft Teacher By Mengde Xu*, Zheng Zhang*, Han Hu, Jianfeng Wang, Lijuan Wang, Fangyun Wei, Xiang Bai,

Microsoft 789 Dec 27, 2022
[CVPR'22] Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast

wseg Overview The Pytorch implementation of Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast. [arXiv] Though image-level weakly

Ye Du 96 Dec 30, 2022
LeafSnap replicated using deep neural networks to test accuracy compared to traditional computer vision methods.

Deep-Leafsnap Convolutional Neural Networks have become largely popular in image tasks such as image classification recently largely due to to Krizhev

Sujith Vishwajith 48 Nov 27, 2022
Free course that takes you from zero to Reinforcement Learning PRO 🦸🏻‍🦸🏽

The Hands-on Reinforcement Learning course 🚀 From zero to HERO 🦸🏻‍🦸🏽 Out of intense complexities, intense simplicities emerge. -- Winston Churchi

Pau Labarta Bajo 260 Dec 28, 2022
Customer-Transaction-Analysis - This analysis is based on a synthesised transaction dataset containing 3 months worth of transactions for 100 hypothetical customers.

Customer-Transaction-Analysis - This analysis is based on a synthesised transaction dataset containing 3 months worth of transactions for 100 hypothetical customers. It contains purchases, recurring

Ayodeji Yekeen 1 Jan 01, 2022
Official PyTorch implementation of MX-Font (Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts)

Introduction Pytorch implementation of Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Expert. | paper Song Park1

Clova AI Research 97 Dec 23, 2022
EMNLP 2021 paper Models and Datasets for Cross-Lingual Summarisation.

This repository contains data and code for our EMNLP 2021 paper Models and Datasets for Cross-Lingual Summarisation. Please contact me at

9 Oct 28, 2022