img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

Overview

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

License: CC BY-NC 4.0 PWC PWC

Figure 1: We estimate the 6DoF rigid transformation of a 3D face (rendered in silver), aligning it with even the tiniest faces, without face detection or facial landmark localization. Our estimated 3D face locations are rendered by descending distances from the camera, for coherent visualization.

Table of contents

Paper details

Vítor Albiero, Xingyu Chen, Xi Yin, Guan Pang, Tal Hassner, "img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation," arXiv:2012.07791, Dec., 2020

Abstract

We propose real-time, six degrees of freedom (6DoF), 3D face pose estimation without face detection or landmark localization. We observe that estimating the 6DoF rigid transformation of a face is a simpler problem than facial landmark detection, often used for 3D face alignment. In addition, 6DoF offers more information than face bounding box labels. We leverage these observations to make multiple contributions: (a) We describe an easily trained, efficient, Faster R-CNN--based model which regresses 6DoF pose for all faces in the photo, without preliminary face detection. (b) We explain how pose is converted and kept consistent between the input photo and arbitrary crops created while training and evaluating our model. (c) Finally, we show how face poses can replace detection bounding box training labels. Tests on AFLW2000-3D and BIWI show that our method runs at real-time and outperforms state of the art (SotA) face pose estimators. Remarkably, our method also surpasses SotA models of comparable complexity on the WIDER FACE detection benchmark, despite not been optimized on bounding box labels.

Citation

If you use any part of our code or data, please cite our paper.

@article{albiero2020img2pose,
  title={img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation},
  author={Albiero, Vítor and Chen, Xingyu and Yin, Xi and Pang, Guan and Hassner, Tal},
  journal={arXiv preprint arXiv:2012.07791},
  year={2020}
}

Installation

Install dependecies with Python 3.

pip install -r requirements.txt

Install the renderer, which is used to visualize predictions. The renderer implementation is forked from here.

cd Sim3DR
sh build_sim3dr.sh

Training

Prepare WIDER FACE dataset

First, download our annotations as instructed in Annotations.

Download WIDER FACE dataset and extract to datasets/WIDER_Face.

Then, to create the train and validation files (LMDB), run the following scripts.

python3 convert_json_list_to_lmdb.py
--json_list ./annotations/WIDER_train_annotations.txt
--dataset_path ./datasets/WIDER_Face/WIDER_train/images/
--dest ./datasets/lmdb/
-—train

This first script will generate a LMDB dataset, which contains the training images along with annotations. It will also output a pose mean and std deviation files, which will be used for training and testing.

python3 convert_json_list_to_lmdb.py 
--json_list ./annotations/WIDER_val_annotations.txt 
--dataset_path ./datasets/WIDER_Face/WIDER_val/images/ 
--dest ./datasets/lmdb

This second script will create a LMDB containing the validation images along with annotations.

Train

Once the LMDB train/val files are created, to start training simple run the script below.

CUDA_VISIBLE_DEVICES=0 python3 train.py
--pose_mean ./datasets/lmdb/WIDER_train_annotations_pose_mean.npy
--pose_stddev ./datasets/lmdb/WIDER_train_annotations_pose_stddev.npy
--workspace ./workspace/
--train_source ./datasets/lmdb/WIDER_train_annotations.lmdb
--val_source ./datasets/lmdb/WIDER_val_annotations.lmdb
--prefix trial_1
--batch_size 2
--lr_plateau
--early_stop
--random_flip
--random_crop
--max_size 1400

For now, only single GPU training is tested. Distributed training is partially implemented, PRs welcome.

Testing

To evaluate with the pretrained model, download the model from Model Zoo, and extract it to the main folder. It will create a folder called models, which contains the model weights and the pose mean and std dev that was used for training.

If evaluating with own trained model, change the pose mean and standard deviation to the ones trained with.

Visualizing trained model

To visualize a trained model on the WIDER FACE validation set run the notebook visualize_trained_model_predictions.

WIDER FACE dataset evaluation

If you haven't done already, download the WIDER FACE dataset and extract to datasets/WIDER_Face.

python3 evaluation/evaluate_wider.py 
--dataset_path datasets/WIDER_Face/WIDER_val/images/
--dataset_list datasets/WIDER_Face/wider_face_split/wider_face_val_bbx_gt.txt
--pretrained_path models/img2pose_v1.pth
--output_path results/WIDER_FACE/Val/

To check mAP and plot curves, download the eval tools and point to results/WIDER_FACE/Val.

AFLW2000-3D dataset evaluation

Download the AFLW2000-3D dataset and unzip to datasets/AFLW2000.

Run the notebook aflw_2000_3d_evaluation.

BIWI dataset evaluation

Download the BIWI dataset and unzip to datasets/BIWI.

Run the notebook biwi_evaluation.

Testing on your own images

Run the notebook test_own_images.

Output customization

For every face detected, the model outputs by default:

  • Pose: pitch, yaw, roll, horizontal translation, vertical translation, and scale
  • Projected bounding boxes: left, top, right, bottom
  • Face scores: 0 to 1

Since the projected bounding box without expansion ends at the start of the forehead, we provide a way of expanding the forehead invidually, along with default x and y expansion.

To customize the size of the projected bounding boxes, when creating the model change any of the bounding box expansion variables as shown below (a complete example can be seen at visualize_trained_model_predictions).

# how much to expand in width
bbox_x_factor = 1.1
# how much to expand in height
bbox_y_factor = 1.1
# how much to expand in the forehead
expand_forehead = 0.3

img2pose_model = img2poseModel(
    ...,    
    bbox_x_factor=bbox_x_factor,
    bbox_y_factor=bbox_y_factor,
    expand_forehead=expand_forehead,
)

Align faces

To align the detected faces, call the function bellow passing the reference points, the image with the faces to align, and the poses outputted by img2pose. The function will return a list with PIL images containing one aligned face per give pose.

from utils.pose_operations import align_faces

# load reference points
threed_points = np.load("pose_references/reference_3d_5_points_trans.npy")

aligned_faces = align_faces(threed_points, img, poses)

Resources

Model Zoo

Annotations

Data Zoo

License

Check license for license details.

Owner
Vítor Albiero
Vítor Albiero
Code for "ShineOn: Illuminating Design Choices for Practical Video-based Virtual Clothing Try-on", accepted at WACV 2021 Generation of Human Behavior Workshop.

ShineOn: Illuminating Design Choices for Practical Video-based Virtual Clothing Try-on [ Paper ] [ Project Page ] This repository contains the code fo

Andrew Jong 97 Dec 13, 2022
Label Mask for Multi-label Classification

LM-MLC 一种基于完型填空的多标签分类算法 1 前言 本文主要介绍本人在全球人工智能技术创新大赛【赛道一】设计的一种基于完型填空(模板)的多标签分类算法:LM-MLC,该算法拟合能力很强能感知标签关联性,在多个数据集上测试表明该算法与主流算法无显著性差异,在该比赛数据集上的dev效果很好,但是由

52 Nov 20, 2022
Advanced Signal Processing Notebooks and Tutorials

Advanced Digital Signal Processing Notebooks and Tutorials Prof. Dr. -Ing. Gerald Schuller Jupyter Notebooks and Videos: Renato Profeta Applied Media

Guitars.AI 115 Dec 13, 2022
The aim of the game, as in the original one, is to find a specific image from a group of different images of a person's face

GUESS WHO Main Links: [Github] [App] Related Links: [CLIP] [Celeba] The aim of the game, as in the original one, is to find a specific image from a gr

Arnau - DIMAI 3 Jan 04, 2022
Benchmarks for the Optimal Power Flow Problem

Power Grid Lib - Optimal Power Flow This benchmark library is curated and maintained by the IEEE PES Task Force on Benchmarks for Validation of Emergi

A Library of IEEE PES Power Grid Benchmarks 207 Dec 08, 2022
Pytorch implementation of ProjectedGAN

ProjectedGAN-pytorch Pytorch implementation of ProjectedGAN (https://arxiv.org/abs/2111.01007) Note: this repository is still under developement. @InP

Dominic Rampas 17 Dec 14, 2022
PyTorch implementation of EfficientNetV2

[NEW!] Check out our latest work involution accepted to CVPR'21 that introduces a new neural operator, other than convolution and self-attention. PyTo

Duo Li 375 Jan 03, 2023
Improving Transferability of Representations via Augmentation-Aware Self-Supervision

Improving Transferability of Representations via Augmentation-Aware Self-Supervision Accepted to NeurIPS 2021 TL;DR: Learning augmentation-aware infor

hankook 38 Sep 16, 2022
kullanışlı ve işinizi kolaylaştıracak bir araç

Hey merhaba! işte çok sorulan sorularının cevabı ve sorunlarının çözümü; Soru= İçinde var denilen birçok şeyi göremiyorum bunun sebebi nedir? Cevap= B

Sexettin 16 Dec 17, 2022
PlenOctrees: NeRF-SH Training & Conversion

PlenOctrees Official Repo: NeRF-SH training and conversion This repository contains code to train NeRF-SH and to extract the PlenOctree, constituting

Alex Yu 323 Dec 29, 2022
PyZebrascope - an open-source Python platform for brain-wide neural activity imaging in behaving zebrafish

PyZebrascope - an open-source Python platform for brain-wide neural activity imaging in behaving zebrafish

1 May 31, 2022
Code for the paper: Adversarial Machine Learning: Bayesian Perspectives

Code for the paper: Adversarial Machine Learning: Bayesian Perspectives This repository contains code for reproducing the experiments in the ** Advers

Roi Naveiro 2 Nov 11, 2022
TuckER: Tensor Factorization for Knowledge Graph Completion

TuckER: Tensor Factorization for Knowledge Graph Completion This codebase contains PyTorch implementation of the paper: TuckER: Tensor Factorization f

Ivana Balazevic 296 Dec 06, 2022
Code for the paper "Learning-Augmented Algorithms for Online Steiner Tree"

Learning-Augmented Algorithms for Online Steiner Tree This is the code for the paper "Learning-Augmented Algorithms for Online Steiner Tree". Requirem

0 Dec 09, 2021
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

Tensor2Tensor Tensor2Tensor, or T2T for short, is a library of deep learning models and datasets designed to make deep learning more accessible and ac

12.9k Jan 09, 2023
Baseline powergrid model for NY

Baseline-powergrid-model-for-NY Table of Contents About The Project Built With Usage License Contact Acknowledgements About The Project As the urgency

Anderson Energy Lab at Cornell 6 Nov 24, 2022
Time series annotation library.

CrowdCurio Time Series Annotator Library The CrowdCurio Time Series Annotation Library implements classification tasks for time series. Features Suppo

CrowdCurio 51 Sep 15, 2022
Denoising Diffusion Probabilistic Models

Denoising Diffusion Probabilistic Models Jonathan Ho, Ajay Jain, Pieter Abbeel Paper: https://arxiv.org/abs/2006.11239 Website: https://hojonathanho.g

Jonathan Ho 1.5k Jan 08, 2023
A collection of Reinforcement Learning algorithms from Sutton and Barto's book and other research papers implemented in Python.

Reinforcement-Learning-Notebooks A collection of Reinforcement Learning algorithms from Sutton and Barto's book and other research papers implemented

Pulkit Khandelwal 1k Dec 28, 2022
Code for Learning to Segment The Tail (LST)

Learning to Segment the Tail [arXiv] In this repository, we release code for Learning to Segment The Tail (LST). The code is directly modified from th

47 Nov 07, 2022