DECA: Detailed Expression Capture and Animation (SIGGRAPH 2021)

Last update: Jan 02, 2023

Overview

input image, aligned reconstruction, animation with various poses & expressions

This is the official PyTorch implementation of DECA.

DECA reconstructs a 3D head model with detailed facial geometry from a single input image. The resulting 3D head model can be easily animated. Please refer to the arXiv paper for more details.

The main features:

Reconstruction: produces head pose, shape, detailed face geometry, and lighting information from a single image.
Animation: animate the face with realistic wrinkle deformations.
Robustness: tested on facial images in unconstrained conditions. Our method is robust to various poses, illuminations and occlusions.
Accurate: state-of-the-art 3D face shape reconstruction on the NoW Challenge benchmark dataset.

Getting Started

Clone the repo:

git clone https://github.com/YadiraF/DECA
cd DECA

Requirements

Python 3.7 (numpy, skimage, scipy, opencv)
PyTorch >= 1.6 (pytorch3d)
face-alignment (Optional for detecting face)
You can install the dependencies with
```
pip install -r requirements.txt
```
Or set up a virtual environment by running
```
bash install_conda.sh
```
For visualization, we use our own rasterizer, which relies on PyTorch JIT-compiled extensions. If a compilation error occurs, you can install pytorch3d instead and set --rasterizer_type=pytorch3d when running the demos.
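
If you launch the demos from a script, the rasterizer fallback can be wrapped in a small helper. This is a minimal sketch, not part of the repo: `build_demo_command` is a hypothetical name, and only the demo script path and the `--rasterizer_type=pytorch3d` flag are taken from this README.

```python
import sys


def build_demo_command(script, extra_args=(), use_pytorch3d=False):
    """Return the argv list for running a DECA demo script.

    Hypothetical helper (not part of the DECA repo). When use_pytorch3d is
    True, the flag named in this README for the pytorch3d rasterizer is added.
    """
    cmd = [sys.executable, script, *extra_args]
    if use_pytorch3d:
        cmd.append("--rasterizer_type=pytorch3d")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_demo_command(
        "demos/demo_reconstruct.py",
        ["-i", "TestSamples/examples"],
        use_pytorch3d=True,
    )))
```

Pass the returned list to `subprocess.run` if you want to execute the demo, and switch `use_pytorch3d` on only after the default rasterizer fails to compile.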

Usage

Prepare data
a. Download the FLAME model: choose FLAME 2020, unzip it, and copy 'generic_model.pkl' into ./data
b. Download the DECA trained model and put it in ./data (no unzip required)
c. (Optional) Follow the instructions for the Albedo model to get 'FLAME_albedo_from_BFM.npz', and put it into ./data
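
Before running the demos, it can help to confirm that the files from the steps above are in place. The following is a minimal sketch with a hypothetical helper, `missing_data_files`. Only 'generic_model.pkl' and 'FLAME_albedo_from_BFM.npz' are file names given in this README; add the name of the DECA trained model file you downloaded yourself.

```python
from pathlib import Path


def missing_data_files(data_dir, required=("generic_model.pkl",)):
    """Return the names in `required` that are not regular files in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in required if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # 'FLAME_albedo_from_BFM.npz' is optional; drop it if you skipped step c.
    # Add the file name of the DECA trained model you downloaded in step b.
    wanted = ("generic_model.pkl", "FLAME_albedo_from_BFM.npz")
    missing = missing_data_files("./data", wanted)
    print("missing:", ", ".join(missing) if missing else "none")
```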
Run demos
a. reconstruction
```
python demos/demo_reconstruct.py -i TestSamples/examples --saveDepth True --saveObj True
```
This visualizes the predicted 2D landmarks, 3D landmarks (red means non-visible points), coarse geometry, detailed geometry, and depth.

You can also generate an obj file (which can be opened with Meshlab) that includes extracted texture from the input image.
Please run python demos/demo_reconstruct.py --help for more details.
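
If Meshlab is not at hand, a quick sanity check of a saved obj file is to count its vertex and face records. This is a minimal sketch based on the plain-text Wavefront OBJ format; `count_obj_elements` is a hypothetical helper, and the path is supplied on the command line because this README does not state where the demo writes its output.

```python
import sys


def count_obj_elements(path):
    """Count vertex ('v') and face ('f') records in a Wavefront OBJ file."""
    counts = {"v": 0, "f": 0}
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            parts = line.split(maxsplit=1)
            # Only lines whose first token is exactly 'v' or 'f' are counted,
            # so 'vt' (texture) and 'vn' (normal) records are ignored.
            if parts and parts[0] in counts:
                counts[parts[0]] += 1
    return counts["v"], counts["f"]


if __name__ == "__main__" and len(sys.argv) > 1:
    vertices, faces = count_obj_elements(sys.argv[1])
    print(f"{vertices} vertices, {faces} faces")
```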

b. expression transfer
```
python demos/demo_transfer.py
```
Given an image, you can reconstruct its 3D face, then animate it by transferring expressions from other images. If you open the detailed mesh obj file in Meshlab, you will see something like this:

(Thanks to Soubhik for allowing me to use his face ^_^)
Note that you need to set '--useTex True' to get the full texture.

c. for the teaser gif (reposing and animation)
```
python demos/demo_teaser.py 
```
More demos and training code coming soon.

Evaluation

DECA (ours) achieves 9% lower mean shape reconstruction error on the NoW Challenge dataset compared to the previous state-of-the-art method.
The left figure compares the cumulative error of our approach and other recent methods (RingNet and Deng et al. have nearly identical performance, so their curves overlap). Here we use point-to-surface distance as the error metric, following the NoW Challenge.
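
A cumulative error curve of the kind described above reports, for each distance threshold, the percentage of measured errors that fall at or below it. The following is a minimal sketch of that computation only; it is not the NoW Challenge evaluation code, the function name is hypothetical, and the numbers are made up for illustration.

```python
from bisect import bisect_right


def cumulative_error_curve(errors, thresholds):
    """For each threshold, return the percentage of errors <= that threshold."""
    if not errors:
        raise ValueError("errors must not be empty")
    ordered = sorted(errors)
    total = len(ordered)
    # bisect_right gives the number of sorted values that are <= the threshold.
    return [100.0 * bisect_right(ordered, t) / total for t in thresholds]


if __name__ == "__main__":
    # Toy distances for illustration only; these are not NoW results.
    toy_errors = [0.5, 1.0, 1.5, 3.0]
    print(cumulative_error_curve(toy_errors, [1.0, 2.0, 4.0]))
```

A method whose curve rises faster has more of its points within small distances of the reference surface.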

For more details of the evaluation, please check our arXiv paper.

Training

Prepare Training Data

a. Download image data
In DECA, we use VGGFace2, BUPT-Balancedface and VoxCeleb2

b. Prepare label
Use FAN to predict 68 2D landmarks
Use face_segmentation to get the skin mask

c. Modify dataloader
Dataloaders for different datasets are in decalib/datasets; set the correct paths for the prepared images and labels.

Download face recognition trained model

We use the model from VGGFace2-pytorch for calculating the identity loss. Download resnet50_ft and put it into ./data
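
To give a sense of what an identity loss built on a face recognition model can look like, here is a minimal sketch. It assumes a cosine-distance form between two embedding vectors, which is a common choice; this README does not define the loss, so check the paper and the training code for the exact formulation. The function name and the toy vectors are hypothetical.

```python
import math


def cosine_identity_loss(emb_a, emb_b, eps=1e-8):
    """Return 1 - cosine similarity between two embedding vectors.

    Assumption: an identity loss compares the recognition-network embedding
    of the input image with that of the rendered image.
    """
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(emb_a, emb_b))
    norm_a = math.sqrt(sum(a * a for a in emb_a))
    norm_b = math.sqrt(sum(b * b for b in emb_b))
    # eps guards against division by zero for all-zero embeddings.
    return 1.0 - dot / max(norm_a * norm_b, eps)


if __name__ == "__main__":
    # Toy 3-D vectors for illustration; real embeddings are much longer.
    print(cosine_identity_loss([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
```

The loss is near 0 when the two embeddings point in the same direction and grows as they diverge.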

Start training

Train from scratch:

python main_train.py --cfg configs/release_version/deca_pretrain.yml 
python main_train.py --cfg configs/release_version/deca_coarse.yml 
python main_train.py --cfg configs/release_version/deca_detail.yml

In the yml files, set the correct paths for 'output_dir' and 'pretrained_modelpath'.
You can also use the released model as the pretrained model and skip the pretraining step.
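
The three training stages can also be chained from a small driver script. This is a minimal sketch under two assumptions: that the stages are meant to run in the order listed above, and that "the pretrain step" refers to the deca_pretrain.yml stage. The helper names are hypothetical; the script and config paths are the ones given above.

```python
import subprocess
import sys

# Config paths exactly as listed in the training commands above, in order.
STAGE_CONFIGS = [
    "configs/release_version/deca_pretrain.yml",
    "configs/release_version/deca_coarse.yml",
    "configs/release_version/deca_detail.yml",
]


def training_commands(skip_pretrain=False):
    """Return one argv list per training stage, in execution order."""
    configs = STAGE_CONFIGS[1:] if skip_pretrain else STAGE_CONFIGS
    return [[sys.executable, "main_train.py", "--cfg", cfg] for cfg in configs]


def run_stages(commands, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the chain if a stage exits with an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_stages(training_commands(skip_pretrain=False), dry_run=True)
```

Set `skip_pretrain=True` when starting from the released model, and `dry_run=False` once the paths in the yml files are correct.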

Citation

If you find our work useful to your research, please consider citing:

@inproceedings{DECA:Siggraph2021,
  title={Learning an Animatable Detailed {3D} Face Model from In-The-Wild Images},
  author={Feng, Yao and Feng, Haiwen and Black, Michael J. and Bolkart, Timo},
  journal = {ACM Transactions on Graphics, (Proc. SIGGRAPH)}, 
  volume = {40}, 
  number = {8}, 
  year = {2021}, 
  url = {https://doi.org/10.1145/3450626.3459936} 
}

License

This code and model are available for non-commercial scientific research purposes as defined in the LICENSE file. By downloading and using the code and model you agree to the terms in the LICENSE.

Acknowledgements

For functions or scripts that are based on external sources, we acknowledge the origin individually in each file.
Here are some great resources we benefit from:

FLAME_PyTorch and TF_FLAME for the FLAME model
Pytorch3D, neural_renderer, SoftRas for rendering
kornia for image/rotation processing
face-alignment for cropping
FAN for landmark detection
face_segmentation for skin mask
VGGFace2-pytorch for identity loss

We would also like to thank other recent public 3D face reconstruction works that allow us to easily perform quantitative and qualitative comparisons :)
RingNet, Deep3DFaceReconstruction, Nonlinear_Face_3DMM, 3DDFA-v2, extreme_3d_faces, facescape

Owner

Yao Feng
