Python library for tracking human heads with FLAME (a 3D morphable head model)

Overview

Video Head Tracker

Teaser image

3D tracking library for human heads based on FLAME (a 3D morphable head model). The tracking algorithm is inspired by face2face. It determines FLAMEs shape and texture parameters as well as spherical harmonics lights and camera intrinsics for a video sequence. Afterwards, expressions and poses (rigid, neck, jaw, eyes) are optimized for each frame of the video. The only inputs are an RGB video together with facial and iris landmarks. The latter is estimated by our code automatically.

This repository complements the code release of the CVPR2022 paper Neural Head Avatars from Monocular RGB Videos. The code is maintained independently from the paper's code to ease reusing it in other projects.

Installation

  • Install Python 3.9 (it should work with other versions as well, but the setup.py and dependencies must be adjusted to do so).
  • Clone the repo and run pip install -e . from inside the cloned directory.
  • Download the flame head model and texture space from the from the official website and add them as generic_model.pkl and FLAME_texture.npz under ./assets/flame.
  • Finally, go to https://github.com/HavenFeng/photometric_optimization and copy the uv parametrization head_template_mesh.obj of FLAME found there to ./assets/flame, as well.

Usage

To run the tracker on a video run

python vht/optimize_tracking.py --config your_config.ini --video path_to_video --data_path path_to_data

The video path and data path can also be given inside the config file. In general, all parameters in the config file may be overwritten by providing them on the command line explicitly. If a video path is given, the video will be extracted and facial + iris landmarks are predicted for each frame. The frames and landmarks are stored at --data_path. Once extracted, you can reuse them by not passing the --video flag anymore. We provide config file for two identities tracked in the main paper. The video data for these subjects can be downloaded from the paper repository. These configs provide good defaults for other videos, as well.

If you would like to use your own videos, the following parameters are most important to set:

[dataset]
data_path = PATH_TO_DATASET --> discussed above

[training]
output_path = OUTPUT_PATH --> where the results will be stored
keyframes = [90, 415, 434, 193] --> list of frames used to optimize shape, texture, lights and camera
                                --> ideally, you provide one front, one left and one right view

The optimized parameters are stored in the output directory as tracked_flame_params.npz.

License

The code is available for non-commercial scientific research purposes under the CC BY-NC 3.0 license. Please note that the files flame.py and lbs.py are heavily inspired by https://github.com/HavenFeng/photometric_optimization and are property of the Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. The download, use, and distribution of this code is subject to this license. The files that can be found in the ./assets directory, are adapted from the FLAME head model for which the license can be found here.

Citation

If you find our work useful, please include the following citation:

@article{grassal2021neural,
  title={Neural Head Avatars from Monocular RGB Videos},
  author={Grassal, Philip-William and Prinzler, Malte and Leistner, Titus and Rother, Carsten
          and Nie{\ss}ner, Matthias and Thies, Justus},
  journal={arXiv preprint arXiv:2112.01554},
  year={2021}
}

Acknowledgements

This project has received funding from the DFG in the joint German-Japan-France grant agreement (RO 4804/3-1) and the ERC Starting Grant Scan2CAD (804724). We also thank the Center for Information Services and High Performance Computing (ZIH) at TU Dresden for generous allocations of computer time.

Surrogate- and Invariance-Boosted Contrastive Learning (SIB-CL)

Surrogate- and Invariance-Boosted Contrastive Learning (SIB-CL) This repository contains all source code used to generate the results in the article "

Charlotte Loh 3 Jul 23, 2022
An NVDA add-on to split screen reader and audio from other programs to different sound channels

An NVDA add-on to split screen reader and audio from other programs to different sound channels (add-on idea credit: Tony Malykh)

Joseph Lee 7 Dec 25, 2022
Source code for our EMNLP'21 paper 《Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning》

Child-Tuning Source code for EMNLP 2021 Long paper: Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning. 1. Environ

46 Dec 12, 2022
An open source python library for automated feature engineering

"One of the holy grails of machine learning is to automate more and more of the feature engineering process." ― Pedro Domingos, A Few Useful Things to

alteryx 6.4k Jan 03, 2023
MultiMix: Sparingly Supervised, Extreme Multitask Learning From Medical Images (ISBI 2021, MELBA 2021)

MultiMix This repository contains the implementation of MultiMix. Our publications for this project are listed below: "MultiMix: Sparingly Supervised,

Ayaan Haque 27 Dec 22, 2022
A Python module for the generation and training of an entry-level feedforward neural network.

ff-neural-network A Python module for the generation and training of an entry-level feedforward neural network. This repository serves as a repurposin

Riadh 2 Jan 31, 2022
SberSwap Video Swap base on deep learning

SberSwap Video Swap base on deep learning

Sber AI 431 Jan 03, 2023
keyframes-CNN-RNN(action recognition)

keyframes-CNN-RNN(action recognition) Environment: python=3.7 pytorch=1.2 Datasets: Following the format of UCF101 action recognition. Run steps: Mo

4 Feb 09, 2022
MixRNet(Using mixup as regularization and tuning hyper-parameters for ResNets)

MixRNet(Using mixup as regularization and tuning hyper-parameters for ResNets) Using mixup data augmentation as reguliraztion and tuning the hyper par

Bhanu 2 Jan 16, 2022
Official implementation of our CVPR2021 paper "OTA: Optimal Transport Assignment for Object Detection" in Pytorch.

OTA: Optimal Transport Assignment for Object Detection This project provides an implementation for our CVPR2021 paper "OTA: Optimal Transport Assignme

217 Jan 03, 2023
基于DouZero定制AI实战欢乐斗地主

DouZero_For_Happy_DouDiZhu: 将DouZero用于欢乐斗地主实战 本项目基于DouZero 环境配置请移步项目DouZero 模型默认为WP,更换模型请修改start.py中的模型路径 运行main.py即可 SL (baselines/sl/): 基于人类数据进行深度学习

1.5k Jan 08, 2023
CrossNorm and SelfNorm for Generalization under Distribution Shifts (ICCV 2021)

CrossNorm (CN) and SelfNorm (SN) (Accepted at ICCV 2021) This is the official PyTorch implementation of our CNSN paper, in which we propose CrossNorm

100 Dec 28, 2022
Code for Dual Contrastive Learning for Unsupervised Image-to-Image Translation, NTIRE, CVPRW 2021.

arXiv Dual Contrastive Learning Adversarial Generative Networks (DCLGAN) We provide our PyTorch implementation of DCLGAN, which is a simple yet powerf

119 Dec 04, 2022
DeLiGAN - This project is an implementation of the Generative Adversarial Network

This project is an implementation of the Generative Adversarial Network proposed in our CVPR 2017 paper - DeLiGAN : Generative Adversarial Net

Video Analytics Lab -- IISc 110 Sep 13, 2022
PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models

PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models Code accompanying CVPR'20 paper of the same title. Paper lin

Alex Damian 7k Dec 30, 2022
Repository for the Bias Benchmark for QA dataset.

BBQ Repository for the Bias Benchmark for QA dataset. Authors: Alicia Parrish, Angelica Chen, Nikita Nangia, Vishakh Padmakumar, Jason Phang, Jana Tho

ML² AT CILVR 18 Nov 18, 2022
Adaptive, interpretable wavelets across domains (NeurIPS 2021)

Adaptive wavelets Wavelets which adapt given data (and optionally a pre-trained model). This yields models which are faster, more compressible, and mo

Yu Group 50 Dec 16, 2022
Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

Mozhdeh Gheini 16 Jul 16, 2022
Code for binary and multiclass model change active learning, with spectral truncation implementation.

Model Change Active Learning Paper (To Appear) Python code for doing active learning in graph-based semi-supervised learning (GBSSL) paradigm. Impleme

Kevin Miller 1 Jul 24, 2022
Implements Stacked-RNN in numpy and torch with manual forward and backward functions

Recurrent Neural Networks Implements simple recurrent network and a stacked recurrent network in numpy and torch respectively. Both flavours implement

Vishal R 1 Nov 16, 2021