Official implementation for paper: A Latent Transformer for Disentangled Face Editing in Images and Videos.

Overview

A Latent Transformer for Disentangled Face Editing in Images and Videos

Official implementation for paper: A Latent Transformer for Disentangled Face Editing in Images and Videos.

[Video Editing Results]

Requirements

Dependencies

  • Python 3.6
  • PyTorch 1.8
  • Opencv
  • Tensorboard_logger

You can install a new environment for this repo by running

conda env create -f environment.yml
conda activate lattrans 

Prepare StyleGAN2 encoder and generator

  • We use the pretrained StyleGAN2 encoder and generator released from paper Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation. Download and save the official implementation to pixel2style2pixel/ directory. Download and save the pretrained model to pixel2style2pixel/pretrained_models/.

  • In order to save the latent codes to the designed path, we slightly modify pixel2style2pixel/scripts/inference.py.

    # modify run_on_batch()
    if opts.latent_mask is None:
        result_batch = net(inputs, randomize_noise=False, resize=opts.resize_outputs, return_latents=True)
        
    # modify run()
    tic = time.time()
    result_batch, latent_batch = run_on_batch(input_cuda, net, opts) 
    latent_save_path = os.path.join(test_opts.exp_dir, 'latent_code_%05d.npy'%global_i)
    np.save(latent_save_path, latent_batch.cpu().numpy())
    toc = time.time()
    

Training

  • Prepare the training data

    To train the latent transformers, you can download our prepared dataset to the directory data/ and the pretrained latent classifier to the directory models/.

    sh download.sh
    

    You can also prepare your own training data. To achieve that, you need to map your dataset to latent codes using the StyleGAN2 encoder. The corresponding label file is also required. You can continue to use our pretrained latent classifier. If you want to train your own latent classifier on new labels, you can use pretraining/latent_classifier.py.

  • Training

    You can modify the training options of the config file in the directory configs/.

    python train.py --config 001 
    

Testing

Single Attribute Manipulation

Make sure that the latent classifier is downloaded to the directory models/ and the StyleGAN2 encoder is prepared as required. After training your latent transformers, you can use test.py to run the latent transformer for the images in the test directory data/test/. We also provide several pretrained models here (run download.sh to download them). The output images will be saved in the folder outputs/. You can change the desired attribute with --attr.

python test.py --config 001 --attr Eyeglasses --out_path ./outputs/

If you want to test the model on your custom images, you need to first encoder the images to the latent space of StyleGAN using the pretrained encoder.

cd pixel2style2pixel/
python scripts/inference.py \
--checkpoint_path=pretrained_models/psp_ffhq_encode.pt \
--data_path=../data/test/ \
--exp_dir=../data/test/ \
--test_batch_size=1

Sequential Attribute Manipulation

You can reproduce the sequential editing results in the paper using notebooks/figure_sequential_edit.ipynb and the results in the supplementary material using notebooks/figure_supplementary.ipynb.

User Interface

We also provide an interactive visualization notebooks/visu_manipulation.ipynb, where the user can choose the desired attributes for manipulation and define the magnitude of edit for each attribute.

Video Manipulation

Video Result

We provide a script to achieve attribute manipulation for the videos in the test directory data/video/. Please ensure that the StyleGAN2 encoder is prepared as required. You can upload your own video and modify the options in run_video_manip.sh. You can view our video editing results presented in the paper.

sh run_video_manip.sh

Citation

@article{yao2021latent,
  title={A Latent Transformer for Disentangled Face Editing in Images and Videos},
  author={Yao, Xu and Newson, Alasdair and Gousseau, Yann and Hellier, Pierre},
  journal={2021 International Conference on Computer Vision},
  year={2021}
}

License

Copyright © 2021, InterDigital R&D France. All rights reserved.

This source code is made available under the license found in the LICENSE.txt in the root directory of this source tree.

Active and Sample-Efficient Model Evaluation

Active Testing: Sample-Efficient Model Evaluation Hi, good to see you here! đź‘‹ This is code for "Active Testing: Sample-Efficient Model Evaluation". P

Jannik Kossen 19 Oct 30, 2022
The official implementation of ELSA: Enhanced Local Self-Attention for Vision Transformer

ELSA: Enhanced Local Self-Attention for Vision Transformer By Jingkai Zhou, Pich

DamoCV 87 Dec 19, 2022
Code of the paper "Deep Human Dynamics Prior" in ACM MM 2021.

Code of the paper "Deep Human Dynamics Prior" in ACM MM 2021. Figure 1: In the process of motion capture (mocap), some joints or even the whole human

Shinny cui 3 Oct 31, 2022
SHIFT15M: multiobjective large-scale fashion dataset with distributional shifts

[arXiv] The main motivation of the SHIFT15M project is to provide a dataset that contains natural dataset shifts collected from a web service IQON, wh

ZOZO, Inc. 138 Nov 24, 2022
A pytorch implementation of the ACL2019 paper "Simple and Effective Text Matching with Richer Alignment Features".

RE2 This is a pytorch implementation of the ACL 2019 paper "Simple and Effective Text Matching with Richer Alignment Features". The original Tensorflo

287 Dec 21, 2022
Source code for the BMVC-2021 paper "SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation".

SimReg: A Simple Regression Based Framework for Self-supervised Knowledge Distillation Source code for the paper "SimReg: Regression as a Simple Yet E

9 Oct 15, 2022
End-To-End Optimization of LiDAR Beam Configuration

End-To-End Optimization of LiDAR Beam Configuration arXiv | IEEE Xplore This repository is the official implementation of the paper: End-To-End Optimi

Niclas 30 Nov 28, 2022
Sound Event Detection with FilterAugment

Sound Event Detection with FilterAugment Official implementation of Heavily Augmented Sound Event Detection utilizing Weak Predictions (DCASE2021 Chal

43 Aug 28, 2022
The Implicit Bias of Gradient Descent on Generalized Gated Linear Networks

The Implicit Bias of Gradient Descent on Generalized Gated Linear Networks This folder contains the code to reproduce the data in "The Implicit Bias o

Samuel Lippl 0 Feb 05, 2022
Title: Graduate-Admissions-Predictor

The purpose of this project is create a predictive model capable of identifying the probability of a person securing an admit based on their personal profile parameters. Simplified visualisations hav

Akarsh Singh 1 Jan 26, 2022
A python script to dump all the challenges locally of a CTFd-based Capture the Flag.

A python script to dump all the challenges locally of a CTFd-based Capture the Flag. Features Connects and logins to a remote CTFd instance. Dumps all

Podalirius 77 Dec 07, 2022
Neurons Dataset API - The official dataloader and visualization tools for Neurons Datasets.

Neurons Dataset API - The official dataloader and visualization tools for Neurons Datasets. Introduction We propose our dataloader API for loading and

1 Nov 19, 2021
Official PyTorch implementation of RobustNet (CVPR 2021 Oral)

RobustNet (CVPR 2021 Oral): Official Project Webpage Codes and pretrained models will be released soon. This repository provides the official PyTorch

Sungha Choi 173 Dec 21, 2022
NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.

NVIDIA Merlin NVIDIA Merlin is an open source library designed to accelerate recommender systems on NVIDIA’s GPUs. It enables data scientists, machine

419 Jan 03, 2023
Yolo algorithm for detection + centroid tracker to track vehicles

Vehicle Tracking using Centroid tracker Algorithm used : Yolo algorithm for detection + centroid tracker to track vehicles Backend : opencv and python

6 Dec 21, 2022
9th place solution in "Santa 2020 - The Candy Cane Contest"

Santa 2020 - The Candy Cane Contest My solution in this Kaggle competition "Santa 2020 - The Candy Cane Contest", 9th place. Basic Strategy In this co

toshi_k 22 Nov 26, 2021
"SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image", Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Humphrey Shi, Zhangyang Wang

SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image [Paper] [Website] Pipeline Code Environment pip install -r requirements

VITA 250 Jan 05, 2023
Blender Add-On for slicing meshes with planes

MeshSlicer Blender Add-On for slicing meshes with multiple overlapping planes at once. This is a simple Blender addon to slice a silmple mesh with mul

52 Dec 12, 2022
Language Models for the legal domain in Spanish done @ BSC-TEMU within the "Plan de las TecnologĂ­as del Lenguaje" (Plan-TL).

Spanish legal domain Language Model ⚖️ This repository contains the page for two main resources for the Spanish legal domain: A RoBERTa model: https:/

Plan de Tecnologías del Lenguaje - Gobierno de España 12 Nov 14, 2022
Implementation of EMNLP 2017 Paper "Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog" using PyTorch and ParlAI

Language Emergence in Multi Agent Dialog Code for the Paper Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog Satwik Kottur, José M.

Karan Desai 105 Nov 25, 2022