A denoising autoencoder + adversarial losses and attention mechanisms for face swapping.

Overview

faceswap-GAN

Adding Adversarial loss and perceptual loss (VGGface) to deepfakes'(reddit user) auto-encoder architecture.

Updates

Date    Update
2018-08-27     Colab support: A colab notebook for faceswap-GAN v2.2 is provided.
2018-07-25     Data preparation: Add a new notebook for video pre-processing in which MTCNN is used for face detection as well as face alignment.
2018-06-29     Model architecture: faceswap-GAN v2.2 now supports different output resolutions: 64x64, 128x128, and 256x256. Default RESOLUTION = 64 can be changed in the config cell of v2.2 notebook.
2018-06-25     New version: faceswap-GAN v2.2 has been released. The main improvements of v2.2 model are its capability of generating realistic and consistent eye movements (results are shown below, or Ctrl+F for eyes), as well as higher video quality with face alignment.
2018-06-06     Model architecture: Add a self-attention mechanism proposed in SAGAN into V2 GAN model. (Note: There is still no official code release for SAGAN, the implementation in this repo. could be wrong. We'll keep an eye on it.)

Google Colab support

Here is a playground notebook for faceswap-GAN v2.2 on Google Colab. Users can train their own model in the browser.

[Update 2019/10/04] There seems to be import errors in the latest Colab environment due to inconsistent version of packages. Please make sure that the Keras and TensorFlow follow the version number shown in the requirement section below.

Descriptions

faceswap-GAN v2.2

  • FaceSwap_GAN_v2.2_train_test.ipynb

    • Notebook for model training of faceswap-GAN model version 2.2.
    • This notebook also provides code for still image transformation at the bottom.
    • Require additional training images generated through prep_binary_masks.ipynb.
  • FaceSwap_GAN_v2.2_video_conversion.ipynb

    • Notebook for video conversion of faceswap-GAN model version 2.2.
    • Face alignment using 5-points landmarks is introduced to video conversion.
  • prep_binary_masks.ipynb

    • Notebook for training data preprocessing. Output binary masks are save in ./binary_masks/faceA_eyes and ./binary_masks/faceB_eyes folders.
    • Require face_alignment package. (An alternative method for generating binary masks (not requiring face_alignment and dlib packages) can be found in MTCNN_video_face_detection_alignment.ipynb.)
  • MTCNN_video_face_detection_alignment.ipynb

    • This notebook performs face detection/alignment on the input video.
    • Detected faces are saved in ./faces/raw_faces and ./faces/aligned_faces for non-aligned/aligned results respectively.
    • Crude eyes binary masks are also generated and saved in ./faces/binary_masks_eyes. These binary masks can serve as a suboptimal alternative to masks generated through prep_binary_masks.ipynb.

Usage

  1. Run MTCNN_video_face_detection_alignment.ipynb to extract faces from videos. Manually move/rename the aligned face images into ./faceA/ or ./faceB/ folders.
  2. Run prep_binary_masks.ipynb to generate binary masks of training images.
    • You can skip this pre-processing step by (1) setting use_bm_eyes=False in the config cell of the train_test notebook, or (2) use low-quality binary masks generated in step 1.
  3. Run FaceSwap_GAN_v2.2_train_test.ipynb to train models.
  4. Run FaceSwap_GAN_v2.2_video_conversion.ipynb to create videos using the trained models in step 3.

Miscellaneous

Training data format

  • Face images are supposed to be in ./faceA/ or ./faceB/ folder for each taeget respectively.
  • Images will be resized to 256x256 during training.

Generative adversarial networks for face swapping

1. Architecture

enc_arch3d

dec_arch3d

dis_arch3d

2. Results

  • Improved output quality: Adversarial loss improves reconstruction quality of generated images. trump_cage

  • Additional results: This image shows 160 random results generated by v2 GAN with self-attention mechanism (image format: source -> mask -> transformed).

  • Evaluations: Evaluations of the output quality on Trump/Cage dataset can be found here.

The Trump/Cage images are obtained from the reddit user deepfakes' project on pastebin.com.

3. Features

  • VGGFace perceptual loss: Perceptual loss improves direction of eyeballs to be more realistic and consistent with input face. It also smoothes out artifacts in the segmentation mask, resulting higher output quality.

  • Attention mask: Model predicts an attention mask that helps on handling occlusion, eliminating artifacts, and producing natrual skin tone.

  • Configurable input/output resolution (v2.2): The model supports 64x64, 128x128, and 256x256 outupt resolutions.

  • Face tracking/alignment using MTCNN and Kalman filter in video conversion:

    • MTCNN is introduced for more stable detections and reliable face alignment (FA).
    • Kalman filter smoothen the bounding box positions over frames and eliminate jitter on the swapped face. comp_FA
  • Eyes-aware training: Introduce high reconstruction loss and edge loss in eyes area, which guides the model to generate realistic eyes.

Frequently asked questions and troubleshooting

1. How does it work?

  • The following illustration shows a very high-level and abstract (but not exactly the same) flowchart of the denoising autoencoder algorithm. The objective functions look like this. flow_chart

2. Previews look good, but it does not transform to the output videos?

  • Model performs its full potential when the input images are preprocessed with face alignment methods.
    • readme_note001

Requirements

Acknowledgments

Code borrows from tjwei, eriklindernoren, fchollet, keras-contrib and reddit user deepfakes' project. The generative network is adopted from CycleGAN. Weights and scripts of MTCNN are from FaceNet. Illustrations are from irasutoya.

Let's create a tool to convert Thailand budget from PDF to CSV.

thailand-budget-pdf2csv Let's create a tool to convert Thailand Government Budgeting from PDF to CSV! รวมพลัง Dev แปลงงบ จาก PDF สู่ Machine-readable

Kao.Geek 88 Dec 19, 2022
Code for Efficient Visual Pretraining with Contrastive Detection

Code for DetCon This repository contains code for the ICCV 2021 paper "Efficient Visual Pretraining with Contrastive Detection" by Olivier J. Hénaff,

DeepMind 56 Nov 13, 2022
Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On

Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On [Project website] [Dataset] [Video] Abstract We propose a new g

71 Dec 24, 2022
PyTorch Implementation of Small Lesion Segmentation in Brain MRIs with Subpixel Embedding (ORAL, MICCAIW 2021)

Small Lesion Segmentation in Brain MRIs with Subpixel Embedding PyTorch implementation of Small Lesion Segmentation in Brain MRIs with Subpixel Embedd

22 Oct 21, 2022
Gans-in-action - Companion repository to GANs in Action: Deep learning with Generative Adversarial Networks

GANs in Action by Jakub Langr and Vladimir Bok List of available code: Chapter 2: Colab, Notebook Chapter 3: Notebook Chapter 4: Notebook Chapter 6: C

GANs in Action 914 Dec 21, 2022
JugLab 33 Dec 30, 2022
Generating Videos with Scene Dynamics

Generating Videos with Scene Dynamics This repository contains an implementation of Generating Videos with Scene Dynamics by Carl Vondrick, Hamed Pirs

Carl Vondrick 706 Jan 04, 2023
Tutorial materials for Part of NSU Intro to Deep Learning with PyTorch.

Intro to Deep Learning Materials are part of North South University (NSU) Intro to Deep Learning with PyTorch workshop series. (Slides) Related materi

Hasib Zunair 9 Jun 08, 2022
Ensemble Learning Priors Driven Deep Unfolding for Scalable Snapshot Compressive Imaging [PyTorch]

Ensemble Learning Priors Driven Deep Unfolding for Scalable Snapshot Compressive Imaging [PyTorch] Abstract Snapshot compressive imaging (SCI) can rec

integirty 6 Nov 01, 2022
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features

CleanRL (Clean Implementation of RL Algorithms) CleanRL is a Deep Reinforcement Learning library that provides high-quality single-file implementation

Costa Huang 1.8k Jan 01, 2023
Fast and simple implementation of RL algorithms, designed to run fully on GPU.

RSL RL Fast and simple implementation of RL algorithms, designed to run fully on GPU. This code is an evolution of rl-pytorch provided with NVIDIA's I

Robotic Systems Lab - Legged Robotics at ETH Zürich 68 Dec 29, 2022
links and status of cool gradio demos

awesome-demos This is a list of some wonderful demos & applications built with Gradio. Here's how to contribute yours! 🖊️ Natural language processing

Gradio 96 Dec 30, 2022
Real time Human Detection Counting

In this python project, we are going to build the Human Detection and Counting System through Webcam or you can give your own video or images. This is a deep learning project on computer vision, whic

Mir Nawaz Ahmad 2 Jun 17, 2022
Simple sinc interpolation in PyTorch.

Kazane: simple sinc interpolation for 1D signal in PyTorch Kazane utilize FFT based convolution to provide fast sinc interpolation for 1D signal when

Chin-Yun Yu 10 May 03, 2022
some classic model used to segment the medical images like CT、X-ray and so on

github_project This is a project for medical image segmentation. This project includes common medical image segmentation models such as U-net, FCN, De

2 Mar 30, 2022
This is a collection of our NAS and Vision Transformer work.

This is a collection of our NAS and Vision Transformer work.

Microsoft 828 Dec 28, 2022
Team Enigma at ArgMining 2021 Shared Task: Leveraging Pretrained Language Models for Key Point Matching

Team Enigma at ArgMining 2021 Shared Task: Leveraging Pretrained Language Models for Key Point Matching This is our attempt of the shared task on Quan

Manav Nitin Kapadnis 12 Jul 08, 2022
Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe

Traductor de señas Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe Requerimientos 🔧 Python 3.8 o inferior para evitar

Jahaziel Hernandez Hoyos 3 Nov 12, 2022
Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments

Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments Paper: arXiv (ICRA 2021) Video : https://youtu.be/CC

Sachini Herath 68 Jan 03, 2023
Implementation for "Manga Filling Style Conversion with Screentone Variational Autoencoder" (SIGGRAPH ASIA 2020 issue)

Manga Filling with ScreenVAE SIGGRAPH ASIA 2020 | Project Website | BibTex This repository is for ScreenVAE introduced in the following paper "Manga F

30 Dec 24, 2022