Official PyTorch implementation of the paper: DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample

Overview

DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample (ICCV 2021 Oral)


DeepSIM: Given a single real training image (b) and a corresponding primitive representation (a), our model learns to map from the primitive (a) to the target image (b). At inference, the user manipulates the original primitive (a). The manipulated primitive is then passed through the network, which outputs a corresponding manipulated image (e) in the real image domain.


DeepSIM was trained on a single training pair, shown to the left of each sample. First row, "face" output: (left) flipping eyebrows, (right) lifting nose. Second row, "dog" output: changing the shape of the dog's hat, removing the ribbon, and making the face longer. Second row, "car" output: (top) adding a wheel, (bottom) conversion to a sports car.


DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample
Yael Vinker*, Eliahu Horwitz*, Nir Zabari, Yedid Hoshen
*Equal contribution
https://arxiv.org/pdf/2007.01289

Abstract: We present DeepSIM, a generative model for conditional image manipulation based on a single image. We find that extensive augmentation is key for enabling single image training, and incorporate the use of thin-plate-spline (TPS) as an effective augmentation. Our network learns to map between a primitive representation of the image to the image itself. The choice of a primitive representation has an impact on the ease and expressiveness of the manipulations and can be automatic (e.g. edges), manual (e.g. segmentation) or hybrid such as edges on top of segmentations. At manipulation time, our generator allows for making complex image changes by modifying the primitive input representation and mapping it through the network. Our method is shown to achieve remarkable performance on image manipulation tasks.

Getting Started

Setup

  1. Clone the repo:
git clone https://github.com/eliahuhorwitz/DeepSIM.git
cd DeepSIM
  2. Create a new environment and install the libraries:
python3.7 -m venv deepsim_venv
source deepsim_venv/bin/activate
pip install -r requirements.txt


Training

The input primitive used for training should be specified using --primitive and can be one of the following:

  1. "seg" - train using segmentation only
  2. "edges" - train using edges only
  3. "seg_edges" - train using a combination of edges and segmentation
  4. "manual" - could be anything (for example, a painting)

For the chosen option, provide a suitable input file in a folder named "train_" followed by the primitive name, inside the data root (e.g. ./datasets/car/train_seg). For automatic edges, you can leave the "train_edges" folder empty, and an edge map will be generated automatically. Note that for the segmentation primitive option, you must verify that the input at test time exactly matches the input at train time in terms of colors.
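The color requirement above can be checked before running a test. The following is a minimal sketch, not part of the DeepSIM code base: the helper names `color_set` and `unseen_colors` are hypothetical, and the pixel lists are toy data standing in for segmentation maps that you would load with an image library of your choice.

```python
def color_set(pixels):
    """Return the set of distinct colors in an iterable of RGB tuples."""
    return {tuple(p) for p in pixels}


def unseen_colors(train_pixels, test_pixels):
    """Colors present in the test primitive but absent from the training one."""
    return color_set(test_pixels) - color_set(train_pixels)


# Toy data standing in for flattened segmentation maps (hypothetical values).
train_seg = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (255, 0, 0)]
test_seg = [(0, 0, 0), (255, 0, 0), (0, 254, 0)]  # (0, 254, 0) is off by one

# Any color reported here was never seen during training.
print(sorted(unseen_colors(train_seg, test_seg)))  # prints [(0, 254, 0)]
```

An empty result means every color in the edited segmentation also appears in the training segmentation. Off-by-one colors like the one above are typically introduced by anti-aliased brushes or lossy image formats.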

To train on CPU please specify --gpu_ids '-1'.

  • Train DeepSIM on the "face" video using both edges and segmentations (bash ./scripts/train_face_vid_seg_edges.sh):
#!./scripts/train_face_vid_seg_edges.sh
python3.7 ./train.py --dataroot ./datasets/face_video --primitive seg_edges --no_instance --tps_aug 1 --name DeepSIMFaceVideo
  • Train DeepSIM on the "car" image using segmentation only (bash ./scripts/train_car_seg.sh):
#!./scripts/train_car_seg.sh
python3.7 ./train.py --dataroot ./datasets/car --primitive seg --no_instance --tps_aug 1 --name DeepSIMCar
  • Train DeepSIM on the "face" image using edges only (bash ./scripts/train_face_edges.sh):
#!./scripts/train_face_edges.sh
python3.7 ./train.py --dataroot ./datasets/face --primitive edges --no_instance --tps_aug 1 --name DeepSIMFace
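The three commands above share the same shape, so they can be assembled programmatically. The sketch below is illustrative only: `build_train_command` is a hypothetical helper, not something shipped with the repository, and it uses only the flags that appear in this README.

```python
import shlex

# Primitive names listed in the Training section.
PRIMITIVES = ("seg", "edges", "seg_edges", "manual")


def build_train_command(dataroot, primitive, name, tps_aug=1, gpu_ids=None):
    """Assemble a train.py argument list from the flags shown in this README."""
    if primitive not in PRIMITIVES:
        raise ValueError(f"unknown primitive {primitive!r}, expected one of {PRIMITIVES}")
    cmd = [
        "python3.7", "./train.py",
        "--dataroot", dataroot,
        "--primitive", primitive,
        "--no_instance",
        "--tps_aug", str(tps_aug),
        "--name", name,
    ]
    if gpu_ids is not None:
        # For example, gpu_ids="-1" selects CPU training.
        cmd += ["--gpu_ids", str(gpu_ids)]
    return cmd


cmd = build_train_command("./datasets/car", "seg", "DeepSIMCar")
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to `subprocess.run` as-is. Validating the primitive name up front catches typos before a training run starts.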

Testing

  • Test DeepSIM on the "face" video using both edges and segmentations (bash ./scripts/test_face_vid_seg_edges.sh):
#!./scripts/test_face_vid_seg_edges.sh
python3.7 ./test.py --dataroot ./datasets/face_video --primitive seg_edges --phase "test" --no_instance --name DeepSIMFaceVideo --vid_mode 1 --test_canny_sigma 0.5
  • Test DeepSIM on the "car" image using segmentation only (bash ./scripts/test_car_seg.sh):
#!./scripts/test_car_seg.sh
python3.7 ./test.py --dataroot ./datasets/car --primitive seg --phase "test" --no_instance --name DeepSIMCar
  • Test DeepSIM on the "face" image using edges only (bash ./scripts/test_face_edges.sh):
#!./scripts/test_face_edges.sh
python3.7 ./test.py --dataroot ./datasets/face --primitive edges --phase "test" --no_instance --name DeepSIMFace

Additional Augmentations

As shown in the supplementary material, adding augmentations on top of TPS may lead to better results.

  • Train DeepSIM on the "face" video using both edges and segmentations with shear, rotation, "cutmix", and Canny sigma augmentations (bash ./scripts/train_face_vid_seg_edges_all_augmentations.sh):
#!./scripts/train_face_vid_seg_edges_all_augmentations.sh
python3.7 ./train.py --dataroot ./datasets/face_video --primitive seg_edges --no_instance --tps_aug 1 --name DeepSIMFaceVideoAugmentations --cutmix_aug 1 --affine_aug "shearx_sheary_rotation" --canny_aug 1
  • When using edges or seg_edges, it may be beneficial to have white edges instead of black ones. To do so, add the --canny_color 1 option
  • Check ./options/base_options.py for more augmentation related settings
  • When using edges or seg_edges and adding edges manually at test time, it may be beneficial to apply "skeletonize" (e.g. skimage skeletonize) to the edges so that they resemble the Canny edges
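As an illustration of the last point, here is a minimal sketch of thinning a hand-drawn stroke with scikit-image. The toy mask below is an assumption standing in for a real edited edge map, with True marking edge pixels. How you convert your own edge image to and from such a mask depends on the edge color you trained with.

```python
import numpy as np
from skimage.morphology import skeletonize

# Toy stand-in for a hand-drawn edge mask: True marks an edge pixel.
manual_edges = np.zeros((9, 20), dtype=bool)
manual_edges[3:6, 2:18] = True  # a stroke three pixels thick

# Thin the stroke towards a one-pixel-wide line, closer to what Canny produces.
thin_edges = skeletonize(manual_edges).astype(bool)

# Report how many edge pixels remain after thinning.
print(int(manual_edges.sum()), "->", int(thin_edges.sum()))
```

Skeletonization only removes pixels, so the thinned stroke stays inside the original one.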

More Results

Top row: primitive images. Left: original pair used for training. Center: switching the positions of the two rightmost cars. Right: removing the leftmost car and inpainting the background.


The leftmost column shows the source image; each subsequent column demonstrates the result of our model when trained on the specified primitive. We manipulated the image primitives, adding a right eye, changing the point of view, and shortening the beak. Our results are presented next to each manipulated primitive. The combined primitive performed best on both high-level changes (e.g. the eye) and low-level changes (e.g. the background).


On the left is the training image pair, in the middle are the manipulated primitives, and on the right are the manipulated outputs. Left to right: dress length, strapless, wrap around the neck.

Single Image Animation

Animation to Video

Video to Animation

Citation

If you find this useful for your research, please cite the following.

@InProceedings{Vinker_2021_ICCV,
    author    = {Vinker, Yael and Horwitz, Eliahu and Zabari, Nir and Hoshen, Yedid},
    title     = {Image Shape Manipulation From a Single Augmented Training Sample},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {13769-13778}
}

Acknowledgments
