Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian (CVPR 2022)

Last update: Nov 22, 2022

Jihyun Lee*, Minhyuk Sung*, Hyunjin Kim, Tae-Kyun (T-K) Kim (*: equal contributions)

[Project Page] [Paper] [Video]

We present a framework that can deform an object in a 2D image as it exists in 3D space. While our method leverages 2D-to-3D reconstruction, we argue that reconstruction is not sufficient for realistic deformations due to the vulnerability to topological errors. Thus, we propose to take a supervised learning-based approach to predict the shape Laplacian of the underlying volume of a 3D reconstruction represented as a point cloud. Given the deformation energy calculated using the predicted shape Laplacian and user-defined deformation handles (e.g., keypoints), we obtain bounded biharmonic weights to model plausible handle-based image deformation.

Environment Setup

Clone this repository and install the dependencies specified in requirements.txt.

 git clone https://github.com/jyunlee/Pop-Out-Motion.git
 cd Pop-Out-Motion
 pip install -r requirements.txt

Data Pre-Processing

Training Data

Build the executables from the C++ files in the data_preprocessing directory. After running the commands below, you should have the normalize_bin and calc_l_minv_bin executables.

 cd data_preprocessing
 mkdir build
 cd build
 cmake ..
 make
 cd ..

Clone and build the Manifold repository to obtain the manifold executable.
Clone and build the fTetWild repository to obtain the FloatTetwild_bin executable.
Run preprocess_train_data.py to prepare your training data. This script performs (1) shape normalization into a unit bounding sphere, (2) conversion to a volume mesh, and (3) calculation of the cotangent Laplacian and the inverse mass matrix.

 python preprocess_train_data.py
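As an illustration of step (1), the normalization into a unit bounding sphere can be sketched in a few lines of NumPy. This is a minimal sketch and not the code in normalize_bin: the function name normalize_to_unit_sphere is hypothetical, and centering on the bounding-box midpoint is an assumption (the repository may center on the centroid instead).

```python
import numpy as np


def normalize_to_unit_sphere(points):
    """Center a point set and scale it so that it fits inside a unit sphere.

    Hypothetical helper for illustration; not part of the repository.
    """
    pts = np.asarray(points, dtype=np.float64)
    # Assumption: center on the bounding-box midpoint. The actual
    # normalize_bin executable may use a different center.
    center = 0.5 * (pts.min(axis=0) + pts.max(axis=0))
    centered = pts - center
    # Radius of the smallest origin-centered sphere containing all points.
    radius = np.linalg.norm(centered, axis=1).max()
    if radius == 0.0:
        raise ValueError("all points coincide; cannot normalize")
    return centered / radius


if __name__ == "__main__":
    cloud = np.random.default_rng(0).uniform(-5.0, 5.0, size=(1000, 3))
    normalized = normalize_to_unit_sphere(cloud)
    # The farthest point now lies on the unit sphere.
    print(np.linalg.norm(normalized, axis=1).max())
```

Scaling by the largest distance from the center (rather than by the bounding-box extent) keeps every point within radius 1 regardless of the shape's aspect ratio.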

Test Data

Build the executables from the C++ files in the data_preprocessing directory. After running the commands below, you should have the normalize_bin executable.

 cd data_preprocessing
 mkdir build
 cd build
 cmake ..
 make
 cd ..

Run preprocess_test_data.py to prepare your test data. This script performs (1) shape normalization into a unit bounding sphere and (2) pre-computation of KNN-Based Point Pair Sampling (KPS).

 python preprocess_test_data.py
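To give a rough idea of what pre-computing KNN-based point pairs involves, the sketch below pairs every point with its k nearest neighbors. It assumes KPS amounts to collecting (i, j) index pairs where j is among the nearest neighbors of i; the function name knn_point_pairs, the brute-force distance computation, and the output layout are illustrative choices, not the repository's implementation.

```python
import numpy as np


def knn_point_pairs(points, k):
    """Return an (N * k, 2) array of index pairs (i, j), where j is one of
    the k nearest neighbors of point i (excluding i itself).

    Hypothetical helper for illustration; not part of the repository.
    """
    pts = np.asarray(points, dtype=np.float64)
    n = pts.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    # Brute-force squared distances; O(N^2) memory, fine for small clouds.
    diff = pts[:, None, :] - pts[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    # Exclude self-pairs by making each point infinitely far from itself.
    np.fill_diagonal(dist2, np.inf)
    neighbors = np.argsort(dist2, axis=1)[:, :k]
    sources = np.repeat(np.arange(n), k)
    return np.stack([sources, neighbors.reshape(-1)], axis=1)


if __name__ == "__main__":
    cloud = np.random.default_rng(0).normal(size=(100, 3))
    pairs = knn_point_pairs(cloud, k=8)
    print(pairs.shape)
```

For large point clouds, a spatial index such as a k-d tree would replace the dense distance matrix used here.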

Network Training

Run network/train.py to train your own Laplacian Learning Network.

 cd network
 python train.py

A model pre-trained on the DFAUST dataset is also available here.

Network Inference

Deformation Energy Inference

1. Given an input image, generate its 3D reconstruction by running PIFu. It is also possible to directly use point cloud data obtained from other sources.
2. Pre-process the data obtained from Step 1 as described in the Test Data section above.
3. Run network/a_inference.py to predict the deformation energy matrix.

 cd network
 python a_inference.py
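For orientation, the role of the deformation energy matrix can be written out using the standard bounded biharmonic weights formulation. This is a sketch under that assumption; the README does not state the exact form of the matrix that a_inference.py outputs. Here $L$ is the cotangent Laplacian, $M$ is the mass matrix, $w_j$ is the weight vector of handle $j$ over all points, $h_k$ is the point at handle $k$, and $m$ is the number of handles.

```latex
\begin{aligned}
\min_{w_1,\dots,w_m}\quad & \sum_{j=1}^{m} \tfrac{1}{2}\, w_j^{\top} A\, w_j,
  \qquad A = L^{\top} M^{-1} L \\
\text{subject to}\quad & w_j(h_k) = \delta_{jk}, \\
 & \sum_{j=1}^{m} w_j(v) = 1 \quad \text{for every point } v, \\
 & 0 \le w_j(v) \le 1 \quad \text{for every point } v \text{ and handle } j .
\end{aligned}
```

Under this reading, the network replaces the $A$ that would otherwise be computed from a volume mesh, which is why no tetrahedralization is needed at test time.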

Handle-Based Deformation Weight Calculation

Build the executable from the C++ file in the bbw_calculation directory. After running the commands below, you should have the calc_bbw_bin executable.

 cd bbw_calculation
 mkdir build
 cd build
 cmake ..
 make
 cd ..

(Optional) Run sample_pt_handles.py to obtain deformation control handles sampled by farthest point sampling.
Run calc_bbw_bin to calculate handle-based deformation weights using the predicted deformation energy.

 ./build/calc_bbw_bin <shape_path> <handle_path> <deformation_energy_path> <output_weight_path>
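The optional handle-sampling step relies on farthest point sampling, which can be sketched as follows. This is a minimal NumPy version for illustration only; the function name farthest_point_sampling and its start_index parameter are hypothetical, and sample_pt_handles.py may differ in how it picks the first point or stores the result.

```python
import numpy as np


def farthest_point_sampling(points, num_handles, start_index=0):
    """Greedily pick indices of points that are mutually far apart.

    Hypothetical helper for illustration; not part of the repository.
    """
    pts = np.asarray(points, dtype=np.float64)
    n = pts.shape[0]
    if not 0 < num_handles <= n:
        raise ValueError("num_handles must be between 1 and the point count")
    selected = np.empty(num_handles, dtype=np.int64)
    selected[0] = start_index
    # Squared distance from every point to its closest selected point so far.
    min_dist2 = np.sum((pts - pts[start_index]) ** 2, axis=1)
    for s in range(1, num_handles):
        # The next handle is the point farthest from all selected points.
        nxt = int(np.argmax(min_dist2))
        selected[s] = nxt
        dist2 = np.sum((pts - pts[nxt]) ** 2, axis=1)
        min_dist2 = np.minimum(min_dist2, dist2)
    return selected


if __name__ == "__main__":
    cloud = np.random.default_rng(0).normal(size=(2048, 3))
    handle_indices = farthest_point_sampling(cloud, num_handles=16)
    print(cloud[handle_indices].shape)
```

Because each new handle maximizes the distance to the handles already chosen, the samples spread over the whole shape, which tends to give every region a nearby control point.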

Citation

If you find this work useful, please consider citing our paper.

@InProceedings{lee2022popoutmotion,
    author = {Lee, Jihyun and Sung, Minhyuk and Kim, Hyunjin and Kim, Tae-Kyun},
    title = {Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2022}
}

Acknowledgements

Parts of our data-preprocessing code are adopted from DeepMetaHandles.
Parts of our network code are adopted from Point-Transformer.

