"3D Human Texture Estimation from a Single Image with Transformers", ICCV 2021

Overview

Texformer: 3D Human Texture Estimation from a Single Image with Transformers

This is the official implementation of "3D Human Texture Estimation from a Single Image with Transformers", ICCV 2021 (Oral)

Highlights

  • Texformer: a novel structure combining Transformer and CNN
  • Low-Rank Attention layer (LoRA) with linear complexity
  • Combination of RGB UV map and texture flow
  • Part-style loss
  • Face-structure loss

BibTeX

@inproceedings{xu2021texformer,
  title={{3D} Human Texture Estimation from a Single Image with Transformers},
  author={Xu, Xiangyu and Loy, Chen Change},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  year={2021}
}

Abstract

We propose a Transformer-based framework for 3D human texture estimation from a single image. The proposed Transformer is able to effectively exploit the global information of the input image, overcoming the limitations of existing methods that are solely based on convolutional neural networks. In addition, we also propose a mask-fusion strategy to combine the advantages of the RGB-based and texture-flow-based models. We further introduce a part-style loss to help reconstruct high-fidelity colors without introducing unpleasant artifacts. Extensive experiments demonstrate the effectiveness of the proposed method against state-of-the-art 3D human texture estimation approaches both quantitatively and qualitatively.

Overview

Overview of Texformer

The Query is a pre-computed color encoding of the UV space obtained by mapping the 3D coordinates of a standard human body mesh to the UV space. The Key is a concatenation of the input image and the 2D part-segmentation map. The Value is a concatenation of the input image and its 2D coordinates. We first feed the Query, Key, and Value into three CNNs to transform them into feature space. Then the multi-scale features are sent to the Transformer units to generate the Output features. The multi-scale Output features are processed and fused in another CNN, which produces the RGB UV map T, texture flow F, and fusion mask M. The final UV map is generated by combining T and the textures sampled with F using the fusion mask M. Note that we have skip connections between the same-resolution layers of the CNNs similar to [1] which have been omitted in the figure for brevity.

Visual Results

For each example, the image on the left is the input, and the image on the right is the rendered 3D human, where the human texture is predicted by the proposed Texformer, and the geometry is predicted by RSC-Net.

input1 input1       input1 input1

Install

  • Manage the environment with Anaconda
conda create -n texformer anaconda
conda activate texformer
  • Pytorch-1.4, CUDA-9.2
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=9.2 -c pytorch
  • Install Pytorch-neural-renderer according to the instructions here

Download

  • Download meta data, and put it in "./meta/".

  • Download pretrained model, and put it in "./pretrained".

  • We propose an enhanced Market-1501 dataset, termed as SMPLMarket, by equipping the original data of Market-1501 with SMPL estimation from RSC-Net and body part segmentation estimated by EANet. Please download the SMPLMarket dataset and put it in "./datasets/".

  • Other datasets: PRW, surreal, CUHK-SYSU. Please put these datasets in "./datasets/".

  • All the paths are set in "config.py".

Demo

Run the Texformer with human part segmentation from an off-the-shelf model:

python demo.py --img_path demo_imgs/img.png --seg_path demo_imgs/seg.png

If you don't want to run an external model for human part segmentation, you can use the human part segmentation of RSC-Net instead (note that this may affect the performance as the segmentation of RSC-Net is not very accurate due to the limitation of SMPL):

python demo.py --img_path demo_imgs/img.png

Train

Run the training code with default settings:

python trainer.py --exp_name texformer

Evaluation

Run the evaluation on the SPMLMarket dataset:

python eval.py --checkpoint_path ./pretrained/texformer_ep500.pt

References

[1] "3D Human Pose, Shape and Texture from Low-Resolution Images and Videos", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.

[2] "3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning", ECCV, 2020

[3] "SMPL: A Skinned Multi-Person Linear Model", SIGGRAPH Asia, 2015

[4] "Learning Spatial and Spatio-Temporal Pixel Aggregations for Image and Video Denoising", IEEE Transactions on Image Processing, 2020.

[5] "Learning Factorized Weight Matrix for Joint Filtering", ICML, 2020

Owner
XiangyuXu
XiangyuXu
NeurIPS-2021: Neural Auto-Curricula in Two-Player Zero-Sum Games.

NAC Official PyTorch implementation of NAC from the paper: Neural Auto-Curricula in Two-Player Zero-Sum Games. We release code for: Gradient based ora

Xidong Feng 19 Nov 11, 2022
Using deep learning to predict gene structures of the coding genes in DNA sequences of Arabidopsis thaliana

DeepGeneAnnotator: A tool to annotate the gene in the genome The master thesis of the "Using deep learning to predict gene structures of the coding ge

Ching-Tien Wang 3 Sep 09, 2022
This repository provides the official code for GeNER (an automated dataset Generation framework for NER).

GeNER This repository provides the official code for GeNER (an automated dataset Generation framework for NER). Overview of GeNER GeNER allows you to

DMIS Laboratory - Korea University 50 Nov 30, 2022
A note taker for NVDA. Allows the user to create, edit, view, manage and export notes to different formats.

Quick Notetaker add-on for NVDA The Quick Notetaker add-on is a wonderful tool which allows writing notes quickly and easily anytime and from any app

5 Dec 06, 2022
Training, generation, and analysis code for Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics

Location-Aware Generative Adversarial Networks (LAGAN) for Physics Synthesis This repository contains all the code used in L. de Oliveira (@lukedeo),

Deep Learning for HEP 57 Oct 22, 2022
A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning

Orchard Dataset This repository contains the code used for generating the Orchard Dataset, as seen in the Multi-Hierarchical Reasoning in Sequences: S

Bill Pung 1 Jun 05, 2022
Mapping Conditional Distributions for Domain Adaptation Under Generalized Target Shift

This repository contains the official code of OSTAR in "Mapping Conditional Distributions for Domain Adaptation Under Generalized Target Shift" (ICLR 2022).

Matthieu Kirchmeyer 5 Dec 06, 2022
Twin-deep neural network for semi-supervised learning of materials properties

Deep Semi-Supervised Teacher-Student Material Synthesizability Prediction Citation: Semi-supervised teacher-student deep neural network for materials

MLEG 3 Dec 14, 2022
Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets.

Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets.

beringresearch 285 Jan 04, 2023
Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu,

GEMS Lab: Graph Exploration & Mining at Scale, University of Michigan 70 Dec 18, 2022
TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning

TransZero++ This repository contains the testing code for the paper "TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning" submitted

Shiming Chen 6 Aug 16, 2022
Keyword-BERT: Keyword-Attentive Deep Semantic Matching

project discription An implementation of the Keyword-BERT model mentioned in my paper Keyword-Attentive Deep Semantic Matching (Plz cite this github r

1 Nov 14, 2021
Learning Energy-Based Models by Diffusion Recovery Likelihood

Learning Energy-Based Models by Diffusion Recovery Likelihood Ruiqi Gao, Yang Song, Ben Poole, Ying Nian Wu, Diederik P. Kingma Paper: https://arxiv.o

Ruiqi Gao 41 Nov 22, 2022
Code for ECCV 2020 paper "Contacts and Human Dynamics from Monocular Video".

Contact and Human Dynamics from Monocular Video This is the official implementation for the ECCV 2020 spotlight paper by Davis Rempe, Leonidas J. Guib

Davis Rempe 207 Jan 05, 2023
The Balloon Learning Environment - flying stratospheric balloons with deep reinforcement learning.

Balloon Learning Environment Docs The Balloon Learning Environment (BLE) is a simulator for stratospheric balloons. It is designed as a benchmark envi

Google 87 Dec 25, 2022
Multi-task Learning of Order-Consistent Causal Graphs (NeuRIPs 2021)

Multi-task Learning of Order-Consistent Causal Graphs (NeuRIPs 2021) Authors: Xinshi Chen, Haoran Sun, Caleb Ellington, Eric Xing, Le Song Link to pap

Xinshi Chen 2 Dec 20, 2021
Software for Multimodalty 2D+3D Facial Expression Recognition (FER) UI

EmotionUI Software for Multimodalty 2D+3D Facial Expression Recognition (FER) UI. demo screenshot (with RealSense) required packages Python = 3.6 num

Yang Jiao 2 Dec 23, 2021
Locally Differentially Private Distributed Deep Learning via Knowledge Distillation (LDP-DL)

Locally Differentially Private Distributed Deep Learning via Knowledge Distillation (LDP-DL) A preprint version of our paper: Link here This is a samp

Di Zhuang 3 Jan 08, 2023
An unofficial implementation of "Unpaired Image Super-Resolution using Pseudo-Supervision." CVPR2020

UnpairedSR An unofficial implementation of "Unpaired Image Super-Resolution using Pseudo-Supervision." CVPR2020 turn RCAN(modified) -- xmodel(xilinx

JiaKui Hu 10 Oct 28, 2022