Transfer Learning for Pose Estimation of Illustrated Characters

Overview

bizarre-pose-estimator

Transfer Learning for Pose Estimation of Illustrated Characters
Shuhong Chen *, Matthias Zwicker *
WACV2022
[arxiv] [video] [poster] [github]

Human pose information is a critical component in many downstream image processing tasks, such as activity recognition and motion tracking. Likewise, a pose estimator for the illustrated character domain would provide a valuable prior for assistive content creation tasks, such as reference pose retrieval and automatic character animation. But while modern data-driven techniques have substantially improved pose estimation performance on natural images, little work has been done for illustrations. In our work, we bridge this domain gap by efficiently transfer-learning from both domain-specific and task-specific source models. Additionally, we upgrade and expand an existing illustrated pose estimation dataset, and introduce two new datasets for classification and segmentation subtasks. We then apply the resultant state-of-the-art character pose estimator to solve the novel task of pose-guided illustration retrieval. All data, models, and code will be made publicly available.

download

Downloads can be found in this drive folder: wacv2022_bizarre_pose_estimator_release

  • Download bizarre_pose_models.zip and extract to the root project directory; the extracted file structure should merge with the ones in this repo.
  • Download bizarre_pose_dataset.zip and extract to ./_data. The images and annotations should be at ./_data/bizarre_pose_dataset/raw.
  • Download character_bg_seg_data.zip and extract to ./_data. Under ./_data/character_bg_seg, there are bg and fg folders. All foregrounds come from danbooru, and are indexed by the provided csv. While some backgrounds come from danbooru, we use several from jerryli27/pixiv_dataset; these are somewhat hard to download, so we provide the raw pixiv images in the zip.
  • Please refer to Gwern's Danbooru dataset to download danbooru images by ID.

Warning: While NSFW art was filtered out from these data by tag, it was not possible to manually inspect all the data for mislabeled safety ratings. Please use this data at your own risk.

setup

Make a copy of ./_env/machine_config.bashrc.template to ./_env/machine_config.bashrc, and set $PROJECT_DN to the absolute path of this repository folder. The other variables are optional.

This project requires docker with a GPU. Run these lines from the project directory to pull the image and enter a container; note these are bash scripts inside the ./make folder, not make commands. Alternatively, you can build the docker image yourself.

make/docker_pull
make/shell_docker
# OR
make/docker_build
make/shell_docker

danbooru tagging

The danbooru subset used to train the tagger and custom tag rulebook can be found under ./_data/danbooru/_filters. Run this line to tag a sample image:

python3 -m _scripts.danbooru_tagger ./_samples/megumin.png

character background segmentation

Run this line to segment a sample image and extract the bounding box:

python3 -m _scripts.character_segmenter ./_samples/megumin.png

pose estimation

There are several models available in ./_train/character_pose_estim/runs, corresponding to our models at the top of Table 1 in the paper. Run this line to estimate the pose of a sample image, using one of those models:

python3 -m _scripts.pose_estimator \
    ./_samples/megumin.png \
    ./_train/character_pose_estim/runs/feat_concat+data.ckpt

pose-based retrieval

Run this line to estimate the pose of a sample image, and get links to danbooru posts with similar poses:

python3 -m _scripts.pose_retrieval ./_samples/megumin.png

faq

  • Does this work for multiple characters in an image, or images that aren't full-body? Sorry but no, this project is focused just on single full-body characters; however we may release our instance-based models separately.
  • Can I do this without docker? Please use docker, it is very good. If you can't use docker, you can try to replicate the environment from ./_env/Dockerfile, but this is untested.
  • What does bn mean in the files/code? It's sort for "basename", or an ID for a single data sample.
  • What is the sauce for the artwork in ./_samples? Full artist attributions are in the supplementary of our paper, Tables 2 and 3; the retrieval figure is the first two rows of Fig. 2, and Megumin is entry (1,0) of Fig. 3.
  • Which part is best? Part 4.
Owner
Shuhong Chen
Shuhong Chen
Official Pytorch and JAX implementation of "Efficient-VDVAE: Less is more"

The Official Pytorch and JAX implementation of "Efficient-VDVAE: Less is more" Arxiv preprint Louay Hazami   ·   Rayhane Mama   ·   Ragavan Thurairatn

Rayhane Mama 144 Dec 23, 2022
Using a Seq2Seq RNN architecture via TensorFlow to predict future Bitcoin prices

Recurrent Bitcoin Network A Data Science Thesis Project About This repository contains the source code for implementing Bitcoin price prediciton using

Frizu 6 Sep 08, 2022
DiSECt: Differentiable Simulator for Robotic Cutting

DiSECt: Differentiable Simulator for Robotic Cutting Website | Paper | Dataset | Video | Blog post DiSECt is a simulator for the cutting of deformable

NVIDIA Research Projects 73 Oct 29, 2022
Codes for the compilation and visualization examples to the HIF vegetation dataset

High-impedance vegetation fault dataset This repository contains the codes that compile the "Vegetation Conduction Ignition Test Report" data, which a

1 Dec 12, 2021
[CVPRW 21] "BNN - BN = ? Training Binary Neural Networks without Batch Normalization", Tianlong Chen, Zhenyu Zhang, Xu Ouyang, Zechun Liu, Zhiqiang Shen, Zhangyang Wang

BNN - BN = ? Training Binary Neural Networks without Batch Normalization Codes for this paper BNN - BN = ? Training Binary Neural Networks without Bat

VITA 40 Dec 30, 2022
A simple interface for editing natural photos with generative neural networks.

Neural Photo Editor A simple interface for editing natural photos with generative neural networks. This repository contains code for the paper "Neural

Andy Brock 2.1k Dec 29, 2022
Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

Welcome to AirSim AirSim is a simulator for drones, cars and more, built on Unreal Engine (we now also have an experimental Unity release). It is open

Microsoft 13.8k Jan 05, 2023
LeViT a Vision Transformer in ConvNet's Clothing for Faster Inference

LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference This repository contains PyTorch evaluation code, training code and pretrained

Facebook Research 504 Jan 02, 2023
Pixel Consensus Voting for Panoptic Segmentation (CVPR 2020)

Implementation for Pixel Consensus Voting (CVPR 2020). This codebase contains the essential ingredients of PCV, including various spatial discretizati

Haochen 23 Oct 25, 2022
MMRazor: a model compression toolkit for model slimming and AutoML

Documentation: https://mmrazor.readthedocs.io/ English | 简体中文 Introduction MMRazor is a model compression toolkit for model slimming and AutoML, which

OpenMMLab 899 Jan 02, 2023
Trained on Simulated Data, Tested in the Real World

Trained on Simulated Data, Tested in the Real World

livox 43 Nov 18, 2022
Companion code for the paper "An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their Asymptotic Overconfidence" (NeurIPS 2021)

ReLU-GP Residual (RGPR) This repository contains code for reproducing the following NeurIPS 2021 paper: @inproceedings{kristiadi2021infinite, title=

Agustinus Kristiadi 4 Dec 26, 2021
The goal of the exercises below is to evaluate the candidate knowledge and problem solving expertise regarding the main development focuses for the iFood ML Platform team: MLOps and Feature Store development.

The goal of the exercises below is to evaluate the candidate knowledge and problem solving expertise regarding the main development focuses for the iFood ML Platform team: MLOps and Feature Store dev

George Rocha 0 Feb 03, 2022
PyTorch implementation of DCT fast weight RNNs

DCT based fast weights This repository contains the official code for the paper: Training and Generating Neural Networks in Compressed Weight Space. T

Kazuki Irie 4 Dec 24, 2022
Mae segmentation - Reproduction of semantic segmentation using masked autoencoder (mae)

ADE20k Semantic segmentation with MAE Getting started Install the mmsegmentation

97 Dec 17, 2022
A novel benchmark dataset for Monocular Layout prediction

AutoLay AutoLay: Benchmarking Monocular Layout Estimation Kaustubh Mani, N. Sai Shankar, J. Krishna Murthy, and K. Madhava Krishna Abstract In this pa

Kaustubh Mani 39 Apr 26, 2022
Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees"

Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees" Installa

0 Oct 13, 2021
Generative Models for Graph-Based Protein Design

Graph-Based Protein Design This repo contains code for Generative Models for Graph-Based Protein Design by John Ingraham, Vikas Garg, Regina Barzilay

John Ingraham 159 Dec 15, 2022
code for paper -- "Seamless Satellite-image Synthesis"

Seamless Satellite-image Synthesis by Jialin Zhu and Tom Kelly. Project site. The code of our models borrows heavily from the BicycleGAN repository an

Light 14 Apr 05, 2022
Deep Learning Specialization by Andrew Ng, deeplearning.ai.

Deep Learning Specialization on Coursera Master Deep Learning, and Break into AI This is my personal projects for the course. The course covers deep l

Engen 1.5k Jan 07, 2023