CVPR2021 Content-Aware GAN Compression

Overview

Content-Aware GAN Compression [ArXiv]

Paper accepted to CVPR2021.

@inproceedings{liu2021content,
  title     = {Content-Aware GAN Compression},
  author    = {Liu, Yuchen and Shu, Zhixin and Li, Yijun and Lin, Zhe and Perazzi, Federico and Kung, S.Y.},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021},
}

Overview

We propose a novel content-aware approach for GAN compression. With content-awareness, our 11x-accelerated GAN performs comparably with the full-size model on image generation and image editing.

Image Generation

We show an example above on the generative ability of our 11x-accelerated generator vs. the full-size one. In particular, our model generates the interested contents visually comparable to the full-size model.

Image Editing

We show an example typifying the effectiveness of our compressed StyleGAN2 for image style-mixing and morphing above. When we mix middle styles from B, the original full-size model has a significant identity loss, while our approach better preserves the person’s identity. We also observe that our morphed images have a smoother expression transition compared the full-size model in the beard, substantiating our advantage in latent space smoothness.

We provide an additional example above.

Methodology

In our work, we make the first attempt to bring content awareness into channel pruning and knowledge distillation.

Specifically, we leverage a content-parsing network to identify contents of interest (COI), a set of spatial locations with salient semantic concepts, within the generated images. We design a content-aware pruning metric (with a forward and backward path) to remove channels that are least sensitive to COI in the generated images. For knowledge distillation, we focus our distillation region only to COI of the teacher’s outputs which further enhances target contents’ distillation.

Usage

Prerequisite

We have tested our codes under the following environments:

python == 3.6.5
pytorch == 1.6.0
torchvision == 0.7.0
CUDA == 10.2

Pretrained Full-Size Generator Checkpoint

To start with, you can first download a full-size generator checkpoint from:

256px StyleGAN2

1024px StyleGAN2

and place it under the folder ./Model/full_size_model/.

Pruning

Once you get the full-size checkpoint, you can prune the generator by:

python3 prune.py \
	--generated_img_size=256 \
	--ckpt=/path/to/full/size/model/ \
	--remove_ratio=0.7 \
	--info_print

We adopt a uniform channel pruning ratio for every layer. Above procedure will remove 70% of channels from the generator in each layer. The pruned checkpoint will be saved at ./Model/pruned_model/.

Retraining

We then retrain the pruned generator by:

python3 train.py \
	--size=256 \
	--path=/path/to/ffhq/data/folder/ \
	--ckpt=/path/to/pruned/model/ \
	--teacher_ckpt=/path/to/full/size/model/ \
	--iter=450001 \
	--batch_size=16

You may adjust the variables gpu_device_ids and primary_device for the GPU setup in train_hyperparams.py.

Training Log

The time for retraining 11x-compressed models on V100 GPUs:

Model Batch Size Iterations # GPUs Time (Hour)
256px StyleGAN2 16 450k 2 131
1024px StyleGAN2 16 450k 4 251

A typical training curve for the 11x-compressed 256px StyleGAN2:

Evaluation

To evaluate the model quantitatively, we provide get_fid.py and get_ppl.py to get model's FID and PPL sores.

FID Evaluation:

python3 get_fid.py \
	--generated_img_size=256 \
	--ckpt=/path/to/model/ \
	--n_sample=50000 \
	--batch_size=64 \
	--info_print

PPL Evaluation:

python3 get_ppl.py \
	--generated_img_size=256 \
	--ckpt=/path/to/model/ \
	--n_sample=5000 \
	--eps=1e-4 \
	--info_print

We also provide an image projector which return a (real image, projected image) pair in Image_Projection_Visualization.png as well as the PSNR and LPIPS score between this pair:

python3 get_projected_image.py \
	--generated_img_size=256 \
	--ckpt=/path/to/model/ \
	--image_file=/path/to/an/RGB/image/ \
	--num_iters=800 \
	--info_print

An example of Image_Projection_Visualization.png projected by a full-size 256px StyleGAN2:

Helen-Set55

We provide the Helen-Set55 on Google Drive.

11x-Accelerated Generator Checkpoint

We provide the following checkpoints of our content-aware compressed StyleGAN2:

Compressed 256px StyleGAN2

Compressed 1024px StyleGAN2

Acknowledgement

PyTorch StyleGAN2: https://github.com/rosinality/stylegan2-pytorch

Face Parsing BiSeNet: https://github.com/zllrunning/face-parsing.PyTorch

Fréchet Inception Distance: https://github.com/mseitzer/pytorch-fid

Learned Perceptual Image Patch Similarity: https://github.com/richzhang/PerceptualSimilarity

Owner
Yuchen Liu, Ph.D. Candidate at Princeton University
Unofficial implementation of Point-Unet: A Context-Aware Point-Based Neural Network for Volumetric Segmentation

Point-Unet This is an unofficial implementation of the MICCAI 2021 paper Point-Unet: A Context-Aware Point-Based Neural Network for Volumetric Segment

Namt0d 9 Dec 07, 2022
RLHive: a framework designed to facilitate research in reinforcement learning.

RLHive is a framework designed to facilitate research in reinforcement learning. It provides the components necessary to run a full RL experiment, for both single agent and multi agent environments.

88 Jan 05, 2023
Improving 3D Object Detection with Channel-wise Transformer

"Improving 3D Object Detection with Channel-wise Transformer" Thanks for the OpenPCDet, this implementation of the CT3D is mainly based on the pcdet v

Hualian Sheng 107 Dec 20, 2022
Lama-cleaner: Image inpainting tool powered by LaMa

Lama-cleaner: Image inpainting tool powered by LaMa

Qing 5.8k Jan 05, 2023
FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

Detectron is deprecated. Please see detectron2, a ground-up rewrite of Detectron in PyTorch. Detectron Detectron is Facebook AI Research's software sy

Facebook Research 25.5k Jan 07, 2023
deep-prae

Deep Probabilistic Accelerated Evaluation (Deep-PrAE) Our work presents an efficient rare event simulation methodology for black box autonomy using Im

Safe AI Lab 4 Apr 17, 2021
AirCode: A Robust Object Encoding Method

AirCode This repo contains source codes for the arXiv preprint "AirCode: A Robust Object Encoding Method" Demo Object matching comparison when the obj

Chen Wang 30 Dec 09, 2022
Official PyTorch implementation of the paper "Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory (SB-FBSDE)"

Official PyTorch implementation of the paper "Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory (SB-FBSDE)" which introduces a new class of deep generative models that gene

Guan-Horng Liu 43 Jan 03, 2023
An educational tool to introduce AI planning concepts using mobile manipulator robots.

JEDAI Explains Decision-Making AI Virtual Machine Image The recommended way of using JEDAI is to use pre-configured Virtual Machine image that is avai

Autonomous Agents and Intelligent Robots 13 Nov 15, 2022
MetaBalance: High-Performance Neural Networks for Class-Imbalanced Data

This repository is the official PyTorch implementation of Meta-Balance. Find the paper on arxiv MetaBalance: High-Performance Neural Networks for Clas

Arpit Bansal 20 Oct 18, 2021
Streamlit app demonstrating an image browser for the Udacity self-driving-car dataset with realtime object detection using YOLO.

Streamlit Demo: The Udacity Self-driving Car Image Browser This project demonstrates the Udacity self-driving-car dataset and YOLO object detection in

Streamlit 992 Jan 04, 2023
LineBoard - Python+React+MySQL-白板即時系統改善人群行為

LineBoard-白板即時系統改善人群行為 即時顯示實驗室的使用狀況,並遠端預約排隊,以此來改善人們的工作效率 程式架構 運作流程 使用者先至該實驗室網站預約

Bo-Jyun Huang 1 Feb 22, 2022
[2021][ICCV][FSNet] Full-Duplex Strategy for Video Object Segmentation

Full-Duplex Strategy for Video Object Segmentation (ICCV, 2021) Authors: Ge-Peng Ji, Keren Fu, Zhe Wu, Deng-Ping Fan*, Jianbing Shen, & Ling Shao This

Daniel-Ji 55 Dec 22, 2022
DeepRec is a recommendation engine based on TensorFlow.

DeepRec Introduction DeepRec is a recommendation engine based on TensorFlow 1.15, Intel-TensorFlow and NVIDIA-TensorFlow. Background Sparse model is a

Alibaba 676 Jan 03, 2023
Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

Multi-Task Framework for Cross-Lingual Abstractive Summarization (MCLAS) The code for ACL2021 paper Cross-Lingual Abstractive Summarization with Limit

Yu Bai 43 Nov 07, 2022
Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechanism for Generalized Face Presentation Attack Detection

LMFD-PAD Note This is the official repository of the paper: LMFD-PAD: Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechani

28 Dec 02, 2022
Re-implememtation of MAE (Masked Autoencoders Are Scalable Vision Learners) using PyTorch.

mae-repo PyTorch re-implememtation of "masked autoencoders are scalable vision learners". In this repo, it heavily borrows codes from codebase https:/

Peng Qiao 1 Dec 14, 2021
[CVPR'22] Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast

wseg Overview The Pytorch implementation of Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast. [arXiv] Though image-level weakly

Ye Du 96 Dec 30, 2022
Python scripts form performing stereo depth estimation using the HITNET model in ONNX.

ONNX-HITNET-Stereo-Depth-estimation Python scripts form performing stereo depth estimation using the HITNET model in ONNX. Stereo depth estimation on

Ibai Gorordo 30 Nov 08, 2022
Repository of the paper Compressing Sensor Data for Remote Assistance of Autonomous Vehicles using Deep Generative Models at ML4AD @ NeurIPS 2021.

Compressing Sensor Data for Remote Assistance of Autonomous Vehicles using Deep Generative Models Code and supplementary materials Repository of the p

Daniel Bogdoll 4 Jul 13, 2022