[CVPR 2021] Teachers Do More Than Teach: Compressing Image-to-Image Models (CAT)

Overview

CAT

arXiv

Pytorch implementation of our method for compressing image-to-image models.
Teachers Do More Than Teach: Compressing Image-to-Image Models
Qing Jin1, Jian Ren2, Oliver J. Woodford, Jiazhuo Wang2, Geng Yuan1, Yanzhi Wang1, Sergey Tulyakov2
1Northeastern University, 2Snap Inc.
In CVPR 2021.

Overview

Compression And Teaching (CAT) framework for compressing image-to-image models: ① Given a pre-trained teacher generator Gt, we determine the architecture of a compressed student generator Gs by eliminating those channels with smallest magnitudes of batch norm scaling factors. ② We then distill knowledge from the pretrained teacher Gt on the student Gs via a novel distillation technique, which maximize the similarity between features of both generators, defined in terms of kernel alignment (KA).

Prerequisites

  • Linux
  • Python 3
  • CPU or NVIDIA GPU + CUDA CuDNN

Getting Started

Installation

  • Clone this repo:

    git clone [email protected]:snap-research/CAT.git
    cd CAT
  • Install PyTorch 1.7 and other dependencies (e.g., torchvision).

    • For pip users, please type the command pip install -r requirements.txt.
    • For Conda users, please create a new Conda environment using conda env create -f environment.yml.

Data Preparation

CycleGAN

Setup

  • Download the CycleGAN dataset (e.g., horse2zebra).

    bash datasets/download_cyclegan_dataset.sh horse2zebra
  • Get the statistical information for the ground-truth images for your dataset to compute FID. We provide pre-prepared real statistic information for several datasets on Google Drive Folder.

Pix2pix

Setup

  • Download the pix2pix dataset (e.g., cityscapes).

    bash datasets/download_pix2pix_dataset.sh cityscapes

Cityscapes Dataset

For the Cityscapes dataset, we cannot provide it due to license issue. Please download the dataset from https://cityscapes-dataset.com and use the script prepare_cityscapes_dataset.py to preprocess it. You need to download gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip and unzip them in the same folder. For example, you may put gtFine and leftImg8bit in database/cityscapes-origin. You need to prepare the dataset with the following commands:

python datasets/get_trainIds.py database/cityscapes-origin/gtFine/
python datasets/prepare_cityscapes_dataset.py \
--gtFine_dir database/cityscapes-origin/gtFine \
--leftImg8bit_dir database/cityscapes-origin/leftImg8bit \
--output_dir database/cityscapes \
--table_path datasets/table.txt

You will get a preprocessed dataset in database/cityscapes and a mapping table (used to compute mIoU) in dataset/table.txt.

  • Get the statistical information for the ground-truth images for your dataset to compute FID. We provide pre-prepared real statistics for several datasets. For example,

    bash datasets/download_real_stat.sh cityscapes A

Evaluation Preparation

mIoU Computation

To support mIoU computation, you need to download a pre-trained DRN model drn-d-105_ms_cityscapes.pth from http://go.yf.io/drn-cityscapes-models. By default, we put the drn model in the root directory of our repo. Then you can test our compressed models on cityscapes after you have downloaded our compressed models.

FID/KID Computation

To compute the FID/KID score, you need to get some statistical information from the groud-truth images of your dataset. We provide a script get_real_stat.py to extract statistical information. For example, for the map2arial dataset, you could run the following command:

python get_real_stat.py \
--dataroot database/map2arial \
--output_path real_stat/maps_B.npz \
--direction AtoB

For paired image-to-image translation (pix2pix and GauGAN), we calculate the FID between generated test images to real test images. For unpaired image-to-image translation (CycleGAN), we calculate the FID between generated test images to real training+test images. This allows us to use more images for a stable FID evaluation, as done in previous unconditional GANs research. The difference of the two protocols is small. The FID of our compressed CycleGAN model increases by 4 when using real test images instead of real training+test images.

KID is not supported for the cityscapes dataset.

Model Training

Teacher Training

The first step of our framework is to train a teacher model. For this purpose, please run the script train_inception_teacher.sh under the correponding folder named as the dataset, for example, run

bash scripts/cycle_gan/horse2zebra/train_inception_teacher.sh

Student Training

With the pretrained teacher model, we can determine the architecture of student model under prescribed computational budget. For this purpose, please run the script train_inception_student_XXX.sh under the correponding folder named as the dataset, where XXX stands for the computational budget (in terms of FLOPs for this case) and can be different for different datasets and models. For example, for CycleGAN with Horse2Zebra dataset, our computational budget is 2.6B FLOPs, so we run

bash scripts/cycle_gan/horse2zebra/train_inception_student_2p6B.sh

Pre-trained Models

For convenience, we also provide pretrained teacher and student models on Google Drive Folder.

Model Evaluation

With pretrained teacher and student models, we can evaluate them on the dataset. For this purpose, please run the script evaluate_inception_student_XXX.sh under the corresponding folder named as the dataset, where XXX is the computational budget (in terms of FLOPs). For example, for CycleGAN with Horse2Zebra dataset where the computational budget is 2.6B FLOPs, please run

bash scripts/cycle_gan/horse2zebra/evaluate_inception_student_2p6B.sh

Model Export

The final step is to export the trained compressed model as onnx file to run on mobile devices. For this purpose, please run the script onnx_export_inception_student_XXX.sh under the corresponding folder named as the dataset, where XXX is the computational budget (in terms of FLOPs). For example, for CycleGAN with Horse2Zebra dataset where the computational budget is 2.6B FLOPs, please run

bash scripts/cycle_gan/horse2zebra/onnx_export_inception_student_2p6B.sh

This will create one .onnx file in addition to log files.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{jin2021teachers,
  title={Teachers Do More Than Teach: Compressing Image-to-Image Models},
  author={Jin, Qing and Ren, Jian and Woodford, Oliver J and Wang, Jiazhuo and Yuan, Geng and Wang, Yanzhi and Tulyakov, Sergey},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2021}
}

Acknowledgements

Our code is developed based on AtomNAS and gan-compression.

We also thank pytorch-fid for FID computation and drn for mIoU computation.

Owner
Snap Research
Snap Research
Code accompanying our paper Feature Learning in Infinite-Width Neural Networks

Empirical Experiments in "Feature Learning in Infinite-width Neural Networks" This repo contains code to replicate our experiments (Word2Vec, MAML) in

Edward Hu 37 Dec 14, 2022
Source Code For Template-Based Named Entity Recognition Using BART

Template-Based NER Source Code For Template-Based Named Entity Recognition Using BART Training Training train.py Inference inference.py Corpus ATIS (h

174 Dec 19, 2022
Fuzzy Overclustering (FOC)

Fuzzy Overclustering (FOC) In real-world datasets, we need consistent annotations between annotators to give a certain ground-truth label. However, in

2 Nov 08, 2022
SciKit-Learn Laboratory (SKLL) makes it easy to run machine learning experiments.

SciKit-Learn Laboratory This Python package provides command-line utilities to make it easier to run machine learning experiments with scikit-learn. O

ETS 528 Nov 25, 2022
Rendering color and depth images for ShapeNet models.

Color & Depth Renderer for ShapeNet This library includes the tools for rendering multi-view color and depth images of ShapeNet models. Physically bas

Yinyu Nie 41 Dec 19, 2022
CVPR2021 Workshop - HDRUNet: Single Image HDR Reconstruction with Denoising and Dequantization.

HDRUNet [Paper Link] HDRUNet: Single Image HDR Reconstruction with Denoising and Dequantization By Xiangyu Chen, Yihao Liu, Zhengwen Zhang, Yu Qiao an

XyChen 105 Dec 20, 2022
Code for Environment Inference for Invariant Learning (ICML 2020 UDL Workshop Paper)

Environment Inference for Invariant Learning This code accompanies the paper Environment Inference for Invariant Learning, which appears at ICML 2021.

Elliot Creager 40 Dec 09, 2022
Use of Attention Gates in a Convolutional Neural Network / Medical Image Classification and Segmentation

Attention Gated Networks (Image Classification & Segmentation) Pytorch implementation of attention gates used in U-Net and VGG-16 models. The framewor

Ozan Oktay 1.6k Dec 30, 2022
An official TensorFlow implementation of “CLCC: Contrastive Learning for Color Constancy” accepted at CVPR 2021.

CLCC: Contrastive Learning for Color Constancy (CVPR 2021) Yi-Chen Lo*, Chia-Che Chang*, Hsuan-Chao Chiu, Yu-Hao Huang, Chia-Ping Chen, Yu-Lin Chang,

Yi-Chen (Howard) Lo 58 Dec 17, 2022
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

The Apache Software Foundation 20.2k Jan 05, 2023
Cross Quality LFW: A database for Analyzing Cross-Resolution Image Face Recognition in Unconstrained Environments

Cross-Quality Labeled Faces in the Wild (XQLFW) Here, we release the database, evaluation protocol and code for the following paper: Cross Quality LFW

Martin Knoche 10 Dec 12, 2022
Tensorflow Implementation of the paper "Spectral Normalization for Generative Adversarial Networks" (ICML 2017 workshop)

tf-SNDCGAN Tensorflow implementation of the paper "Spectral Normalization for Generative Adversarial Networks" (https://www.researchgate.net/publicati

Nhat M. Nguyen 248 Nov 25, 2022
Code and datasets for the paper "Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction" (RA-L, 2021)

Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction This is the code for the paper Combining E

Robotics and Perception Group 69 Dec 26, 2022
Pose estimation with MoveNet Lightning

Pose Estimation With MoveNet Lightning MoveNet is the TensorFlow pre-trained model that identifies 17 different key points of the human body. It is th

Yash Vora 2 Jan 04, 2022
Meta-Learning Sparse Implicit Neural Representations (NeurIPS 2021)

Meta-SparseINR Official PyTorch implementation of "Meta-learning Sparse Implicit Neural Representations" (NeurIPS 2021) by Jaeho Lee*, Jihoon Tack*, N

Jaeho Lee 41 Nov 10, 2022
ACL'2021: LM-BFF: Better Few-shot Fine-tuning of Language Models

LM-BFF (Better Few-shot Fine-tuning of Language Models) This is the implementation of the paper Making Pre-trained Language Models Better Few-shot Lea

Princeton Natural Language Processing 607 Jan 07, 2023
Yolact-keras实例分割模型在keras当中的实现

Yolact-keras实例分割模型在keras当中的实现 目录 性能情况 Performance 所需环境 Environment 文件下载 Download 训练步骤 How2train 预测步骤 How2predict 评估步骤 How2eval 参考资料 Reference 性能情况 训练数

Bubbliiiing 11 Dec 26, 2022
Simple tools for logging and visualizing, loading and training

TNT TNT is a library providing powerful dataloading, logging and visualization utilities for Python. It is closely integrated with PyTorch and is desi

1.5k Jan 02, 2023
Natural Posterior Network: Deep Bayesian Predictive Uncertainty for Exponential Family Distributions

Natural Posterior Network This repository provides the official implementation o

Oliver Borchert 54 Dec 06, 2022
Easy-to-use micro-wrappers for Gym and PettingZoo based RL Environments

SuperSuit introduces a collection of small functions which can wrap reinforcement learning environments to do preprocessing ('microwrappers'). We supp

Farama Foundation 357 Jan 06, 2023