Post-training Quantization for Neural Networks with Provable Guarantees

Last update: Nov 29, 2022

Related tags

Overview

Post-training Quantization for Neural Networks with Provable Guarantees

Authors: Jinjie Zhang ([email protected]), Yixuan Zhou ([email protected]) and Rayan Saab ([email protected])

Overview

This directory contains code necessary to run a post-training neural-network quantization method GPFQ, that is based on a greedy path-following mechanism. One can also use it to reproduce the experiment results in our paper "Post-training Quantization for Neural Networks with Provable Guarantees". In this paper, we also prove theoretical guarantees for the proposed method, that is, for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights – i.e., level of over-parametrization.

If you make use of this code or our quantization method in your work, please cite the following paper:

 @article{zhang2022posttraining,
     author = {Zhang, Jinjie and Zhou, Yixuan and Saab, Rayan},
     title = {Post-training Quantization for Neural Networks with Provable Guarantees},
     booktitle = {arXiv preprint arXiv:2201.11113},
     year = {2022}
   }

Note: The code is designed to work primarily with the ImageNet dataset. Due to the size of this dataset, it is likely one may need heavier computational resources than a local machine. Nevertheless, the experiments can be run, for example, using a cloud computation center, e.g. AWS. When we run this experiment, we use the m5.8xlarge EC2 instance with a disk space of 300GB.

Installing Dependencies

We assume a python version that is greater than 3.8.0 is installed in the user's machine. In the root directory of this repo, we provide a requirements.txt file for installing the python libraries that will be used in our code.

To install the necessary dependency, one can first start a virtual environment by doing the following:

python3 -m venv .venv
source .venv/bin/activate

The code above should activate a new python virtual environments.

Then one can make use of the requirements.txt by

pip3 install -r requirement.txt

This should install all the required dependencies of this project.

Obtaining ImageNet Dataset

In this project, we make use of the Imagenet dataset, in particular, we use the ILSVRC-2012 version.

To obtain the Imagenet dataset, one can submit a request through this link.

Once the dataset is obtained, place the .tar files for training set and validation set both under the data/ILSVRC2012 directory of this repo.

Then use the following procedure to unzip Imagenet dataset:

tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done
cd ..
# Extract the validation data and move images to subfolders:
tar -xvf ILSVRC2012_img_val.tar

Running Experiments

The implementation of the modified GPFQ in our paper is contained in quantization_scripts. Additionally, adhoc_quantization_scripts and retraining_scripts provide extra experiments and both of them are variants of the framework in quantization_scripts. adhoc_quantization_scripts contains heuristic modifications used to further improve the performance of GPFQ, such as bias correction, mixed precision, and unquantizing the last layer. retraining_scripts shows a quantization-aware training strategy that is designed to retrain the neural network after each layer is quantized.

In this section, we will give a guidance on running our code contained in quantization_scripts and the implementation of other two counterparts adhoc_quantization_scripts and retraining_scripts are very similar to quantization_scripts.

Before getting started, run in the root directory of the repo and run mkdir modelsto create a directory in which we will store the quantized model.
The entry point of the project starts with quantization_scripts/quantize.py. Once the file is opened, there is a section to set hyperparameters, for example, the model_name parameter, the number of bits/batch size used for quantization, the scalar of alphabets, the probability for subsampling in CNNs etc. Note that the model_name mentioned above should be the same as the model that you will quantize. After you selected a model_name and assuming you are still in the root directory of this repo, run mkdir models/{model_name}, where the {model_name} should be the python string that you provided for the model_name parameter in the quantize.py file. If the directory already exists, you can skip this step.
Then navigate to the logs directory and run python3 init_logs.py. This will prepare a log file which is used to store the results of the experiment.
Finally, open the quantization_scripts directory and run python3 quantize.py to start the experiment.

Post-training Quantization for Neural Networks with Provable Guarantees

Related tags

Overview

Post-training Quantization for Neural Networks with Provable Guarantees

Authors: Jinjie Zhang ([email protected]), Yixuan Zhou ([email protected]) and Rayan Saab ([email protected])

Overview

Installing Dependencies

Obtaining ImageNet Dataset

Running Experiments

Owner

Yixuan Zhou

Confident Semantic Ranking Loss for Part Parsing

Locally Most Powerful Bayesian Test for Out-of-Distribution Detection using Deep Generative Models

LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation

Model Zoo for AI Model Efficiency Toolkit

torchlm is aims to build a high level pipeline for face landmarks detection, it supports training, evaluating, exporting, inference(Python/C++) and 100+ data augmentations

TRIQ implementation

A hue shift helper for OBS

A GridMixup augmentation, inspired by GridMask and CutMix

Dynamic View Synthesis from Dynamic Monocular Video

A series of Python scripts to access measurements from Fluke 28X meters. Fluke IR Remote Interface required.

Neural Network Libraries

HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset (ICCV 2021)

KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control

Production First and Production Ready End-to-End Speech Recognition Toolkit

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge

Deformable DETR is an efficient and fast-converging end-to-end object detector.

deep learning for image processing including classification and object-detection etc.

Research on controller area network Intrusion Detection Systems

Code for the paper "Unsupervised Contrastive Learning of Sound Event Representations", ICASSP 2021.

Source code for "OmniPhotos: Casual 360° VR Photography"