Model Zoo for AI Model Efficiency Toolkit

Overview

Qualcomm Innovation Center, Inc.

Model Zoo for AI Model Efficiency Toolkit

We provide a collection of popular neural network models and compare their floating point and quantized performance. Results demonstrate that quantized models can provide good accuracy, comparable to floating point models. Together with results, we also provide recipes for users to quantize floating-point models using the AI Model Efficiency ToolKit (AIMET).

Table of Contents

Introduction

Quantized inference is significantly faster than floating-point inference, and enables models to run in a power-efficient manner on mobile and edge devices. We use AIMET, a library that includes state-of-the-art techniques for quantization, to quantize various models available in TensorFlow and PyTorch frameworks. The list of models is provided in the sections below.

An original FP32 source model is quantized either using post-training quantization (PTQ) or Quantization-Aware-Training (QAT) technique available in AIMET. Example scripts for evaluation are provided for each model. When PTQ is needed, the evaluation script performs PTQ before evaluation. Wherever QAT is used, the fine-tuned model checkpoint is also provided.

Tensorflow Models

Model Zoo

Network Model Source [1] Floating Pt (FP32) Model [2] Quantized Model [3] Results [4] Documentation
ResNet-50 (v1) GitHub Repo Pretrained Model See Documentation (ImageNet) Top-1 Accuracy
FP32: 75.21%
INT8: 74.96%
ResNet50.md
MobileNet-v2-1.4 GitHub Repo Pretrained Model Quantized Model (ImageNet) Top-1 Accuracy
FP32: 75%
INT8: 74.21%
MobileNetV2.md
EfficientNet Lite GitHub Repo Pretrained Model Quantized Model (ImageNet) Top-1 Accuracy
FP32: 74.93%
INT8: 74.99%
EfficientNetLite.md
SSD MobileNet-v2 GitHub Repo Pretrained Model See Example (COCO) Mean Avg. Precision (mAP)
FP32: 0.2469
INT8: 0.2456
SSDMobileNetV2.md
RetinaNet GitHub Repo Pretrained Model See Example (COCO) mAP
FP32: 0.35
INT8: 0.349
Detailed Results
RetinaNet.md
Pose Estimation Based on Ref. Based on Ref. Quantized Model (COCO) mAP
FP32: 0.383
INT8: 0.379,
Mean Avg.Recall (mAR)
FP32: 0.452
INT8: 0.446
PoseEstimation.md
SRGAN GitHub Repo Pretrained Model See Example (BSD100) PSNR/SSIM
FP32: 25.45/0.668
INT8: 24.78/0.628
INT8W/INT16Act.: 25.41/0.666
Detailed Results
SRGAN.md

[1] Original FP32 model source
[2] FP32 model checkpoint
[3] Quantized Model: For models quantized with post-training technique, refers to FP32 model which can then be quantized using AIMET. For models optimized with QAT, refers to model checkpoint with fine-tuned weights. 8-bit weights and activations are typically used. For some models, 8-bit weights and 16-bit activations (INT8W/INT16Act.) are used to further improve performance of post-training quantization.
[4] Results comparing float and quantized performance
[5] Script for quantized evaluation using the model referenced in “Quantized Model” column

Detailed Results

RetinaNet

(COCO dataset)

Average Precision/Recall @[ IoU | area | maxDets] FP32 INT8
Average Precision @[ 0.50:0.95 | all | 100 ] 0.350 0.349
Average Precision @[ 0.50 | all | 100 ] 0.537 0.536
Average Precision @[ 0.75 | all | 100 ] 0.374 0.372
Average Precision @[ 0.50:0.95 | small | 100 ] 0.191 0.187
Average Precision @[ 0.50:0.95 | medium | 100 ] 0.383 0.381
Average Precision @[ 0.50:0.95 | large | 100 ] 0.472 0.472
Average Recall @[ 0.50:0.95 | all | 1 ] 0.306 0.305
Average Recall @[0.50:0.95 | all | 10 ] 0.491 0.490
Average Recall @[ 0.50:0.95 | all |100 ] 0.533 0.532
Average Recall @[ 0.50:0.95 | small | 100 ] 0.345 0.341
Average Recall @[ 0.50:0.95 | medium | 100 ] 0.577 0.577
Average Recall @[ 0.50:0.95 | large | 100 ] 0.681 0.679

SRGAN

Model Dataset PSNR SSIM
FP32 Set5/Set14/BSD100 29.17/26.17/25.45 0.853/0.719/0.668
INT8/ACT8 Set5/Set14/BSD100 28.31/25.55/24.78 0.821/0.684/0.628
INT8/ACT16 Set5/Set14/BSD100 29.12/26.15/25.41 0.851/0.719/0.666

PyTorch Models

Model Zoo

Network Model Source [1] Floating Pt (FP32) Model [2] Quantized Model [3] Results [4] Documentation
MobileNetV2 GitHub Repo Pretrained Model Quantized Model (ImageNet) Top-1 Accuracy
FP32: 71.67%
INT8: 71.14%
MobileNetV2.md
EfficientNet-lite0 GitHub Repo Pretrained Model Quantized Model (ImageNet) Top-1 Accuracy
FP32: 75.42%
INT8: 74.44%
EfficientNet-lite0.md
DeepLabV3+ GitHub Repo Pretrained Model Quantized Model (PascalVOC) mIOU
FP32: 72.62%
INT8: 72.22%
DeepLabV3.md
MobileNetV2-SSD-Lite GitHub Repo Pretrained Model Quantized Model (PascalVOC) mAP
FP32: 68.7%
INT8: 68.6%
MobileNetV2-SSD-lite.md
Pose Estimation Based on Ref. Based on Ref. Quantized Model (COCO) mAP
FP32: 0.364
INT8: 0.359
mAR
FP32: 0.436
INT8: 0.432
PoseEstimation.md
SRGAN GitHub Repo Pretrained Model (older version from here) See Example (BSD100) PSNR/SSIM
FP32: 25.51/0.653
INT8: 25.5/0.648
Detailed Results
SRGAN.md
DeepSpeech2 GitHub Repo Pretrained Model See Example (Librispeech Test Clean) WER
FP32
9.92%
INT8: 10.22%
DeepSpeech2.md

[1] Original FP32 model source
[2] FP32 model checkpoint
[3] Quantized Model: For models quantized with post-training technique, refers to FP32 model which can then be quantized using AIMET. For models optimized with QAT, refers to model checkpoint with fine-tuned weights. 8-bit weights and activations are typically used. For some models, 8-bit weights and 16-bit weights are used to further improve performance of post-training quantization.
[4] Results comparing float and quantized performance
[5] Script for quantized evaluation using the model referenced in “Quantized Model” column

Detailed Results

SRGAN Pytorch

Model Dataset PSNR SSIM
FP32 Set5/Set14/BSD100 29.93/26.58/25.51 0.851/0.709/0.653
INT8 Set5/Set14/BSD100 29.86/26.59/25.55 0.845/0.705/0.648

Examples

Install AIMET

Before you can run the example script for a specific model, you need to install the AI Model Efficiency ToolKit (AIMET) software. Please see this Getting Started page for an overview. Then install AIMET and its dependencies using these Installation instructions.

NOTE: To obtain the exact version of AIMET software that was used to test this model zoo, please install release 1.13.0 when following the above instructions.

Running the scripts

Download the necessary datasets and code required to run the example for the model of interest. The examples run quantized evaluation and if necessary apply AIMET techniques to improve quantized model performance. They generate the final accuracy results noted in the table above. Refer to the Docs for TensorFlow or PyTorch folder to access the documentation and procedures for a specific model.

Team

AIMET Model Zoo is a project maintained by Qualcomm Innovation Center, Inc.

License

Please see the LICENSE file for details.

Comments
  • Added PyTorch FFNet model, added INT4 to several models

    Added PyTorch FFNet model, added INT4 to several models

    Added the following new model: PyTorch FFNet Added INT4 quantization support to the following models:

    • Pytorch Classification (regnet_x_3_2gf, resnet18, resnet50)
    • PyTorch HRNet Posenet
    • PyTorch HRNet
    • PyTorch EfficientNet Lite0
    • PyTorch DeeplabV3-MobileNetV2

    Signed-off-by: Bharath Ramaswamy [email protected]

    opened by quic-bharathr 0
  • Added TensorFlow ModuleDet-EdgeTPU and PyToch InverseForm models

    Added TensorFlow ModuleDet-EdgeTPU and PyToch InverseForm models

    Added two new models - TensorFlow ModuleDet-EdgeTPU and PyToch InverseForm models Fixed TF version for 2 models in README file Minor updates to Tensorflow EfficientNet Lite-0 doc and PyTorch ssd_mobilenetv2 script

    Signed-off-by: Bharath Ramaswamy [email protected]

    opened by quic-bharathr 0
  • Updated post estimation evaluation code and documentation for updated…

    Updated post estimation evaluation code and documentation for updated…

    … model .pth file with weights state-dict Fixed model loading problem by including model definition in pose_estimation_quanteval.py Add Quantizer Op Assumptions to Pose Estimation document

    Signed-off-by: Bharath Ramaswamy [email protected]

    opened by quic-bharathr 0
  • error when run the pose estimation example

    error when run the pose estimation example

    $ python3.6 pose_estimation_quanteval.py pe_weights.pth ./data/

    2022-05-24 22:37:22,500 - root - INFO - AIMET defining network with shared weights Traceback (most recent call last): File "pose_estimation_quanteval.py", line 700, in pose_estimation_quanteval(args) File "pose_estimation_quanteval.py", line 687, in pose_estimation_quanteval sim = quantsim.QuantizationSimModel(model, dummy_input=(1, 3, 128, 128), quant_scheme=args.quant_scheme) File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/quantsim.py", line 157, in init self.connected_graph = ConnectedGraph(self.model, dummy_input) File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/meta/connectedgraph.py", line 132, in init self._construct_graph(model, model_input) File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/meta/connectedgraph.py", line 254, in _construct_graph module_tensor_shapes_map = ConnectedGraph._generate_module_tensor_shapes_lookup_table(model, model_input) File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/meta/connectedgraph.py", line 244, in _generate_module_tensor_shapes_lookup_table run_hook_for_layers_with_given_input(model, model_input, forward_hook, leaf_node_only=False) File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/utils.py", line 277, in run_hook_for_layers_with_given_input _ = model(*input_tensor) File "/home/jlchen/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1071, in _call_impl result = forward_call(*input, **kwargs) TypeError: forward() takes 2 positional arguments but 5 were given

    opened by sundyCoder 0
  • I try to quantize deepspeech demo,but error happend

    I try to quantize deepspeech demo,but error happend

    ImportError: /home/mi/anaconda3/envs/aimet/lib/python3.7/site-packages/aimet_common/x86_64-linux-gnu/aimet_tensor_quantizer-0.0.0-py3.7-linux-x86_64.egg/AimetTensorQuantizer.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor8data_ptrIfEEPT_v

    platform:Ubuntu 18.04 GPU: nvidia 2070 CUDA:11.1 pytorch python:3.7

    opened by fmbao 0
  • Request for the MobileNet-V1-1.0 quantized (INT8) model.

    Request for the MobileNet-V1-1.0 quantized (INT8) model.

    Thank you for sharing these valuable models. I'd like to evaluate and look into the 'MobileNet-v1-1.0' model quantized by the DFQ. I'd appreciate it if you could provide the quantized MobileNet-v1-1.0 model either in TF or in PyTorch.

    opened by yschoi-dev 0
  • What's the runtime and AI Framework for DeepSpeech2?

    What's the runtime and AI Framework for DeepSpeech2?

    For DeepSpeech2, may I know what's the runtime for it's quantized (INT8 ) model, Hexagan DSP, NPU or others? And what's the AI framework, SNPE, Hexagan NN or others? Thanks~

    opened by sunfangxun 0
  • Unable to replicate DeepLabV3 Pytorch Tutorial numbers

    Unable to replicate DeepLabV3 Pytorch Tutorial numbers

    I've been working through the DeepLabV3 Pytorch tutorial, which can be founded here: https://github.com/quic/aimet-model-zoo/blob/develop/zoo_torch/Docs/DeepLabV3.md.

    However, when running the evaluation script using optimized checkpoint, I am unable to replicate the mIOU result that was listed in the table. The number that I got was 0.67 while the number reported by Qualcomm was 0.72. I was wondering if anyone have had this issue before and how to resolve it ?

    opened by LLNLanLeN 3
Releases(repo_restructured_1)
Owner
Qualcomm Innovation Center
Qualcomm Innovation Center
NLU Dataset Diagnostics

NLU Dataset Diagnostics This repository contains data and scripts to reproduce the results from our paper: Aarne Talman, Marianna Apidianaki, Stergios

Language Technology at the University of Helsinki 1 Jul 20, 2022
My personal Home Assistant configuration.

About This is my personal Home Assistant configuration. My guiding princile is to have full local control of all my devices. I intend everything to ru

Chris Turra 13 Jun 07, 2022
Reinforcement learning library(framework) designed for PyTorch, implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...

Automatic, Readable, Reusable, Extendable Machin is a reinforcement library designed for pytorch. Build status Platform Status Linux Windows Supported

Iffi 348 Dec 24, 2022
Wav2Vec for speech recognition, classification, and audio classification

Soxan در زبان پارسی به نام سخن This repository consists of models, scripts, and notebooks that help you to use all the benefits of Wav2Vec 2.0 in your

Mehrdad Farahani 140 Dec 15, 2022
Code for our paper Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation

CorDA Code for our paper Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation Prerequisite Please create and activate the follo

Qin Wang 60 Nov 30, 2022
上海交通大学全自动抢课脚本,支持准点开抢与抢课后持续捡漏两种模式。2021/06/08更新。

Welcome to Course-Bullying-in-SJTU-v3.1! 2021/6/8 紧急更新v3.1 更新说明 为了更好地保护用户隐私,将原来用户名+密码的登录方式改为微信扫二维码+cookie登录方式,不再需要配置使用pytesseract。在使用扫码登录模式时,请稍等,二维码将马

87 Sep 13, 2022
Semi Supervised Learning for Medical Image Segmentation, a collection of literature reviews and code implementations.

Semi-supervised-learning-for-medical-image-segmentation. Recently, semi-supervised image segmentation has become a hot topic in medical image computin

Healthcare Intelligence Laboratory 1.3k Jan 03, 2023
Angular & Electron desktop UI framework. Angular components for native looking and behaving macOS desktop UI (Electron/Web)

Angular Desktop UI This is a collection for native desktop like user interface components in Angular, especially useful for Electron apps. It starts w

Marc J. Schmidt 49 Dec 22, 2022
An OpenAI Gym environment for Super Mario Bros

gym-super-mario-bros An OpenAI Gym environment for Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on The Nintendo Entertainment System (NES) us

Andrew Stelmach 1 Jan 05, 2022
Implementation of Shape Generation and Completion Through Point-Voxel Diffusion

Shape Generation and Completion Through Point-Voxel Diffusion Project | Paper Implementation of Shape Generation and Completion Through Point-Voxel Di

Linqi Zhou 103 Dec 29, 2022
Paddle implementation for "Highly Efficient Knowledge Graph Embedding Learning with Closed-Form Orthogonal Procrustes Analysis" (NAACL 2021)

ProcrustEs-KGE Paddle implementation for Highly Efficient Knowledge Graph Embedding Learning with Orthogonal Procrustes Analysis 🙈 A more detailed re

Lincedo Lab 4 Jun 09, 2021
audioLIME: Listenable Explanations Using Source Separation

audioLIME This repository contains the Python package audioLIME, a tool for creating listenable explanations for machine learning models in music info

Institute of Computational Perception 27 Dec 01, 2022
Code for You Only Cut Once: Boosting Data Augmentation with a Single Cut

You Only Cut Once (YOCO) YOCO is a simple method/strategy of performing augmenta

88 Dec 28, 2022
Resources related to our paper "CLIN-X: pre-trained language models and a study on cross-task transfer for concept extraction in the clinical domain"

CLIN-X (CLIN-X-ES) & (CLIN-X-EN) This repository holds the companion code for the system reported in the paper: "CLIN-X: pre-trained language models a

Bosch Research 4 Dec 05, 2022
PyTorch code of "SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks"

SLAPS-GNN This repo contains the implementation of the model proposed in SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks

60 Dec 22, 2022
Repository of continual learning papers

Continual learning paper repository This repository contains an incomplete (but dynamically updated) list of papers exploring continual learning in ma

29 Jan 05, 2023
Ludwig Benchmarking Toolkit

Ludwig Benchmarking Toolkit The Ludwig Benchmarking Toolkit is a personalized benchmarking toolkit for running end-to-end benchmark studies across an

HazyResearch 17 Nov 18, 2022
A distributed, plug-n-play algorithm for multi-robot applications with a priori non-computable objective functions

A distributed, plug-n-play algorithm for multi-robot applications with a priori non-computable objective functions Kapoutsis, A.C., Chatzichristofis,

Athanasios Ch. Kapoutsis 5 Oct 15, 2022
Python-based Informatics Kit for Analysing Chemical Units

INSTALLATION Python-based Informatics Kit for the Analysis of Chemical Units Step 1: Make a conda environment: conda create -n pikachu python=3.9 cond

47 Dec 23, 2022
paper list in the area of reinforcenment learning for recommendation systems

paper list in the area of reinforcenment learning for recommendation systems

HenryZhao 23 Jun 09, 2022