How Do Adam and Training Strategies Help BNNs Optimization? In ICML 2021.

Last update: Sep 20, 2022

Overview

AdamBNN

This is the PyTorch implementation of our paper "How Do Adam and Training Strategies Help BNNs Optimization?", published in ICML 2021.

In this work, we explore the intrinsic reasons why Adam is superior to other optimizers like SGD for BNN optimization and provide analytical explanations that support specific training strategies. By visualizing the optimization trajectory, we show that the optimization lies in an extremely rugged loss landscape and that the second-order momentum in Adam is crucial to revitalize the weights that are dead due to activation saturation in BNNs. Based on this analysis, we derive a specific training scheme and achieve 70.5% top-1 accuracy on the ImageNet dataset using the same architecture as ReActNet, which is 1.1% higher than ReActNet's accuracy.
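As a rough illustration of the point about second-order momentum, the toy sketch below compares a plain SGD update with a scalar Adam update for a weight whose gradient is tiny, as it might be when an activation saturates. It is not code from this repository: the function names, learning rates, and the constant-gradient setup are assumptions chosen only to show that Adam's division by the square root of the second moment keeps the step size close to the learning rate even when the raw gradient is very small.

```python
import math

def sgd_step(grad, lr=0.1):
    """Plain SGD update: the step shrinks in proportion to the gradient."""
    return -lr * grad

def adam_steps(grads, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return the Adam updates for a sequence of scalar gradients."""
    m = 0.0  # first moment (running mean of gradients)
    v = 0.0  # second moment (running mean of squared gradients)
    updates = []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction
        v_hat = v / (1 - beta2 ** t)
        # Dividing by sqrt(v_hat) rescales small gradients up toward lr.
        updates.append(-lr * m_hat / (math.sqrt(v_hat) + eps))
    return updates

# Hypothetical gradients: a "dead" weight versus a healthy one.
tiny_updates = adam_steps([1e-6] * 5)
large_updates = adam_steps([1.0] * 5)

print("SGD step, tiny gradient: ", sgd_step(1e-6))
print("Adam step, tiny gradient:", tiny_updates[-1])
print("Adam step, large gradient:", large_updates[-1])
```

In this toy setting the Adam step for the tiny gradient has nearly the same magnitude as the step for the large gradient, while the SGD step is scaled down by the gradient itself. The paper's analysis covers the full BNN setting; this sketch only shows the normalization effect in isolation.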

Citation

If you find our code useful for your research, please consider citing:

@conference{liu2021how,
title = {How do adam and training strategies help bnns optimization?},
author = {Liu, Zechun and Shen, Zhiqiang and Li, Shichao and Helwegen, Koen and Huang, Dong and Cheng, Kwang-Ting},
booktitle = {International Conference on Machine Learning},
year = {2021},
organization={PMLR}
}

Run

1. Requirements:

Python 3, PyTorch 1.7.1, torchvision 0.8.2

2. Data:

Download the ImageNet dataset.

3. Steps to run:

(1) Step1: binarizing activations

Change directory to ./step1/
run bash run.sh

(2) Step2: binarizing weights + activations

Change directory to ./step2/
run bash run.sh
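To make the difference between the two steps concrete, here is a minimal forward-pass sketch in plain Python. It assumes sign binarization for activations and sign binarization scaled by the mean absolute value for weights, a common choice in BNN work. The actual scripts in ./step1/ and ./step2/ may differ, and all function names here are hypothetical.

```python
def sign(x):
    """Map a scalar to -1.0 or +1.0 (zero maps to +1.0 by assumption)."""
    return 1.0 if x >= 0 else -1.0

def binarize_activations(acts):
    """Step 1 idea: activations become binary, weights stay real-valued."""
    return [sign(a) for a in acts]

def binarize_weights(weights):
    """Step 2 idea: weights become binary too, scaled by mean |w| (assumption)."""
    scale = sum(abs(w) for w in weights) / len(weights)
    return [scale * sign(w) for w in weights]

def dot(ws, xs):
    """Dot product of two equal-length lists."""
    return sum(w * x for w, x in zip(ws, xs))

# Hypothetical toy values for one neuron.
weights = [0.5, -1.5, 1.0]
acts = [0.3, -0.2, -2.0]

bin_acts = binarize_activations(acts)
step1_out = dot(weights, bin_acts)                    # real weights, binary activations
step2_out = dot(binarize_weights(weights), bin_acts)  # binary weights and activations

print(bin_acts, step1_out, step2_out)
```

The sketch covers the forward computation only. Training through the sign function needs a gradient approximation, which is outside what this example shows.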

Models

Methods	Backbone	Top1-Acc	FLOPs	Trained Model
ReActNet	ReActNet-A	69.4%	0.87 x 10^8	Model-ReAct
AdamBNN	ReActNet-A	70.5%	0.87 x 10^8	Model-ReAct-AdamBNN-Training

Contact

Zechun Liu, HKUST and CMU (zliubq at connect.ust.hk / zechunl at andrew.cmu.edu)

Zhiqiang Shen, CMU (zhiqians at andrew.cmu.edu)

