Structured Data Gradient Pruning (SDGP)

Related tags

Deep Learningsdgp
Overview

Structured Data Gradient Pruning (SDGP)

Weight pruning is a technique to make Deep Neural Network (DNN) inference more computationally efficient by reducing the number of model parameters over the course of training. However, most weight pruning techniques generally does not speed up DNN training and can even require more iterations to reach model convergence. In this work, we propose a novel Structured Data Gradient Pruning (SDGP) method that can speed up training without impacting model convergence. This approach enforces a specific sparsity structure, where only N out of every M elements in a matrix can be nonzero, making it amenable to hardware acceleration. Modern accelerators such as the Nvidia A100 GPU support this type of structured sparsity for 2 nonzeros per 4 elements in a reduction. Assuming hardware support for 2:4 sparsity, our approach can achieve a 15-25% reduction in total training time without significant impact to performance.

Implementation Details

Check out sdgp.py for details on how the data gradients are pruned during backpropagation. To make the pruning more efficient under group-level sorting, we implemented our own CUDA kernel. This is tested only with CUDA 11.3 and PyTorch 1.10.2 using Python 3.9.

Training Configuration

Training generally follows the configuration details in the excellent ffcv library. To fit ImageNet in a system with 256 GB of RAM using the ffcv data loader, we decreased the image size and other settings from (500, 0.5, 90) which takes 337GB to (448, 0.60, 90) which takes 229GB. We did not observe any decrease in performance comapared to the results posted in the ffcv repository on either ResNet-18 or ResNet-50 using these slightly smaller images.

CIFAR-10

SDGP Prune Function Non zeros Group size Top-1 Acc. Config Checkpoint
None (dense) 4 4 95.3 link link
Random 2 4 94.5 link link
Magnitude 2 4 95.2 link link
Rescale Mag. 1 4 95.1 link link
Rescale Mag. 2 4 95.2 link link
Rescale Mag. 1 8 94.7 link link
Rescale Mag. 2 8 95.1 link link
Rescale Mag. 4 8 95.2 link link
Rescale Mag. 2 16 95.1 link link
Rescale Mag. 4 16 95.2 link link
Rescale Mag. 8 16 95.2 link link
Rescale Mag. 4 32 94.9 link link
Rescale Mag. 8 32 95.3 link link
Rescale Mag. 16 32 95.3 link link

ImageNet

Model SDGP Prune Function Non zeros Group size Top-1 Acc. Config Checkpoint
ResNet-18 None (dense) 4 4 71.4 link link
ResNet-18 Random 2 4 64.3 link link
ResNet-18 Magnitude 2 4 72.1 link link
ResNet-18 Rescale Mag. 2 4 72.4 link link
ResNet-50 None (dense) 4 4 78.1 link link
ResNet-50 Random 2 4 70.3 link link
ResNet-50 Magnitude 2 4 77.7 link link
ResNet-50 Rescale Mag. 2 4 77.6 link link
RegNetX-400MF None (dense) 4 4 73.3 link link
RegNetX-400MF Random 2 4 64.3 link link
RegNetX-400MF Magnitude 2 4 72.1 link link
RegNetX-400MF Rescale Mag. 2 4 72.4 link link
Owner
Bradley McDanel
Bradley McDanel
Hysterese plugin with two temperature offset areas

craftbeerpi4 plugin OffsetHysterese Temperatur-Steuerungs-Plugin mit zwei tempereaturbereich abhängigen Offsets. Installation sudo pip3 install https:

HappyHibo 1 Dec 21, 2021
TSIT: A Simple and Versatile Framework for Image-to-Image Translation

TSIT: A Simple and Versatile Framework for Image-to-Image Translation This repository provides the official PyTorch implementation for the following p

Liming Jiang 255 Nov 23, 2022
MT3: Multi-Task Multitrack Music Transcription

MT3: Multi-Task Multitrack Music Transcription MT3 is a multi-instrument automatic music transcription model that uses the T5X framework. This is not

Magenta 867 Dec 29, 2022
The pyrelational package offers a flexible workflow to enable active learning with as little change to the models and datasets as possible

pyrelational is a python active learning library developed by Relation Therapeutics for rapidly implementing active learning pipelines from data management, model development (and Bayesian approximat

Relation Therapeutics 95 Dec 27, 2022
Customised to detect objects automatically by a given model file(onnx)

LabelImg LabelImg is a graphical image annotation tool. It is written in Python and uses Qt for its graphical interface. Annotations are saved as XML

Heeone Lee 1 Jun 07, 2022
Generate images from texts. In Russian. In PaddlePaddle

ruDALL-E PaddlePaddle ruDALL-E in PaddlePaddle. Install: pip install rudalle_paddle==0.0.1rc1 Run with free v100 on AI Studio. Original Pytorch versi

AgentMaker 20 Oct 18, 2022
Autonomous Movement from Simultaneous Localization and Mapping

Autonomous Movement from Simultaneous Localization and Mapping About us Built by a group of Clarkson University students with the help from Professor

14 Nov 07, 2022
Encoding Causal Macrovariables

Encoding Causal Macrovariables Data Natural climate data ('El Nino') Self-generated data ('Simulated') Experiments Detecting macrovariables through th

Benedikt Höltgen 3 Jul 31, 2022
Open-source implementation of Google Vizier for hyper parameters tuning

Advisor Introduction Advisor is the hyper parameters tuning system for black box optimization. It is the open-source implementation of Google Vizier w

tobe 1.5k Jan 04, 2023
Patch-Based Deep Autoencoder for Point Cloud Geometry Compression

Patch-Based Deep Autoencoder for Point Cloud Geometry Compression Overview The ever-increasing 3D application makes the point cloud compression unprec

17 Dec 05, 2022
[ICCV 2021] Official PyTorch implementation for Deep Relational Metric Learning.

Ranking Models in Unlabeled New Environments Prerequisites This code uses the following libraries Python 3.7 NumPy PyTorch 1.7.0 + torchivision 0.8.1

Borui Zhang 39 Dec 10, 2022
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Daft-Exprt - PyTorch Implementation PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis The

Keon Lee 47 Dec 18, 2022
Discord bot-CTFD-Thread-Parser - Discord bot CTFD-Thread-Parser

Discord bot CTFD-Thread-Parser Description: This tools is used to create automat

15 Mar 22, 2022
PaRT: Parallel Learning for Robust and Transparent AI

PaRT: Parallel Learning for Robust and Transparent AI This repository contains the code for PaRT, an algorithm for training a base network on multiple

Mahsa 0 May 02, 2022
Video Swin Transformer - PyTorch

Video-Swin-Transformer-Pytorch This repo is a simple usage of the official implementation "Video Swin Transformer". Introduction Video Swin Transforme

Haofan Wang 116 Dec 20, 2022
This is the official implementation of VaxNeRF (Voxel-Accelearated NeRF).

VaxNeRF Paper | Google Colab This is the official implementation of VaxNeRF (Voxel-Accelearated NeRF). This codebase is implemented using JAX, buildin

naruya 132 Nov 21, 2022
FairEdit: Preserving Fairness in Graph Neural Networks through Greedy Graph Editing

FairEdit Relevent Publication FairEdit: Preserving Fairness in Graph Neural Networks through Greedy Graph Editing

5 Feb 04, 2022
(to be released) [NeurIPS'21] Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs

Higher-Order Transformers Kim J, Oh S, Hong S, Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs, NeurIPS 2021. [arxiv] W

Jinwoo Kim 44 Dec 28, 2022
Transfer Learning Shootout for PyTorch's model zoo (torchvision)

pytorch-retraining Transfer Learning shootout for PyTorch's model zoo (torchvision). Load any pretrained model with custom final layer (num_classes) f

Alexander Hirner 169 Jun 29, 2022
NeurIPS 2021 Datasets and Benchmarks Track

AP-10K: A Benchmark for Animal Pose Estimation in the Wild Introduction | Updates | Overview | Download | Training Code | Key Questions | License Intr

AP-10K 82 Dec 11, 2022