Implements Gradient Centralization and allows it to use as a Python package in TensorFlow

Overview

Gradient Centralization TensorFlow Twitter

PyPI Upload Python Package Flake8 Lint Python Version

Binder Open In Colab

GitHub license PEP8 GitHub stars GitHub forks GitHub watchers

This Python package implements Gradient Centralization in TensorFlow, a simple and effective optimization technique for Deep Neural Networks as suggested by Yong et al. in the paper Gradient Centralization: A New Optimization Technique for Deep Neural Networks. It can both speedup training process and improve the final generalization performance of DNNs.

Installation

Run the following to install:

pip install gradient-centralization-tf

Usage

gctf.centralized_gradients_for_optimizer

Create a centralized gradients functions for a specified optimizer.

Arguments:

  • optimizer: a tf.keras.optimizers.Optimizer object. The optimizer you are using.

Example:

>>> opt = tf.keras.optimizers.Adam(learning_rate=0.1)
>>> optimizer.get_gradients = gctf.centralized_gradients_for_optimizer(opt)
>>> model.compile(optimizer = opt, ...)

gctf.get_centralized_gradients

Computes the centralized gradients.

This function is ideally not meant to be used directly unless you are building a custom optimizer, in which case you could point get_gradients to this function. This is a modified version of tf.keras.optimizers.Optimizer.get_gradients.

Arguments:

  • optimizer: a tf.keras.optimizers.Optimizer object. The optimizer you are using.
  • loss: Scalar tensor to minimize.
  • params: List of variables.

Returns:

A gradients tensor.

gctf.optimizers

Pre built updated optimizers implementing GC.

This module is speciially built for testing out GC and in most cases you would be using gctf.centralized_gradients_for_optimizer though this module implements gctf.centralized_gradients_for_optimizer. You can directly use all optimizers with tf.keras.optimizers updated for GC.

Example:

>>> model.compile(optimizer = gctf.optimizers.adam(learning_rate = 0.01), ...)
>>> model.compile(optimizer = gctf.optimizers.rmsprop(learning_rate = 0.01, rho = 0.91), ...)
>>> model.compile(optimizer = gctf.optimizers.sgd(), ...)

Returns:

A tf.keras.optimizers.Optimizer object.

Developing gctf

To install gradient-centralization-tf, along with tools you need to develop and test, run the following in your virtualenv:

git clone [email protected]:Rishit-dagli/Gradient-Centralization-TensorFlow
# or clone your own fork

pip install -e .[dev]

License

Copyright 2020 Rishit Dagli

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Comments
  • On windows Tensorflow 2.5 it gives error

    On windows Tensorflow 2.5 it gives error

    On windows 10 with miniconda enviroment tensorflow 2.5 gives error on centralized_gradients.py file.

    the solution is change import keras.backend as K with import tensorflow.keras.backend as K

    bug 
    opened by mgezer 5
  • The results in the mnist example are wrong/misleading

    The results in the mnist example are wrong/misleading

    Describe the bug The results in your colab ipython notebook are misleading: https://colab.research.google.com/github/Rishit-dagli/Gradient-Centralization-TensorFlow/blob/main/examples/gctf_mnist.ipynb

    In this example, the model is first trained with a normal Adam optimizer:

    model.compile(optimizer = tf.keras.optimizers.Adam(),
                  loss = 'sparse_categorical_crossentropy',
                  metrics = ['accuracy'])
    
    history_no_gctf = model.fit(training_images, training_labels, epochs=5, callbacks = [time_callback_no_gctf])
    

    And afterwards the same model is recompiled with the gctf.optimizers.adam(). However, recompiling a keras model does not reset the weights. This means that in the first fit call the model is trained and then in the second fit call with the new optimizer the same model is used and of course then the results are better.

    This can be fixed, by recreating the model for the second run, by just adding these few lines:

    import gctf #import gctf
    
    time_callback_gctf = TimeHistory()
    
    # Model architecture
    model = tf.keras.models.Sequential([
                                        tf.keras.layers.Flatten(), 
                                        tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(256, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(64, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(256, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(64, activation=tf.nn.relu), 
                                        tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
    
    model.compile(optimizer = gctf.optimizers.adam(),
                  loss = 'sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
    history_gctf = model.fit(training_images, training_labels, epochs=5, callbacks=[time_callback_gctf])
    

    However, then the results are not better than without gctf:

    Type                   Execution time    Accuracy      Loss
    -------------------  ----------------  ----------  --------
    Model without gctf:           24.7659    0.88825   0.305801
    Model with gctf               24.7881    0.889567  0.30812
    

    Could you please clarify what happens here. I tried this gctf.optimizers.adam() optimizer in my own research and it didn't change the results at all and now after seeing it doesn't work in the example which was constructed here. Makes me question the results of this paper.

    To Reproduce Execute the colab file given in the repository: https://colab.research.google.com/github/Rishit-dagli/Gradient-Centralization-TensorFlow/blob/main/examples/gctf_mnist.ipynb

    Expected behavior The right comparison would be if both models start from a random initialization, not that the second model can start with the already pre-trained weights.

    Looking forward to a fast a swift explanation.

    Best, Max

    question 
    opened by themasterlink 2
  • Wider dependency requirements

    Wider dependency requirements

    The package as of now to be installed requires tensorflow ~= 2.4.0 and keras ~= 2.4.0. It turns out that this is sometimes problematic for folks who have custom installations of TensorFlow and a winder requirement could be set up.

    enhancement 
    opened by Rishit-dagli 1
  • Release 0.0.3

    Release 0.0.3

    This release includes some fixes and improvements

    βœ… Bug Fixes / Improvements

    • Allow wider versions for TensorFlow and Keras while installing the package (#14 )
    • Fixed incorrect usage example in docstrings and description for centralized_gradients_for_optimizer (#13 )
    • Add clear aims for each of the examples of using gctf (#15 )
    • Updates PyPi classifiers to clearly show the aims of this project. This should have no changes in the way you use this package (#18 )
    • Add clear instructions for using this with custom optimizers i.e. directly use get_centralized_gradients however a complete example has not been pushed due to the reasons mentioned in the issue (#16 )
    opened by Rishit-dagli 0
  • Add an

    Add an "About The Examples" section

    Add an "About The Examples" section which contains a summary of the usage example notebooks and links to run it on Binder and Colab.


    Close #15

    opened by Rishit-dagli 0
  • Update relevant pypi classifiers

    Update relevant pypi classifiers

    Add PyPI classifiers for:

    • Development status
    • Intended Audience
    • Topic

    Further also added the Programming Language :: Python :: 3 :: Only classifer


    Closes #18

    opened by Rishit-dagli 0
  • Update pypi classifiers

    Update pypi classifiers

    I am specifically thinking of adding three more categories of pypi classifiers:

    • Development status
    • Intended Audience
    • Topic

    Apart from this I also think it would be great to add the Programming Language :: Python :: 3 :: Only to make sure the audience to know that this package is intended for Python 3 only.

    opened by Rishit-dagli 0
  • Add an

    Add an "About the examples" section

    It would be great to write an "About the example" section which could demonstrate in short what the example notebooks aim to achieve and show.

    documentation 
    opened by Rishit-dagli 0
  • Error in usage example for gctf.centralized_gradients_for_optimizer

    Error in usage example for gctf.centralized_gradients_for_optimizer

    I noticed that the docstrings for gctf.centralized_gradients_for_optimizer have an error in the example usage section. The example creates an Adam optimizer instance and saves it to opt however the centralized_gradients_for_optimizer is applied on optimizer which ideally does not exist and running the example would result in an error.

    documentation 
    opened by Rishit-dagli 0
  • [ImgBot] Optimize images

    [ImgBot] Optimize images

    opened by imgbot[bot] 0
  • [ImgBot] Optimize images

    [ImgBot] Optimize images

    opened by imgbot[bot] 0
Releases(v0.0.3)
  • v0.0.3(Mar 11, 2021)

    This release includes some fixes and improvements

    βœ… Bug Fixes / Improvements

    • Allow wider versions for TensorFlow and Keras while installing the package (#14 )
    • Fixed incorrect usage example in docstrings and description for centralized_gradients_for_optimizer (#13 )
    • Add clear aims for each of the examples of using gctf (#15 )
    • Updates PyPi classifiers to clearly show the aims of this project. This should have no changes in the way you use this package (#18 )
    • Add clear instructions for using this with custom optimizers i.e. directly use get_centralized_gradients however a complete example has not been pushed due to the reasons mentioned in the issue (#16 )
    Source code(tar.gz)
    Source code(zip)
  • v0.0.2(Feb 21, 2021)

    This release includes some fixes and improvements

    βœ… Bug Fixes / Improvements

    • Fix the issue of supporting multiple modules
    • Fix multiple typos.
    Source code(tar.gz)
    Source code(zip)
  • v0.0.1(Feb 20, 2021)

Owner
Rishit Dagli
High School, Ted-X, Ted-Ed speaker|Mentor, TFUG Mumbai|International Speaker|Microsoft Student Ambassador|#ExploreML Facilitator
Rishit Dagli
A list of multi-task learning papers and projects.

This page contains a list of papers on multi-task learning for computer vision. Please create a pull request if you wish to add anything. If you are interested, consider reading our recent survey pap

svandenh 297 Dec 17, 2022
source code for https://arxiv.org/abs/2005.11248 "Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics"

Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics This work will be published in Nature Biomedical

International Business Machines 71 Nov 15, 2022
MMDetection3D is an open source object detection toolbox based on PyTorch

MMDetection3D is an open source object detection toolbox based on PyTorch, towards the next-generation platform for general 3D detection. It is a part of the OpenMMLab project developed by MMLab.

OpenMMLab 3.2k Jan 05, 2023
A PyTorch implementation of SlowFast based on ICCV 2019 paper "SlowFast Networks for Video Recognition"

SlowFast A PyTorch implementation of SlowFast based on ICCV 2019 paper SlowFast Networks for Video Recognition. Requirements Anaconda PyTorch conda in

Hao Ren 8 Dec 23, 2022
A containerized REST API around OpenAI's CLIP model.

OpenAI's CLIP β€” REST API This is a container wrapping OpenAI's CLIP model in a RESTful interface. Running the container locally First, build the conta

Santiago Valdarrama 48 Nov 06, 2022
Official implementation of Neural Bellman-Ford Networks (NeurIPS 2021)

NBFNet: Neural Bellman-Ford Networks This is the official codebase of the paper Neural Bellman-Ford Networks: A General Graph Neural Network Framework

MilaGraph 136 Dec 21, 2022
Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation (NeurIPS 2021)

Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation (NeurIPS 2021) The implementation of Reducing Infromation Bottleneck for W

Jungbeom Lee 81 Dec 16, 2022
Code repository for the paper Computer Vision User Entity Behavior Analytics

Computer Vision User Entity Behavior Analytics Code repository for "Computer Vision User Entity Behavior Analytics" Code Description dataset.csv As di

Sameer Khanna 2 Aug 20, 2022
πŸƒβ€β™€οΈ A curated list about human motion capture, analysis and synthesis.

Awesome Human Motion πŸƒβ€β™€οΈ A curated list about human motion capture, analysis and synthesis. Contents Introduction Human Models Datasets Data Process

Dennis Wittchen 274 Dec 14, 2022
FaceQgen: Semi-Supervised Deep Learning for Face Image Quality Assessment

FaceQgen FaceQgen: Semi-Supervised Deep Learning for Face Image Quality Assessment This repository is based on the paper: "FaceQgen: Semi-Supervised D

Javier Hernandez-Ortega 3 Aug 04, 2022
Official Chainer implementation of GP-GAN: Towards Realistic High-Resolution Image Blending (ACMMM 2019, oral)

GP-GAN: Towards Realistic High-Resolution Image Blending (ACMMM 2019, oral) [Project] [Paper] [Demo] [Related Work: A2RL (for Auto Image Cropping)] [C

Wu Huikai 402 Dec 27, 2022
HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement

HiFi++ : a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement This is the unofficial implementation of Vocoder part of

Rishikesh (ΰ€‹ΰ€·ΰ€Ώΰ€•ΰ₯‡ΰ€Ά) 118 Dec 29, 2022
Matplotlib Image labeller for classifying images

mpl-image-labeller Use Matplotlib to label images for classification. Works anywhere Matplotlib does - from the notebook to a standalone gui! For more

Ian Hunt-Isaak 5 Sep 24, 2022
Official PyTorch implementation of the paper "Deep Constrained Least Squares for Blind Image Super-Resolution", CVPR 2022.

Deep Constrained Least Squares for Blind Image Super-Resolution [Paper] This is the official implementation of 'Deep Constrained Least Squares for Bli

MEGVII Research 141 Dec 30, 2022
This project aims to explore the deployment of Swin-Transformer based on TensorRT, including the test results of FP16 and INT8.

Swin Transformer This project aims to explore the deployment of SwinTransformer based on TensorRT, including the test results of FP16 and INT8. Introd

maggiez 87 Dec 21, 2022
End-to-End Object Detection with Fully Convolutional Network

This project provides an implementation for "End-to-End Object Detection with Fully Convolutional Network" on PyTorch.

472 Dec 22, 2022
CowHerd is a partially-observed reinforcement learning environment

CowHerd is a partially-observed reinforcement learning environment, where the player walks around an area and is rewarded for milking cows. The cows try to escape and the player can place fences to h

Danijar Hafner 6 Mar 06, 2022
A DeepStack custom model for detecting common objects in dark/night images and videos.

DeepStack_ExDark This repository provides a custom DeepStack model that has been trained and can be used for creating a new object detection API for d

MOSES OLAFENWA 98 Dec 24, 2022
Rotary Transformer

[δΈ­ζ–‡|English] Rotary Transformer Rotary Transformer is an MLM pre-trained language model with rotary position embedding (RoPE). The RoPE is a relative

325 Jan 03, 2023
This is my research project for the Irving Center for Cancer Dynamics/Azizi Lab, Columbia University.

bayesian_uncertainty This is my research project for the Irving Center for Cancer Dynamics/Azizi Lab, Columbia University. In this project I build a s

Max David Gupta 1 Feb 13, 2022