Orthogonal Over-Parameterized Training

Overview

Orthogonal Over-Parameterized Training

By Weiyang Liu, Rongmei Lin, Zhen Liu, James Rehg, Liam Paull, Li Xiong, Le Song, Adrian Weller

License

OPT is released under the MIT License (refer to the LICENSE file for details).

Contents

  1. Introduction
  2. Citation
  3. Short Video Introduction
  4. Requirements
  5. Usage

Introduction

The inductive bias of a neural network is largely determined by the architecture and the training algorithm. To achieve good generalization, how to effectively train a neural network is of great importance. We propose a novel orthogonal over-parameterized training (OPT) framework that can provably minimize the hyperspherical energy which characterizes the diversity of neurons on a hypersphere. See our previous work -- MHE for an in-depth introduction.

By maintaining the minimum hyperspherical energy during training, OPT can greatly improve the empirical generalization. Specifically, OPT fixes the randomly initialized weights of the neurons and learns an orthogonal transformation that applies to these neurons. We consider multiple ways to learn such an orthogonal transformation, including unrolling orthogonalization algorithms, applying orthogonal parameterization, and designing orthogonality-preserving gradient descent. For better scalability, we propose the stochastic OPT which performs orthogonal transformation stochastically for partial dimensions of neurons.

Our OPT is accepted to CVPR 2021 as oral presentation and the full paper is available on arXiv and here.

Citation

If you find our work useful in your research, please consider to cite:

@InProceedings{Liu2021OPT,
    title={Orthogonal Over-Parameterized Training},
    author={Liu, Weiyang and Lin, Rongmei and Liu, Zhen and Rehg, James M. and Paull, Liam 
     and Xiong, Li and Song, Le and Weller, Adrian},
    booktitle={CVPR},
    year={2021}
}

Short Video Introduction

We also provide a short video introduction to help interested readers quickly go over our work and understand the essence of OPT. Please click the following figure to watch the Youtube video.

OPT_talk

Requirements

  1. Python 3.7
  2. TensorFlow 1.14.0

Usage

This repository provides both OPT and S-OPT implementations on CIFAR-100 as a demostration.

Part 1: Clone the repositary

git clone https://github.com/wy1iu/OPT.git

Part 2: Download the official CIFAR-100 training and testing data (python version)

wget https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz

Part 3: Train and test with the following code in different folder.

# Run Cayley Parameterization OPT
cd opt_cp
python train.py
# Run Gram-Schmidt OPT
cd opt_gs
python train.py
# Run Householder Reflection OPT
cd opt_hr
python train.py
# Run Lowdin’s Symmetric OPT
cd opt_ls
python train.py
# Run Orthogonality-Preserving Gradient Descent OPT
cd opt_ogd
python train.py
# Run Stochastic OPT (Gram-Schmidt)
cd sopt_gs
python train.py

Contact

Owner
Weiyang Liu
Weiyang Liu
Network Pruning That Matters: A Case Study on Retraining Variants (ICLR 2021)

Network Pruning That Matters: A Case Study on Retraining Variants (ICLR 2021)

Duong H. Le 18 Jun 13, 2022
Deep Learning for humans

Keras: Deep Learning for Python Under Construction In the near future, this repository will be used once again for developing the Keras codebase. For

Keras 57k Jan 09, 2023
Creating Multi Task Models With Keras

Creating Multi Task Models With Keras About The Project! I used the keras and Tensorflow Library, To build a Deep Learning Neural Network to Creating

Srajan Chourasia 4 Nov 28, 2022
An open source app to help calm you down when needed.

By: Seanpm2001, Et; Al. Top README.md Read this article in a different language Sorted by: A-Z Sorting options unavailable ( af Afrikaans Afrikaans |

Sean P. Myrick V19.1.7.2 2 Oct 24, 2022
Le dataset des images du projet d'IA de 2021

face-mask-dataset-ilc-2021 Le dataset des images du projet d'IA de 2021, Indiquez vos id git dans la issue pour les droits TL;DR: Choisir 200 images J

7 Nov 15, 2021
A Broader Picture of Random-walk Based Graph Embedding

Random-walk Embedding Framework This repository is a reference implementation of the random-walk embedding framework as described in the paper: A Broa

Zexi Huang 23 Dec 13, 2022
Scheme for training and applying a label propagation framework

Factorisation-based Image Labelling Overview This is a scheme for training and applying the factorisation-based image labelling (FIL) framework. Some

Wellcome Centre for Human Neuroimaging 2 Dec 17, 2021
Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks

Uniformer - Pytorch Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification ta

Phil Wang 90 Nov 24, 2022
Global Filter Networks for Image Classification

Global Filter Networks for Image Classification Created by Yongming Rao, Wenliang Zhao, Zheng Zhu, Jiwen Lu, Jie Zhou This repository contains PyTorch

Yongming Rao 273 Dec 26, 2022
Camera calibration & 3D pose estimation tools for AcinoSet

AcinoSet: A 3D Pose Estimation Dataset and Baseline Models for Cheetahs in the Wild Daniel Joska, Liam Clark, Naoya Muramatsu, Ricardo Jericevich, Fre

African Robotics Unit 42 Nov 16, 2022
Dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K Our dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

96 Jul 05, 2022
Deep Markov Factor Analysis (NeurIPS2021)

Deep Markov Factor Analysis (DMFA) Codes and experiments for deep Markov factor analysis (DMFA) model accepted for publication at NeurIPS2021: A. Farn

Sarah Ostadabbas 2 Dec 16, 2022
Dynamic Multi-scale Filters for Semantic Segmentation (DMNet ICCV'2019)

Dynamic Multi-scale Filters for Semantic Segmentation (DMNet ICCV'2019) Introduction Official implementation of Dynamic Multi-scale Filters for Semant

23 Oct 21, 2022
9th place solution in "Santa 2020 - The Candy Cane Contest"

Santa 2020 - The Candy Cane Contest My solution in this Kaggle competition "Santa 2020 - The Candy Cane Contest", 9th place. Basic Strategy In this co

toshi_k 22 Nov 26, 2021
Implementation of momentum^2 teacher

Momentum^2 Teacher: Momentum Teacher with Momentum Statistics for Self-Supervised Learning Requirements All experiments are done with python3.6, torch

jemmy li 121 Sep 26, 2022
ISBI 2022: Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image.

Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image Introduction This repository contains the PyTorch implem

25 Nov 09, 2022
Consecutive-Subsequence - Simple software to calculate susequence with highest sum

Simple software to calculate susequence with highest sum This repository contain

Gbadamosi Farouk 1 Jan 31, 2022
Pytorch implementation for the paper: Contrastive Learning for Cold-start Recommendation

Contrastive Learning for Cold-start Recommendation This is our Pytorch implementation for the paper: Yinwei Wei, Xiang Wang, Qi Li, Liqiang Nie, Yan L

45 Dec 13, 2022
The audio-video synchronization of MKV Container Format is exploited to achieve data hiding

The audio-video synchronization of MKV Container Format is exploited to achieve data hiding, where the hidden data can be utilized for various management purposes, including hyper-linking, annotation

Maxim Zaika 1 Nov 17, 2021
Object Detection Projekt in GKI WS2021/22

tfObjectDetection Object Detection Projekt with tensorflow in GKI WS2021/22 Docker Container: docker run -it --name --gpus all -v path/to/project:p

Tim Eggers 1 Jul 18, 2022