TensorFlow implementation of PHM (Parameterization of Hypercomplex Multiplication)

Overview


This repository contains the TensorFlow implementation of PHM (Parameterization of Hypercomplex Multiplication) layers and PHM-Transformers from the ICLR 2021 paper Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/n Parameters.

Installation

One may install TensorFlow and tensor2tensor before running our code, as our implementation builds on the tensor2tensor framework.

Usage

The usage of this repository follows the original tensor2tensor repository (e.g., t2t-datagen, t2t-trainer, t2t-avg-all, followed by t2t-decoder). It helps to gain familiarity with tensor2tensor before attempting to run our code. Specifically, setting --t2t_usr_dir=./Parameterization-of-Hypercomplex-Multiplications allows tensor2tensor to register PHM-Transformers.
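
For instance, the data for the En-Vi task used in the next section can be generated first with t2t-datagen. This is only a rough sketch; $DATA_DIR and $TMP_DIR are placeholder directories of your choice, not paths fixed by this repository:

t2t-datagen \
--data_dir=$DATA_DIR \
--tmp_dir=$TMP_DIR \
--problem=translate_envi_iwslt32k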

Training

For example, to evaluate PHM-Transformer (n=4) on the En-Vi machine translation task (t2t-datagen --problem=translate_envi_iwslt32k), one may set the following flags when training:

t2t-trainer \
--problem=translate_envi_iwslt32k \
--model=light_transformer \
--hparams_set=light_transformer_base_single_gpu \
--hparams="light_mode='random',hidden_size=512,factor=4" \
--train_steps=50000

where light_transformer with light_mode='random' is the alias of the PHM-Transformer in our implementation.
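
The command above shows only the model-specific flags. A fuller invocation would also pass the data, output, and user directories; the following is a sketch using standard tensor2tensor flags with placeholder paths:

t2t-trainer \
--t2t_usr_dir=./Parameterization-of-Hypercomplex-Multiplications \
--data_dir=$DATA_DIR \
--output_dir=$TRAIN_DIR \
--problem=translate_envi_iwslt32k \
--model=light_transformer \
--hparams_set=light_transformer_base_single_gpu \
--hparams="light_mode='random',hidden_size=512,factor=4" \
--train_steps=50000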

Aggregating Checkpoints

After training, the latest 8 checkpoints are averaged:

t2t-avg-all --model_dir $TRAIN_DIR --output_dir $AVG_DIR --n 8

where $TRAIN_DIR and $AVG_DIR need to be specified by users.

Testing

To decode the target sequence, one has to additionally set the decode_hparams as follows:

t2t-decoder \
--decode_hparams="beam_size=5,alpha=0.6"

Then t2t-bleu is invoked to calculate the BLEU score.
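
Putting these steps together, a decoding and scoring pass could look roughly as follows; $SOURCE_FILE, $OUTPUT_FILE, and $REFERENCE_FILE are placeholder paths, and the model flags simply mirror those used for training:

t2t-decoder \
--t2t_usr_dir=./Parameterization-of-Hypercomplex-Multiplications \
--data_dir=$DATA_DIR \
--output_dir=$AVG_DIR \
--problem=translate_envi_iwslt32k \
--model=light_transformer \
--hparams_set=light_transformer_base_single_gpu \
--hparams="light_mode='random',hidden_size=512,factor=4" \
--decode_hparams="beam_size=5,alpha=0.6" \
--decode_from_file=$SOURCE_FILE \
--decode_to_file=$OUTPUT_FILE

t2t-bleu \
--translation=$OUTPUT_FILE \
--reference=$REFERENCE_FILE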

PHM Implementations

PHM is implemented with operations in make_random_mul and random_ffn, which are mathematically equivalent to a sum of Kronecker products.
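
For intuition, below is a minimal NumPy sketch of that equivalence. It is not the repository's make_random_mul or random_ffn code; the function name, shapes, and toy dimensions are illustrative only.

import numpy as np

def phm_weight(A, S):
    # A: (n, n, n)        -- n learned "rule" matrices, each n x n
    # S: (n, d//n, k//n)  -- n small weight matrices
    # Returns a (d, k) weight matrix built as a sum of Kronecker products,
    # using n^3 + d*k/n parameters instead of d*k.
    return sum(np.kron(A[i], S[i]) for i in range(A.shape[0]))

# Toy example with n=4, input dim d=8, output dim k=8.
n, d, k = 4, 8, 8
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n, n))
S = rng.standard_normal((n, d // n, k // n))
W = phm_weight(A, S)                 # shape (8, 8)
x = rng.standard_normal((1, d))
y = x @ W                            # PHM "fully connected" layer: y = xW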

Among works that use PHM, some have offered alternative PHM implementations.

Citation

If you find this repository helpful, please cite our paper:

@inproceedings{zhang2021beyond,
  title={Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters},
  author={Zhang, Aston and Tay, Yi and Zhang, Shuai and Chan, Alvin and Luu, Anh Tuan and Hui, Siu Cheung and Fu, Jie},
  booktitle={International Conference on Learning Representations},
  year={2021}
}