TensorFlow implementation of PHM (Parameterization of Hypercomplex Multiplication)

Overview

Parameterization of Hypercomplex Multiplications (PHM)

This repository contains the TensorFlow implementation of PHM (Parameterization of Hypercomplex Multiplications) layers and PHM-Transformers from the ICLR 2021 paper Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/n Parameters.

Installation

One may install the following libraries before running our code: tensor2tensor and a compatible version of TensorFlow.
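
For example, a minimal setup might look like the following (a sketch; the version pin is an illustrative assumption rather than a requirement taken from this repository, since tensor2tensor is typically used with TensorFlow 1.x):

pip install "tensorflow==1.15.*" tensor2tensor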

Usage

The usage of this repository follows the original tensor2tensor repository (e.g., t2t-datagen, t2t-trainer, t2t-avg-all, followed by t2t-decoder). It helps to gain familiarity with tensor2tensor before attempting to run our code. Specifically, setting --t2t_usr_dir=./Parameterization-of-Hypercomplex-Multiplications will allow tensor2tensor to register PHM-Transformers.

Training

For example, to evaluate PHM-Transformer (n=4) on the En-Vi machine translation task (t2t-datagen --problem=translate_envi_iwslt32k), one may set the following flags when training:

t2t-trainer \
--problem=translate_envi_iwslt32k \
--model=light_transformer \
--hparams_set=light_transformer_base_single_gpu \
--hparams="light_mode='random',hidden_size=512,factor=4" \
--train_steps=50000

where light_transformer with light_mode='random' is the alias of the PHM-Transformer in our implementation, and the factor hparam sets n (here factor=4 for n=4).
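
For reference, the preceding data-generation step for this problem would typically be run as follows (a sketch assuming the standard tensor2tensor $DATA_DIR and $TMP_DIR conventions, which users need to set):

t2t-datagen \
--data_dir=$DATA_DIR \
--tmp_dir=$TMP_DIR \
--problem=translate_envi_iwslt32k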

Aggregating Checkpoints

After training, the latest 8 checkpoints are averaged:

t2t-avg-all --model_dir $TRAIN_DIR --output_dir $AVG_DIR --n 8

where $TRAIN_DIR and $AVG_DIR need to be specified by users.

Testing

To decode the target sequence, one has to additionally set the decode_hparams as follows:

t2t-decoder \
--decode_hparams="beam_size=5,alpha=0.6"

Then t2t-bleu is invoked to calculate the BLEU score.
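
A typical invocation looks like the following (a sketch; $DECODE_OUTPUT and $REFERENCE are placeholders for the decoded translations file and the reference translations file, which users need to supply):

t2t-bleu \
--translation=$DECODE_OUTPUT \
--reference=$REFERENCE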

PHM Implementations

PHM is implemented with operations in make_random_mul and random_ffn, which are mathematically equivalent to a sum of Kronecker products.
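
Concretely, the paper replaces the weight matrix W \in \mathbb{R}^{k \times d} of a fully-connected layer y = Wx + b by a sum of n Kronecker products of smaller parameter matrices (symbol names here are generic; n corresponds to the factor hparam above):

W = \sum_{i=1}^{n} A_i \otimes S_i, \qquad A_i \in \mathbb{R}^{n \times n}, \quad S_i \in \mathbb{R}^{(k/n) \times (d/n)},

so the layer stores n^3 + kd/n parameters instead of kd, roughly a 1/n reduction whenever kd dominates n^3.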

Among works that use PHM, some have offered alternative implementations of the PHM layer.

Citation

If you find this repository helpful, please cite our paper:

@inproceedings{zhang2021beyond,
  title={Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters},
  author={Zhang, Aston and Tay, Yi and Zhang, Shuai and Chan, Alvin and Luu, Anh Tuan and Hui, Siu Cheung and Fu, Jie},
  booktitle={International Conference on Learning Representations},
  year={2021}
}