Implementation of the Hybrid Perception Block and Dual-Pruned Self-Attention block from the ITTR paper for Image to Image Translation using Transformers

Last update: Dec 23, 2022

Overview

ITTR - Pytorch

Implementation of the Hybrid Perception Block (HPB) and Dual-Pruned Self-Attention (DPSA) block from the ITTR paper for Image to Image Translation using Transformers.

Install

$ pip install ITTR-pytorch

Usage

The paper stacks 9 Hybrid Perception Blocks (HPB). A single block is used as follows:

import torch
from ITTR_pytorch import HPB

block = HPB(
    dim = 512,              # dimension
    dim_head = 32,          # dimension per attention head
    heads = 8,              # number of attention heads
    attn_height_top_k = 16, # number of top indices to select along height, for the attention pruning
    attn_width_top_k = 16,  # number of top indices to select along width, for the attention pruning
    attn_dropout = 0.,      # attn dropout
    ff_mult = 4,            # expansion factor of feedforward
    ff_dropout = 0.         # feedforward dropout
)

fmap = torch.randn(1, 512, 32, 32)

out = block(fmap) # (1, 512, 32, 32)
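
Because the block returns a feature map with the same shape as its input, several blocks can be chained back to back. The following is a minimal sketch, assuming a plain sequential stack; how the paper wires its 9 blocks into the full generator is not covered here. `stack_blocks` and `make_standin` are hypothetical helpers, not part of ITTR_pytorch, and a small conv block stands in for HPB so the snippet runs without the package installed.

```python
import torch
from torch import nn

def stack_blocks(make_block, depth = 9):
    # hypothetical helper: call the factory once per layer
    # so that every layer owns its own weights
    return nn.Sequential(*[make_block() for _ in range(depth)])

def make_standin(dim = 64):
    # stand-in for HPB (assumption): any module that maps
    # (b, dim, h, w) -> (b, dim, h, w) can be stacked this way
    return nn.Sequential(
        nn.Conv2d(dim, dim, 3, padding = 1),
        nn.GELU()
    )

body = stack_blocks(make_standin, depth = 9)

fmap = torch.randn(1, 64, 32, 32)
out = body(fmap) # same shape as fmap
```

To stack the real block, pass a factory such as `lambda: HPB(dim = 512, dim_head = 32, heads = 8)` in place of `make_standin`.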

You can also use the Dual-Pruned Self-Attention (DPSA) block on its own:

import torch
from ITTR_pytorch import DPSA

attn = DPSA(
    dim = 512,         # dimension
    dim_head = 32,     # dimension per attention head
    heads = 8,         # number of attention heads
    height_top_k = 48, # number of top indices to select along height, for the attention pruning
    width_top_k = 48,  # number of top indices to select along width, for the attention pruning
    dropout = 0.       # attn dropout
)

fmap = torch.randn(1, 512, 32, 32)

out = attn(fmap) # (1, 512, 32, 32)
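
The `height_top_k` and `width_top_k` arguments control how many rows and columns of keys and values survive pruning before attention is computed. The sketch below illustrates the row-pruning idea only, under simplifying assumptions: rows are scored with a query summed over all positions, there are no heads or normalization, and `top_k` is clamped to the map height. `prune_rows` is a hypothetical function written for illustration and should not be read as the package's internal code.

```python
import torch

def prune_rows(q, k, v, top_k):
    # q, k, v: (batch, channels, height, width)
    b, c, h, w = k.shape
    top_k = min(top_k, h)              # assumption: never ask for more rows than exist

    q_probe = q.sum(dim = (2, 3))      # (b, c)    query summed over all positions
    k_rows = k.sum(dim = 3)            # (b, c, h) keys summed along the width

    # score each key row against the query probe
    scores = torch.einsum('bc,bch->bh', q_probe, k_rows)

    # indices of the highest-scoring rows, broadcast for gather
    idx = scores.topk(top_k, dim = -1).indices            # (b, top_k)
    idx = idx[:, None, :, None].expand(b, c, top_k, w)

    # keep only the selected rows of keys and values
    return k.gather(2, idx), v.gather(2, idx)

q, k, v = (torch.randn(2, 8, 32, 32) for _ in range(3))
k_top, v_top = prune_rows(q, k, v, top_k = 16) # each (2, 8, 16, 32)
```

Column pruning would follow the same pattern along the width axis, after which attention only needs to consider the retained key and value positions.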

Citations

@inproceedings{Zheng2022ITTRUI,
  title   = {ITTR: Unpaired Image-to-Image Translation with Transformers},
  author  = {Wanfeng Zheng and Qiang Li and Guoxin Zhang and Pengfei Wan and Zhongyuan Wang},
  year    = {2022}
}

Comments

Image size and channel count of each network layer

Hello! Thank you very much for publishing the core code of ITTR. Could you share the number of channels and the image size at each layer? Thank you very much!

Opened by SHNsunhenan



Releases(0.0.4)

0.0.4(Apr 1, 2022)

0.0.3(Apr 1, 2022)

0.0.2(Apr 1, 2022)

0.0.1(Apr 1, 2022)

Owner

Phil Wang
