A Transformer Implementation that is easy to understand and customizable.

Last update: Jan 20, 2022

Overview

Simple Transformer

I've written a series of articles on the transformer architecture and language models on Medium.

This repository contains an implementation of the Transformer architecture presented in the paper Attention Is All You Need by Ashish Vaswani et al.

My goal is to write an implementation that is easy to understand while digging into the nitty-gritty details, where the devil is.

Python environment

You can use any Python virtual environment, such as venv or conda.

For example, with venv:

python3 -m venv venv
source venv/bin/activate

pip install --upgrade pip
pip install -e .

Spacy Tokenizer Data Preparation

To use spaCy's tokenizer, make sure to download the required language models.

For example, the English and German tokenizers can be downloaded as follows:

python -m spacy download en_core_web_sm
python -m spacy download de_core_news_sm

Text Data from Torchtext

This project uses text datasets from Torchtext.

from torchtext import datasets

The default configuration uses the Multi30k dataset.

Training

python train.py config_path

The default config path is config/config.yaml.

It is possible to resume training from a checkpoint.

python train.py --checkpoint_path runs/20220108-164720-Multi30k-Transformer/checkpoint-010-2.3343.pt
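
The two invocations above suggest a command line with an optional positional config path and an optional --checkpoint_path flag. A minimal sketch of such a parser, using Python's standard argparse module, might look like the following. The build_parser helper and the help texts are hypothetical, and the repository's actual train.py may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the two documented train.py invocations."""
    parser = argparse.ArgumentParser(
        description="Train a Transformer (illustrative sketch)."
    )
    # Optional positional argument; falls back to the documented default path.
    parser.add_argument(
        "config_path",
        nargs="?",
        default="config/config.yaml",
        help="Path to the YAML config (assumed to be optional).",
    )
    # Optional flag for resuming from a saved checkpoint.
    parser.add_argument(
        "--checkpoint_path",
        default=None,
        help="Checkpoint file to resume training from.",
    )
    return parser


# Parse explicit argument lists instead of sys.argv so the sketch runs anywhere.
fresh = build_parser().parse_args([])
resumed = build_parser().parse_args(
    [
        "--checkpoint_path",
        "runs/20220108-164720-Multi30k-Transformer/checkpoint-010-2.3343.pt",
    ]
)
print(fresh.config_path)  # config/config.yaml
print(resumed.checkpoint_path)
```

Making the positional argument optional (nargs="?") is what lets the resume command omit the config path, as it does in the example above.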

You can run TensorBoard to monitor the training progress.

tensorboard --logdir=runs

The logs are created under runs.
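
The example paths in this README suggest that each run gets its own folder under runs and that checkpoints are named checkpoint-<number>-<number>.pt. Assuming the first number is an epoch counter and the second a loss value (the README does not state this), the latest checkpoint of a run could be picked with a small helper like the one below. parse_checkpoint and latest_checkpoint are hypothetical names, not part of the repository.

```python
import re
from pathlib import Path
from typing import NamedTuple, Optional

# Assumed naming pattern: checkpoint-<epoch>-<loss>.pt
CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)-(\d+(?:\.\d+)?)\.pt$")


class CheckpointInfo(NamedTuple):
    path: Path
    epoch: int   # assumption: first numeric field is the epoch
    loss: float  # assumption: second numeric field is a loss value


def parse_checkpoint(path) -> Optional[CheckpointInfo]:
    """Return the parsed fields, or None if the name does not match the pattern."""
    path = Path(path)
    match = CHECKPOINT_RE.match(path.name)
    if match is None:
        return None
    return CheckpointInfo(path, int(match.group(1)), float(match.group(2)))


def latest_checkpoint(run_dir) -> Optional[CheckpointInfo]:
    """Return the checkpoint with the highest epoch in run_dir, or None if there is none."""
    infos = []
    for candidate in Path(run_dir).glob("checkpoint-*.pt"):
        info = parse_checkpoint(candidate)
        if info is not None:
            infos.append(info)
    return max(infos, key=lambda info: info.epoch, default=None)


info = parse_checkpoint(
    "runs/20220108-164720-Multi30k-Transformer/checkpoint-010-2.3343.pt"
)
print(info.epoch, info.loss)  # 10 2.3343
```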

Test

python test.py checkpoint_path

For example:

python test.py runs/20220108-164720-Multi30k-Transformer/checkpoint-010-2.3343.pt

When training starts, config.yaml is copied to the model folder. test.py assumes that this config YAML file exists.
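
As a rough illustration of that dependency, the sketch below looks for a YAML file next to a given checkpoint and fails early if none is found. It assumes that the model folder is the directory containing the checkpoint, which the example paths suggest but the README does not confirm. find_config is a hypothetical helper, not the logic test.py actually uses.

```python
from pathlib import Path


def find_config(checkpoint_path) -> Path:
    """Return the first YAML file found in the checkpoint's folder.

    Assumption: the config copied at training start sits in the same
    directory as the checkpoint file.
    """
    model_dir = Path(checkpoint_path).parent
    # Accept both common YAML extensions; sort for a deterministic choice.
    candidates = sorted(
        list(model_dir.glob("*.yaml")) + list(model_dir.glob("*.yml"))
    )
    if not candidates:
        raise FileNotFoundError(f"No config YAML file found in {model_dir}")
    return candidates[0]
```

Raising FileNotFoundError up front gives a clearer message than a failure later, when the model is being rebuilt from a missing config.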

Unit tests

There are some unit tests in the tests folder.

pytest tests

Owner

Naoki Shibuya
