A Structured Self-attentive Sentence Embedding

Last update: Nov 28, 2022

Overview

Structured Self-attentive sentence embeddings

Implementation of the paper A Structured Self-Attentive Sentence Embedding, published at ICLR 2017: https://arxiv.org/abs/1703.03130

USAGE:

For binary sentiment classification on the IMDB dataset, run: python classification.py "binary"

For multiclass classification on the Reuters dataset, run: python classification.py "multiclass"

You can change the model parameters in the model_params.json file. Other training parameters, such as the number of attention hops, can be configured in the config.json file.

To use pretrained GloVe embeddings, set the use_embeddings parameter to "True" (the default is False). Do not forget to download glove.6B.50d.txt and place it in the glove folder.
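For reference, GloVe text files store one word per line followed by its space-separated vector components. The sketch below shows one way such a file could be parsed; load_glove is a hypothetical helper written for illustration and is not necessarily how this repository reads the file.

```python
import io


def load_glove(lines, dim=50):
    """Parse GloVe text lines ("word v1 ... v_dim") into a dict of float lists.

    Hypothetical helper for illustration; not taken from this repository.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# Tiny in-memory stand-in for the real file, using 3 dimensions instead of 50.
sample = io.StringIO("the 0.1 0.2 0.3\ncat -0.4 0.5 0.6\n")
embeddings = load_glove(sample, dim=3)
```

With the real file, the same function could be called on an open file handle, for example open("glove/glove.6B.50d.txt", encoding="utf-8"), assuming the file sits directly inside the glove folder.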

Implemented:

Classification using self attention
Regularization using Frobenius norm
Gradient clipping
Visualizing the attention weights

Instead of pruning, the sentence embeddings are averaged.
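The pieces listed above can be illustrated with a minimal NumPy sketch of the formulation in the paper: the attention matrix A = softmax(W_s2 tanh(W_s1 H^T)), the embedding matrix M = AH, and the penalty ||AA^T - I||_F^2. The function names, the toy dimensions, and the reading of "averaging" as a mean over the attention hops are assumptions made for illustration; this is not the repository's own code.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attentive_embedding(H, W_s1, W_s2):
    """H: (n, 2u) hidden states, W_s1: (d_a, 2u), W_s2: (r, d_a)."""
    # Each of the r rows of A is a distribution over the n tokens.
    A = softmax(W_s2 @ np.tanh(W_s1 @ H.T), axis=1)  # (r, n)
    M = A @ H  # (r, 2u): one embedding per attention hop
    return A, M


def frobenius_penalty(A):
    """Squared Frobenius norm of (A A^T - I), which discourages redundant hops."""
    diff = A @ A.T - np.eye(A.shape[0])
    return float(np.sum(diff ** 2))


# Toy sizes (assumed): 6 tokens, hidden size 8, d_a = 5, r = 3 hops.
rng = np.random.default_rng(0)
n, two_u, d_a, r = 6, 8, 5, 3
H = rng.normal(size=(n, two_u))
W_s1 = rng.normal(size=(d_a, two_u))
W_s2 = rng.normal(size=(r, d_a))

A, M = self_attentive_embedding(H, W_s1, W_s2)
sentence_vector = M.mean(axis=0)  # assumed: average over the r hops
penalty = frobenius_penalty(A)
```

Averaging collapses the r-by-2u matrix M into a single 2u-dimensional vector, which keeps the classifier input small without pruning weights.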

Visualization:

After training, the model is tested on 100 test points. The attention weights for these 100 examples are retrieved and rendered as heatmaps over the text. A file named visualization.html is saved in the visualization/ folder after successful training. The visualization code was provided by Zhouhan Lin (@hantek). Many thanks.
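As a rough illustration of the heatmap idea, the sketch below shades each token according to its attention weight and returns an HTML fragment. The attention_to_html function is hypothetical and is not the visualization code provided by Zhouhan Lin; it only shows the general technique.

```python
import html


def attention_to_html(tokens, weights):
    """Render tokens as <span> elements shaded by relative attention weight.

    Hypothetical helper for illustration; not the repository's visualizer.
    """
    if len(tokens) != len(weights):
        raise ValueError("tokens and weights must have the same length")
    # Scale by the largest weight so the most attended token is fully opaque.
    peak = max(weights, default=0.0) or 1.0
    spans = []
    for token, weight in zip(tokens, weights):
        alpha = weight / peak
        spans.append(
            f'<span style="background-color: rgba(255, 0, 0, {alpha:.2f})">'
            f"{html.escape(token)}</span>"
        )
    return " ".join(spans)


snippet = attention_to_html(["a", "great", "movie"], [0.1, 0.7, 0.2])
```

Writing one such fragment per test sentence into a single HTML file would give a page similar in spirit to visualization.html.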

Below is a screenshot of the visualization on a few data points.

Training accuracy: 93.4%. Test accuracy: 90.2% on 1000 test points.

Owner

Kaushal Shetty
