A combination of autoregressors and autoencoders using XLNet for sentiment analysis

Last update: Nov 20, 2021

Overview

Abstract

In this work we perform sentiment analysis in order to evaluate the performance of XLNet on this particular task. XLNet is a language-understanding network that uses the strengths of both autoregressive and autoencoding models. BERT is an autoencoding model that reconstructs masked tokens from their bidirectional context, whereas conventional autoregressive language models predict each token from the preceding context only. XLNet combines the attributes of both approaches in order to achieve higher performance in many NLP tasks, such as sentiment analysis, question answering, reading comprehension, and natural language understanding. We evaluate the XLNet model on several sentiment classification tasks in terms of accuracy and efficiency. XLNet reaches state-of-the-art results and outperforms BERT, which was the previous state-of-the-art model in natural language processing.
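As a rough illustration of how autoregressive prediction can still draw on context from both sides, the sketch below builds the attention mask implied by a factorization order, in the spirit of XLNet's permutation language modeling. It is a minimal sketch in plain Python and is not taken from this repository; the function name `permutation_mask`, the example sentence, and the example order are all assumptions made for illustration.

```python
def permutation_mask(order):
    """Build an attention mask from a factorization order.

    `order` lists token positions in the order they are predicted.
    mask[i][j] is True when position i may attend to position j,
    i.e. when j is predicted strictly before i in `order`.
    (Hypothetical helper, for illustration only.)
    """
    n = len(order)
    rank = [0] * n
    for step, position in enumerate(order):
        rank[position] = step
    return [[rank[j] < rank[i] for j in range(n)] for i in range(n)]


tokens = ["the", "movie", "was", "great"]

# Plain left-to-right order: each token sees only the tokens before it.
causal = permutation_mask([0, 1, 2, 3])

# An arbitrary example order: "was" is predicted first, then "the",
# then "great", then "movie".
permuted = permutation_mask([2, 0, 3, 1])

for name, mask in (("left-to-right", causal), ("permuted", permuted)):
    print(name)
    for token, row in zip(tokens, mask):
        visible = [tokens[j] for j, allowed in enumerate(row) if allowed]
        print(f"  {token!r} can see {visible}")
```

Under the permuted order, "movie" is predicted last and can therefore see words on both its left and its right, while each prediction still conditions only on tokens that come earlier in the order.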

This was an assignment for the Deep Learning course in the PhD program of the National Technical University of Athens.

Team of 3 people
Runs were made on HPC-ARIS through batch scripts
Course grade: 10/10 (excellent)
Full report, formatted as a paper, available here
Code for 2 of the 3 sentiment analysis tasks (implemented by the author of this repo) available here
Data available here

Owner

James Zaridis
