A list of NLP (Natural Language Processing) tutorials

Last update: Dec 25, 2022

Overview

NLP Tutorial

A list of NLP (Natural Language Processing) tutorials built on PyTorch.

Each tutorial walks step by step through implementing a model and adapting it to a simple real-world NLP task.

Text Classification

News Category Classification

This repo provides a simple PyTorch implementation of text classification, with simple annotations. It uses the HuffPost news corpus, in which each article is labeled with its category. The classification model trained on this dataset identifies the category of a news article from its headline and description.
Keyword: CBoW, LSTM, fastText, Text categorization
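As a rough illustration of the CBoW idea listed above, the sketch below averages word vectors into one feature vector and scores it against per-class weights. It is framework-free on purpose. The embedding table, the class weights, and the names `cbow_vector` and `classify` are hypothetical and not taken from the repo, where these values would be learned with PyTorch.

```python
# Hypothetical toy embedding table; a real model learns these vectors.
EMBEDDINGS = {
    "stocks": [1.0, 0.0],
    "market": [0.8, 0.2],
    "goal": [0.0, 1.0],
    "match": [0.2, 0.8],
}
UNK = [0.0, 0.0]  # vector used for out-of-vocabulary tokens

# Hypothetical per-class weight vectors (one linear layer, no bias).
CLASS_WEIGHTS = {"BUSINESS": [1.0, -1.0], "SPORTS": [-1.0, 1.0]}


def cbow_vector(tokens):
    """Average the embeddings of all tokens (continuous bag of words)."""
    vectors = [EMBEDDINGS.get(token, UNK) for token in tokens]
    if not vectors:
        return list(UNK)
    dim = len(UNK)
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def classify(tokens, class_weights=CLASS_WEIGHTS):
    """Score each class with a dot product and return the best label."""
    features = cbow_vector(tokens)
    scores = {
        label: sum(w * x for w, x in zip(weights, features))
        for label, weights in class_weights.items()
    }
    return max(scores, key=scores.get)


print(classify("stocks market".split()))
print(classify("goal match".split()))
```

Word order is ignored by construction, which is the main limitation an LSTM classifier addresses.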

IMDb Movie Review Classification

This text classification tutorial trains a transformer model on the IMDb movie review dataset for sentiment analysis. It provides a simple PyTorch implementation, with simple annotation.
Keyword: Transformer, Sentiment analysis
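The core operation inside a Transformer is scaled dot-product attention. The sketch below spells it out with plain Python lists so every step is visible. It is a minimal single-head version without masking or learned projections, and the function names are illustrative rather than the tutorial's own code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return outputs


# With identical keys the weights are uniform, so the output is the mean value.
print(attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [4.0, 6.0]]))
```

In PyTorch the same computation is done on batched tensors, usually with several heads in parallel.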

Question-Answer Matching

This repo provides a simple PyTorch implementation of question-answer matching. It uses a corpus from Stack Exchange to build an embedding for each entire question. Given a new question, those embeddings are used to find similar questions and show the answers that belong to them.
Keyword: CBoW, TF-IDF, LSTM with variable-length sequences
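To make the retrieval step concrete, here is a minimal sketch of finding the most similar question with TF-IDF vectors and cosine similarity. It assumes questions are already tokenized and uses a smoothed IDF formula chosen for this example. The function names and the sample questions are hypothetical, and the repo's own weighting may differ.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption of this sketch), so no weight drops to zero.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_index, vectors):
    """Index of the other question closest to vectors[query_index].

    Needs at least two questions in `vectors`.
    """
    scores = [
        (cosine(vectors[query_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    return max(scores)[1]


questions = [
    "how to sort a list in python".split(),
    "python sort list of tuples".split(),
    "best pasta recipe".split(),
]
vectors = tfidf_vectors(questions)
print(most_similar(0, vectors))
```

Once the nearest question is found, its stored answers can be returned for the query.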

Movie Review Classification (Korean NLP)

This repo provides a simple Keras implementation of TextCNN for text classification. It uses a corpus of movie reviews written in Korean. The model trained on this dataset identifies the sentiment of a review from its text.
Keyword: TextCNN, Sentiment analysis

Neural Machine Translation

English to French Translation - seq2seq

This neural machine translation tutorial trains a seq2seq model on many thousands of English-French sentence pairs to translate from English to French. It provides an intrinsic/extrinsic comparison of various sequence-to-sequence (seq2seq) models in translation.
Keyword: sequence to sequence network (seq2seq), Attention, Autoregressive, Teacher-forcing
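The difference between teacher forcing and autoregressive decoding comes down to what the decoder is fed at each step. The sketch below shows both with plain token lists. `toy_step` is a hypothetical stand-in for a trained decoder, and the `<sos>`/`<eos>` markers are assumed conventions rather than the tutorial's exact tokens.

```python
SOS, EOS = "<sos>", "<eos>"


def teacher_forced_inputs(target_tokens):
    """Training inputs: the gold target shifted right by one position."""
    return [SOS] + list(target_tokens)


def expected_outputs(target_tokens):
    """Training labels: the gold target followed by the end marker."""
    return list(target_tokens) + [EOS]


def greedy_decode(step_fn, max_len=10):
    """Inference: feed each predicted token back in as the next input."""
    tokens = [SOS]
    for _ in range(max_len):
        next_token = step_fn(tokens)
        if next_token == EOS:
            break
        tokens.append(next_token)
    return tokens[1:]


def toy_step(prefix):
    """Hypothetical decoder that always emits one fixed sentence."""
    sentence = ["je", "suis", "étudiant", EOS]
    return sentence[min(len(prefix) - 1, len(sentence) - 1)]


target = ["je", "suis", "étudiant"]
print(teacher_forced_inputs(target))
print(expected_outputs(target))
print(greedy_decode(toy_step))
```

During training the decoder sees the gold prefix at every step, so one early mistake does not derail the rest of the sequence. At inference time only its own predictions are available.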

French to English Translation - Transformer

This neural machine translation tutorial trains a Transformer model on many thousands of French-English sentence pairs to translate from French to English. It provides a simple PyTorch implementation, with simple annotations.
Keyword: Transformer, SentencePiece

Natural Language Understanding

Neural Language Model

This repo provides a simple PyTorch implementation of Neural Language Model for natural language understanding. Here we implement unidirectional/bidirectional language models, and pre-train language representations from unlabeled text (Wikipedia corpus).
Keyword: Autoregressive language model, Perplexity
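Perplexity, the metric listed above, is the exponential of the average negative log-likelihood that a model assigns to each token. A minimal sketch follows, assuming the per-token probabilities have already been read off the model. The function name is illustrative.

```python
import math


def perplexity(token_probs):
    """exp of the mean negative log-probability over the tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


# A model that spreads its belief evenly over 4 choices at every step
# has a perplexity of about 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

Lower is better: a perplexity of k means the model is, on average, as uncertain as if it were choosing uniformly among k tokens.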

Owner

Allen Lee
