Natural Language Processing

Here you will find the teaching materials for the "Natural Language Processing" course at EDHEC Business School, 2022.

What is the course about?

The course introduces the basics of natural language processing for analyzing unstructured, user-generated content. It is aimed at beginners to NLP, but basic knowledge of Python and some familiarity with data science techniques will be helpful.

Topics covered include:

text preprocessing in Python,
collecting your own data from Twitter and Reddit,
content analysis,
text embeddings, and
supervised learning with text data.

What materials are available here?

The slides will be posted on the course BlackBoard page. They mostly serve as a high-level introduction to the examples and exercises (in Colab notebooks), which are linked from the slides themselves. Copies of the Colab notebooks can also be found in the /colab folder of this repository.

Can I work through the material on my own?

If you didn't attend the class, you can certainly work through the materials on your own: the Colab notebooks are designed to be readable and doable for individuals working at their own pace. The slides posted on BlackBoard will guide you through the content. The notebooks are intended to be worked through in order. Each one has examples to view and one or two practice exercises to complete.

Acknowledgements

I would like to acknowledge Steve Wilson at Oakland University for making his DS3 workshop materials publicly available under an MIT license.
