Course materials for a 3-day seminar "Machine Learning and NLP: Advances and Applications" at New College of Florida

Overview

Machine Learning and NLP: Advances and Applications

This repository hosts the course materials used for a 3-day seminar "Machine Learning and NLP: Advances and Applications" as part of Independent Study Period 2020 at New College of Florida.

Note that the seminar was held in Jan 2020, and the content may be a little bit oudated (as of Feb 2022). Please also refer to a Fall 2021 full semester course "CIS6930 Topics in Computing for Data Science", which covers much wider (and a little bit newer) Deep Learning topics.

Syllabus

Course Description

This 3-day course provides students with an opportunity to learn Machine Learning and Natural Language Processing (NLP) from basics to applications. The course covers some state-of-the-art NLP techniques including Deep Learning. Each day consists of a lecture and a hands-on session to help students learn how to apply those techniques to real-world applications. During the hands-on session, students will be given assignments to develop programming code in Python. Three days are too short to fully understand the concepts that are covered by the course and learn to apply those techniques to actual problems. Students are strongly encouraged to complete reading assignments before the lecture to be ready for the course assignments, and bring a lot of questions to the course. :)

Learning Objectives

Students successfully completing the course will

  • demonstrate the ability to apply machine learning and natural language processing techniques to various types of problems.
  • demonstrate the ability to build their own machine learning models using Python libraries.
  • demonstrate the ability to read and understand research papers in ML and NLP.

Course Outline

  • Wed 1/22 Day 1: Machine Learning basics [Slides]

    • Machine learning examples
    • Problem formulation
    • Evaluation and hyper-parameter tuning
    • Data Processing basics with pandas
    • Machine Learning with scikit-learn
    • Hands-on material: [ipynb] Open In Colab
  • Thu 1/23 Day 2: NLP basics [Slides]

    • Unsupervised learning and visualization
    • Topic models
    • NLP basics with SpaCy and NLTK
    • Understanding NLP pipeline for feature extraction
    • Machine learning for NLP tasks (text classification, sequential tagging)
    • Hands-on material [ipynb] Open In Colab
    • Follow-up
      • Commonsense Reasoning (Winograd Schema Challenge)
  • Fri 1/24 Day 3: Advanced techniques and applications [Slides]

    • Basic Deep Learning techniques
    • Word embeddings
    • Advanced Deep Learning techniques for NLP
    • Problem formulation and applications to (non-)NLP tasks
    • Pre-training models: ELMo and BERT
    • Hands-on material: [ipynb] Open In Colab
    • Follow-up
      • The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time
      • Cross-lingual word/sentence embeddings

Reading Assignments & Recommendations:

The following online tutorials for students who are not familiar with the Python libraries used in the course. Each day will have a hands-on session that requires those libraries. Please do not expect to have enough time to learn how to use those libraries during the lecture.

The following list is a good starting point.

The course will cover the following papers as examples of (non-NLP) applications (probably in Day 3.) Students who'd like to learn how to apply Deep Learning techniques to your own problems are encouraged to read the following papers.

  • [1] A. Asai, S. Evensen, B. Golshan, A. Halevy, V. Li, A. Lopatenko, D. Stepanov, Y. Suhara, W.-C. Tan, Y. Xu, "HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments" Proc LREC 18, 2018. [Paper] [Dataset]
  • [2] S. Evensen, Y. Suhara, A. Halevy, V. Li, W.-C. Tan, S. Mumick, "Happiness Entailment: Automating Suggestions for Well-Being," Proc. ACII 2019, 2019. [Paper]
  • [3] Y. Suhara, Y. Xu, A. Pentland, "DeepMood: Forecasting Depressed Mood Based on Self-Reported Histories via Recurrent Neural Networks," Proc. WWW '17, 2017. [Paper]
  • [4] N. Bhutani, Y. Suhara, W.-C. Tan, A. Halevy, H. V. Jagadish, "Open Information Extraction from Question-Answer Pairs," Proc. NAACL-HLT 2019, 2019. [Paper]

Computing Resources:

The course requires students to write code:

  • Students are expected to have a personal computer at their disposal. Students should have a Python interpreter and the listed libraries installed on their machines.

The hands-on sessions will require the following Python libraries. Please install those libraries on your computer prior to the course. See also the reading assignment section for the recommended tutorials.

  • pandas
  • scikit-learn
  • gensim
  • spacy
  • nltk
  • torch (PyTorch)
Owner
Yoshi Suhara
Yoshi Suhara
Pylexa - Artificial Assistant made with Python

Pylexa - Artificial Assistant made with Python Alexa is a famous artificial assistant used massively across the world. It is a substitute of Alexa whi

\_PROTIK_/ 4 Nov 03, 2021
API moment - LussovAPI

LussovAPI TL;DR: py API container, pip install -r requirements.txt, example, main configuration Long version: Install Dependancies Download file requi

William Pedersen 1 Nov 30, 2021
Customisable coding font with alternates, ligatures and contextual positioning

Guide Ligature Support Links Log License Guide Live Preview + Download larsenwork.com/monoid Install Quit your editor/program. Unzip and open the fold

Andreas Larsen 7.6k Dec 30, 2022
Chemical Analysis Calculator, with full solution display.

Chemicology Chemical Analysis Calculator, to solve problems efficiently by displaying whole solution. Go to releases for downloading .exe, .dmg, Linux

Muhammad Moazzam 2 Aug 06, 2022
Google Foobar challenge solutions from my experience and other's on the web.

Google Foobar challenge Google Foobar challenge solutions from my experience and other's on the web. Note: Problems indicated with "Mine" are tested a

Islam Ayman 6 Jan 20, 2022
HogwartsRegister - A Hogwarts Register With Python

A Hogwarts Register Installation download code git clone https://github.com/haor

0 Feb 12, 2022
These are After Effects and Python files that were made in the process of creating the video for the contest.

spirograph These are After Effects and Python files that were made in the process of creating the video for the contest. In the python file you can qu

91 Dec 07, 2022
A Lego Mindstorm robot for dealing out cards based on a birds-eye view of a poker table and given ArUco fiducial tags.

A Lego Mindstorm robot for dealing out cards based on a birds-eye view of a poker table and given ArUco fiducial tags.

4 Dec 06, 2021
Intelligent Systems Project In Python

Intelligent Systems Project In Python

RLLAB 3 May 16, 2022
Скрипт позволяет выгрузить участников чатов/каналов(по чату для комментариев) и сообщения в различные форматы файлов.

TG-Parser Парсер участников и сообщений из ТГ-Чатов и чатов для комментариев в ТГ-Каналах Возможности Выгрузка участников групп/каналов(по чату для ко

50 Jan 06, 2023
Find functions without canary check (or similar)

Ghidra Check Protector Which non-trivial functions don't reference the stack canary checker (or other, user-defined function)? Place your cursor to th

buherator 3 Jan 17, 2022
Assignment for python course, BUPT 2021.

pyFuujinrokuDestiny Assignment for python course, BUPT 2021. Notice username and password must be ASCII encoding. If username exists in database, syst

Ellias Kiri Stuart 3 Jun 18, 2021
Auto check in via GitHub Actions

因为本人毕业离校,本项目交由在校的@hfut-xyc同学接手,请访问hfut-xyc/hfut_auto_check-in获得最新的脚本 本项目遵从GPLv2协定,Copyright (C) 2021, Fw[a]rd 免责声明 根据GPL协定,我、本项目的作者,不会对您使用这个脚本带来的任何后果

Fw[a]rd 3 Jun 27, 2021
A simple Programming Language

R.S.O.C. A custom built programming language About The Project R.S.O.C. is a custom built programming language very similar to a low-level 8085 progra

Ravi Maurya 17 Sep 13, 2022
Manually Install Python 2.7 pip without any problem !

Python2.7_install_pip Manually Install Python 2.7 pip without any problem ! Download installPip.py to your system and Run the code using this Command

Ali Jafari 1 Dec 09, 2021
Small tool to use hero .json files created with Optolith for The Dark Eye/ Das Schwarze Auge 5 to perform talent probes.

DSA5-ProbeMaker A little tool for The Dark Eye 5th Edition (Das Schwarze Auge 5) to load .json from Optolith character generation and easily perform t

2 Jan 06, 2022
Python Cheat Sheet

Introduction Pysheeet was created with intention of collecting python code snippets for reducing coding hours and making life easier and faster. Any c

CHANG-NING TSAI 7.5k Dec 30, 2022
Ultimate Score Server for RealistikOsu

USSR Ultimate Score Server for RealistikOsu (well not just us but it makes the acronym work.) Also I wonder how long this name will last. What is this

RealistikOsu! 15 Dec 14, 2022
Weblate is a copylefted libre software web-based continuous localization system

Weblate is a copylefted libre software web-based continuous localization system, used by over 2500 libre projects and companies in more than 165 count

Weblate 7 Dec 15, 2022
A class to draw curves expressed as L-System production rules

A class to draw curves expressed as L-System production rules

Juna Salviati 6 Sep 09, 2022