A repo for open resources & information for people to succeed in PhD in CS & career in AI / NLP

Overview

Resources to Help Global Equality for PhDs in NLP / AI

This repo originates with a wish to promote Global Equality for people who want to do a PhD in NLP, following the idea that mentorship programs are an effective way to fight against segregation, according to The Human Networks (Jackson, 2019). Specifically, we wish people from all over the world and with all types of backgrounds can share the same source of information, so that success will be a reward to those who are determined and hardworking, regardless of external contrainsts.

One non-negligible reason for success is access to information, such as (1) knowing what a PhD in NLP is like, (2) knowing what top grad schools look for when reviewing PhD applications, (3) broadening your horizon of what is good work, (4) knowing how careers in NLP in both academia and industry are like, and many others.

Contributor: Zhijing Jin (PhD student in NLP at Max Planck Institute, co-organizer of the ACL Year-Round Mentorship Program).

You are welcome to be a collaborator, -- you can make an issue/pull request, and I can add you :).

Endorsers of this repo: Prof Rada Mihalcea (University of Michigan). Please add your name here (by a pull request) if you endorse this repo :).

Contents (Actively Updating)

Top Resources

  1. Online ACL Year-Round Mentorship Program: https://acl-mentorship.github.io (You can apply as a mentee, as a mentor, or as a volunteer. For mentees, you will be able to attend monthly zoom Q&A sessions hosted senior researchers in NLP. You will also join a global slack channel, where you can constantly post your questions, and we will collect answers from senior NLP researchers.)

Stage 1. (Non-PhD -> PhD) How to Apply to PhD?

  1. (Prof Philip [email protected]) Finding CS Ph.D. programs to apply to. [Video]

  2. (Prof Mor Harchol-Balter@CMU) Applying to Ph.D. Programs in Computer Science (2014). [Guide]

  3. (Prof Jason [email protected]) Advice for Research Students (last updated: 2021). [List of suggestions]

  4. (CS Rankings) Advice on Applying to Grad School in Computer Science. [Pointers]

  5. (Nelson Liu, [email protected]) Student Perspectives on Applying to NLP PhD Programs (2019). [Suggestions Based on Surveys]

  6. A Princeton CS Major's Guide to Applying to Graduate School. [List of suggestions]

  7. (John Hewitt, [email protected]) Undergrad to PhD, or not - advice for undergrads interested in research (2018). [Suggestions]

  8. (Kalpesh Krishna, [email protected] Amherst) Grad School Resources (2018). [Article] (This list lots of useful pointers!)

  9. (Prof Scott E. [email protected]) Quora answers on the LTI program at CMU (2017). [Article]

  10. (Albert Webson et al., [email protected] University) Resources for Underrepresented Groups, including Brown's Own Applicant Mentorship Program (2020, but we will keep updating it throughout the 2021 application season.) [List of Resources]

Specific Suggestions

  1. (Prof Nathan [email protected] University) Inside Ph.D. admissions: What readers look for in a Statement of Purpose. [Article]

Improve Your Proficiency with Tools

  1. (MIT 2020) The Missing Semester of Your CS Education (e.g., master the command-line, ssh into remote machines, use fancy features of version control systems).

Stage 2. (Doing PhD) How to Succeed in PhD?

  1. (Maxwell Forbes, [email protected]) Every PhD Is Different. [Suggestions]

  2. (Prof Mark [email protected], Prof Hanna M. [email protected] Amherst) How to be a successful PhD student (in computer science (in NLP/ML)). [Suggestions]

  3. (Andrej Karpathy) A Survival Guide to a PhD (2016). [Suggestions]

  4. (Prof Kevin [email protected]) Kevin Gimpel's Advice to PhD Students. [Suggestions]

  5. (Prof Marie [email protected] University) How to Succeed in Graduate School: A Guide for Students and Advisors (1994). [Article] [Part II]

  6. (Prof Eric [email protected]) Syllabus for Eric’s PhD students (incl. Prof's expectation for PhD students). [syllabus]

  7. (Prof H.T. [email protected]) Useful Thoughts about Research (1987). [Suggestions]

  8. (Prof Phil [email protected]) Networking on the Network: A Guide to Professional Skills for PhD Students (last updated: 2015). [Suggestions]

  9. (Prof Stephen C. [email protected]) Some Modest Advice for Graduate Students. [Article]

  10. (Prof Tao [email protected]) Graduate Student Survival/Success Guide. [Slides]

  11. (Mu [email protected]) 博士这五年 (A Chinese article about five years in PhD at CMU). [Article]

  12. (Karl Stratos) A Note to a Prospective Student. [Suggestions]

What Is Weekly Meeting with Advisors like?

  1. (Prof Jason [email protected]) What do PhD students talk about in their once-a-week meetings with their advisers during their first year? (2015). [Article]

  2. (Brown University) Guide to Meetings with Your Advisor. [Suggestions]

Practical Guides

  1. (Prof Srinivasan [email protected]) How to Read a Paper (2007). [Suggestions]

  2. (Prof Jason [email protected]) How to Read a Technical Paper (2009). [Suggestions]

  3. (Prof Jason [email protected]) How to write a paper? (2010). [Suggestions]

Memoir-Like Narratives

  1. (Prof Philip [email protected]) The Ph.D. Grind: A Ph.D. Student Memoir (last updated: 2015). [Video] (For the book, you have to dig deeply, and then you will find the book.)

  2. (Prof Tianqi [email protected]) 陈天奇:机器学习科研的十年 (2019) (A Chinese article about ten years of research in ML). [Article]

  3. (Jean Yang) What My PhD Was Like. [Article]

How to Excel Your Research

  1. The most important step: (Prof Jason [email protected]) How to Find Research Problems (1997). [Suggestions]

Grad School Fellowships

  1. (List compiled by CMU) Graduate Fellowship Opportunities [link]
  2. CYD Fellowship for Grad Students in Switzerland [link]

Other Books

  1. The craft of Research by Wayne Booth, Greg Colomb and Joseph Williams.

  2. How to write a better thesis by Paul Gruba and David Evans

  3. Helping Doctoral Students to write by Barbara Kamler and Pat Thomson

  4. The unwritten rules of PhD research by Marian Petre and Gordon Rugg

Stage 3. (After PhD -> Industry) How is life as an industry researcher?

  1. (Mu [email protected]) 工作五年反思 (A Chinese article about reflections on the five years working in industry). [Article]

Stage 4. (Being a Prof) How to get an academic position? And how to be a good prof?

  1. (Prof Jason [email protected]) How to write an academic research statement (when applying for a faculty job) (2017). [Article]

  2. (Prof Jason [email protected]) How to Give a Talk (2015). [Suggestions]

  3. (Prof Jason [email protected]) Teaching Philosophy. [Article]

Stage 5. (Whole Career Path) How to live out a life career as an NLP research?

  1. (Prof Charles [email protected] University, Prof Qiang [email protected])Crafting Your Research Future: A Guide to Successful Master's and Ph.D. Degrees in Science & Engineering. [Book]

Further Readings: Technical Materials to Improve Your NLP Research Skills

  1. (Prof Jason [email protected]) Technical Tutorials, Notes, and Suggested Reading (last updated: 2018) [Reading list]

Contributions

All types of contributions to this resource list is welcome. Feel free to open a Pull Request.

Contact: Zhijing Jin, PhD in NLP at Max Planck Institute for Intelligent Systems, working on NLP & Causality.

How to Cite This Repo

@misc{resources2021jin,
  author = {Zhijing Jin},
  title = {Resources to Help Global Equality for PhDs in NLP},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/zhijing-jin/nlp-phd-global-equality}}
}
Owner
PhD in NLP & Causality. Affiliated with Max Planck Institute, Germany & ETH & UMich. Supervised by Bernhard Schoelkopf, Rada Mihalcea, and Mrinmaya Sachan.
Simple telegram bot to convert files into direct download link.you can use telegram as a file server 🪁

TGCLOUD 🪁 Simple telegram bot to convert files into direct download link.you can use telegram as a file server 🪁 Features Easy to Deploy Heroku Supp

Mr.Acid dev 6 Oct 18, 2022
text to speech toolkit. 好用的中文语音合成工具箱,包含语音编码器、语音合成器、声码器和可视化模块。

ttskit Text To Speech Toolkit: 语音合成工具箱。 安装 pip install -U ttskit 注意 可能需另外安装的依赖包:torch,版本要求torch=1.6.0,=1.7.1,根据自己的实际环境安装合适cuda或cpu版本的torch。 ttskit的

KDD 483 Jan 04, 2023
Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).

For better performance, you can try NLPGNN, see NLPGNN for more details. BERT-NER Version 2 Use Google's BERT for named entity recognition (CoNLL-2003

Kaiyinzhou 1.2k Dec 26, 2022
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language mod

20.5k Jan 08, 2023
Natural Language Processing with transformers

we want to create a repo to illustrate usage of transformers in chinese

Datawhale 763 Dec 27, 2022
Code for paper: An Effective, Robust and Fairness-awareHate Speech Detection Framework

BiQQLSTM_HS Code and data for paper: Title: An Effective, Robust and Fairness-awareHate Speech Detection Framework. Authors: Guanyi Mou and Kyumin Lee

Guanyi Mou 2 Dec 27, 2022
Implementation of legal QA system based on SentenceKoBART

LegalQA using SentenceKoBART Implementation of legal QA system based on SentenceKoBART How to train SentenceKoBART Based on Neural Search Engine Jina

Heewon Jeon(gogamza) 75 Dec 27, 2022
Final Project Bootcamp Zero

The Quest (Pygame) Descripción Este es el repositorio de código The-Quest para el proyecto final Bootcamp Zero de KeepCoding. El juego consiste en la

Seven-z01 1 Mar 02, 2022
RIDE automatically creates the package and boilerplate OOP Python node scripts as per your needs

RIDE: ROS IDE RIDE automatically creates the package and boilerplate OOP Python code for nodes as per your needs (RIDE is not an IDE, but even ROS isn

Jash Mota 20 Jul 14, 2022
Black for Python docstrings and reStructuredText (rst).

Style-Doc Style-Doc is Black for Python docstrings and reStructuredText (rst). It can be used to format docstrings (Google docstring format) in Python

Telekom Open Source Software 13 Oct 24, 2022
FewCLUE: 为中文NLP定制的小样本学习测评基准

FewCLUE: 为中文NLP定制的小样本学习测评基准

CLUE benchmark 387 Jan 04, 2023
Use PaddlePaddle to reproduce the paper:mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer

MT5_paddle Use PaddlePaddle to reproduce the paper:mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer English | 简体中文 mT5: A Massively

2 Oct 17, 2021
Unsupervised Language Modeling at scale for robust sentiment classification

** DEPRECATED ** This repo has been deprecated. Please visit Megatron-LM for our up to date Large-scale unsupervised pretraining and finetuning code.

NVIDIA Corporation 1k Nov 17, 2022
Python library for parsing resumes using natural language processing and machine learning

CVParser Python library for parsing resumes using natural language processing and machine learning. Setup Installation on Linux and Mac OS Follow the

nafiu 0 Jul 29, 2021
Multilingual Emotion classification using BERT (fine-tuning). Published at the WASSA workshop (ACL2022).

XLM-EMO: Multilingual Emotion Prediction in Social Media Text Abstract Detecting emotion in text allows social and computational scientists to study h

MilaNLP 35 Sep 17, 2022
Code for text augmentation method leveraging large-scale language models

HyperMix Code for our paper GPT3Mix and conducting classification experiments using GPT-3 prompt-based data augmentation. Getting Started Installing P

NAVER AI 47 Dec 20, 2022
Generate custom detailed survey paper with topic clustered sections and proper citations, from just a single query in just under 30 mins !!

Auto-Research A no-code utility to generate a detailed well-cited survey with topic clustered sections (draft paper format) and other interesting arti

Sidharth Pal 20 Dec 14, 2022
This library is testing the ethics of language models by using natural adversarial texts.

prompt2slip This library is testing the ethics of language models by using natural adversarial texts. This tool allows for short and simple code and v

9 Dec 28, 2021
Code for our paper "Transfer Learning for Sequence Generation: from Single-source to Multi-source" in ACL 2021.

TRICE: a task-agnostic transferring framework for multi-source sequence generation This is the source code of our work Transfer Learning for Sequence

THUNLP-MT 9 Jun 27, 2022
Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging and so on.

anaGo anaGo is a Python library for sequence labeling(NER, PoS Tagging,...), implemented in Keras. anaGo can solve sequence labeling tasks such as nam

Hiroki Nakayama 1.5k Dec 05, 2022