Natural Language Processing – Fall 2019/20

When: Wed, 9-12
Where: Dach 005
Instructor: Jonathan Berant
Grader: Ohad Rubin (iohadrubin at gmail)
TAs (assignment authors): Avia Efrat (avia.efrat at gmail), Elad Segal (eladsegal at gmail)
Office hours: Coordinate by e-mail
Forum: Moodle


Natural Language Processing (NLP) aims to develop methods for processing, analyzing, and understanding natural language. The goal of this class is to provide a thorough overview of modern methods in the field. The class will not assume prior knowledge in NLP.


Introduction to Machine Learning is a prerequisite for this class. Because the class moved to the fall semester, students will be allowed to take the prerequisite in parallel this year, but they will have to make up the missing material on their own. If you want to attend and have not taken a machine learning class or something equivalent (MOOCs do not count), you should talk to the instructor. Assignments involve writing code, mostly in Python.

  1. Homework assignments: There will be five homework assignments, which together constitute 50% of the final grade. Assignments should be submitted in pairs according to the instructions in each assignment. Copying on an assignment will result in a grade of zero for that assignment. You get 5 late days for the entire semester; beyond that, 5 points are deducted per late day per assignment.
  2. Project: A final project will constitute 50% of the final grade. There will be a default project, announced around the middle of the semester. The final submission date for the project is April 21, 2020.
    You can also choose to do a research project, where you pick a research problem and attack it (see example projects from previous years below).
    Projects will be done in groups of two to three students. An outline of the planned project will be presented in the final class (5 min. per group). You will submit a 6-page double-column report (ACL/ICML style) summarizing your findings. Every late day incurs a deduction of 3 points. For research projects it is possible, and recommended, to do a single project jointly with the advanced machine learning class. Projects done jointly with the ML class are expected to be of high quality, striving towards a publication. All students in a group must be enrolled in both the NLP and advanced ML classes to have a joint project; if you want an exception, talk to the instructors.
Recommended reading
Tentative schedule

30/10 – Introduction; Word embeddings
    Reading: Efficient Estimation of Word Representations in Vector Space; word2vec explained
6/11 – Word embeddings
    Reading: SVD; Embedding as factorization
20/11 – Language models: n-gram language models, perplexity, feed-forward LM, RNN LM
    Reading: n-gram language models; GPT-2; Neural language model (FFNN); Training RNNs; Exploring the limits of language modeling
27/11 – Language models: vanishing gradient, LSTMs, GRUs, Transformers; Contextualized word representations
    Reading: LSTM; GRU; Transformers; ELMO; BERT; Smith survey
4/12 – Question answering
11/12 – Tagging; Log-linear models
    Reading: Transformation-based tagging; HMMs for NER; Viterbi; HMM for POS tagging
18/12 – Log-linear models; Globally-normalized linear models; Deep learning for tagging
    Reading: MEMM for POS tagging; MEMM for Information Extraction; CRFs; BiLSTM POS tagger; BiLSTM CRF for NER
1/1 – Introduction to parsing
8/1 – Lexicalized PCFGs; Syntactic parsing
    Reading: Lexicalized PCFGs (Charniak); Lexicalized PCFGs (Collins); More features, less grammar; Shift-reduce parsing; Neural CRF; Span-based parsing; Span-based parsing with pre-trained representations
10/1 – Sequence-to-sequence models
    Reading: seq2seq; attention; Seq2seq semantic parsing; Pointer networks
15/1 – Attention; Pointer networks; Semantic parsing
22/1 – Subword models

Research projects from 2019
Research projects from 2018
Research projects from 2017