Advanced Methods in Natural Language Processing – Spring 2019

When: Tue, 13-16
Where: Dan David 001
Instructor: Jonathan Berant
Graders: Ohad Rubin (iohadrubin), Omri Koshorek (ko.omri), both at gmail.
Office hours: Coordinate by e-mail
Forum: Moodle

News
Overview

Natural Language Processing (NLP) aims to develop methods for processing, analyzing, and understanding natural language. The goal of this class is to provide a thorough overview of modern methods in the field. The class will not assume prior knowledge in NLP.

Prerequisites

Machine learning is a prerequisite for this class. If you want to attend and have not taken a machine learning class or something equivalent (MOOCs do not count), you should talk to the instructor. Some assignments will require writing code in Python; in others you are free to choose any programming language.

Grading
  1. Homework assignments: There will be 5 homework assignments that will constitute 50% of the final grade. Assignments should be submitted in pairs according to the instructions on the assignment. Copying in an assignment will result in a grade of zero for that assignment. You get 5 late days throughout the semester; after those are used, 5 points are deducted per late day per assignment.
  2. Project: A final project will constitute 50% of the final grade. There will be a default project, where we will define an end task and you will build a model from scratch and run empirical experiments. You will be judged on the soundness of your model, your empirical results, the writing of the final report, and your code.
    You can also decide to do a research project, where you choose a research problem and attack it (see example projects from last year below). Research projects will also be judged on the choice of problem and creativity.
    Projects will be done in groups of 2-3 students. An outline of the planned project will be presented in the final class (5 min. per group). You will submit a 6-page double-column report that summarizes your findings around September. Every late day will cause a deduction of 3 points. For research projects it is possible, and recommended, to do one project jointly with the advanced machine learning class. Projects done jointly with the ML class will be expected to be of high quality, striving towards a publication. All students in a group must be enrolled in both the NLP and advanced ML classes to have a joint project.
Recommended reading
Tentative schedule

5/3
  Topic: Introduction; Word embeddings
  Reading: word2vec, GloVe

12/3
  Topic: Word embeddings
  Reading: Embeddings as matrix factorization
  Comments: Assign. 1, Grades

19/3
  Topic: Language models (n-gram language models, perplexity, feed-forward LM, RNN LM)
  Reading: Michael Collins' lecture notes, Neural embeddings, Backpropagation

26/3
  Topic: Language models (vanishing gradients, LSTMs, GRUs); Contextualized word representations
  Reading: Training RNNs, LSTMs, Limits of language modeling, ELMo, BERT, Intro to contextualized word representations

2/4
  Topic: Tagging; Log-linear models
  Reading: Michael Collins' lecture notes, Michael Collins' HMM notes, Michael Collins' LLM notes, MEMMs, FAQ
  Comments: Assign. 2, Grades

30/4
  Topic: Globally-normalized linear models; Deep learning for tagging
  Reading: CRFs and label bias, Globally vs. locally normalized models, BiLSTM-CRF for tagging
  Comments: Assign. 3, Grades

7/5
  Topic: Introduction to parsing; CKY
  Reading: PCFG lecture notes

14/5
  Topic: Lexicalized PCFGs; Syntactic parsing
  Reading: Lexicalized PCFG lecture notes, Ratnaparkhi 97, Hall et al. 14, Shift-reduce parsing
  Comments: Assign. 4, Grades

21/5
  Topic: Deep syntactic parsing; Sequence to sequence
  Reading: seq2seq, Attention, Pointer networks, Jia and Liang 2016

28/5
  Topic: Semantic parsing introduction (compositionality, CCG, inference, learning, decoding)
  Reading: Weak supervision, Guu et al. 2017

4/6
  Topic: Projects

11/6
  Topic: Wrap-ups: semantic parsing inference (slides 1-15), constrained decoding (slides 21-40), subword models, concluding remarks

Research projects from 2018
Research projects from 2017