Advanced Methods in Natural Language Processing – Spring 2018

When: Tue, 13-16
Where: Orenstein 103
Instructor: Jonathan Berant
Graders: Ben Bogin (benb969), Mor Geva (mega.mor), Omri Koshorek (ko.omri), all at gmail.
Office hours: Coordinate by e-mail
Forum: Moodle


Natural Language Processing (NLP) aims to develop methods for processing, analyzing and understanding natural language. The goal of this class is to provide a thorough overview of modern methods in the field. The class does not assume prior knowledge of NLP and will mostly focus on methods from structured prediction and deep learning.


Machine learning is a prerequisite for this class. If you want to attend and have not taken a machine learning class or something equivalent (MOOCs do not count), you should talk to the instructor. Some assignments will require writing code in Python, while in others you are free to choose any programming language.

  1. Homework assignments: There will be 5 homework assignments that will constitute 50% of the final grade. Assignments should be submitted in groups of three, according to the instructions on the assignment. You get 5 late days in total throughout the semester; after they are used up, each late day costs 5 points per assignment.
  2. Project: A final project will constitute 50% of the final grade. There will be one or more default projects, where we will define an end task and you will build a model from scratch and run empirical experiments. You will be judged on the soundness of your model, the empirical results, the writing of the final report, and the code.
    You can also decide to do a research project, where you choose a research problem and attack it (see example projects from last year below). Research projects will also be judged on the choice of problem.
    Projects will be done in groups of three to four and will be presented in the last two classes (10 min. per group). You will submit a 6-page double-column report that summarizes your findings around September. Every late day will cause a deduction of 3 points. Research projects will also be judged on the originality and merit of the proposed research. For research projects it is possible, and recommended, to do one project jointly with the advanced machine learning class. Projects done jointly with the ML class will be expected to be of high quality, striving towards a publication.
Recommended reading
Tentative schedule

Date | Topic | Reading | Comments
6/3 | Introduction; Word embeddings | word2vec, GloVe |
13/3 | Word embeddings | Embeddings as matrix factorization | Assign. 1, zipped code
20/3 | Language models; neural LMs, FFNNs, RNNs | Michael Collins' lecture notes, Neural embeddings, Backpropagation, Training RNNs |
10/4 | Tagging; Log-linear models | Michael Collins' HMM notes, Michael Collins' LLM notes, MEMMs, FAQ | Assign. 2
17/4 | Global linear models | Michael Collins' lecture notes, CRFs and label bias |
24/4 | Syntax, grammars; Lexicalized PCFGs | PCFG lecture notes, Lexicalized PCFG lecture notes | Assign. 3
1/5 | Discriminative models for parsing | Ratnaparkhi, 97; Hall et al., 14; Neural CRF parsing; Shift-reduce parsing |
8/5 | Semantic parsing | Compositionality (Liang and Potts), Artzi et al. tutorial | Assign. 4
15/5 | Semantic parsing | Clarke et al., 2010; Liang et al., 2011; Artzi and Zettlemoyer; Berant et al., 2013; Berant and Liang, 2015 |
22/5 | Sequence to sequence | LSTM; seq2seq; GRU; Attention; Pointer networks; Jia and Liang, 2016; Weak supervision; Guu et al., 2017 | Assign. 5
29/5 | RL, Tree-RNN | CVG, CCG with guarantees |
5/6 | Projects | |
12/6 | Projects | |

Research projects from last year