Advanced Methods in Natural Language Processing - Spring 2018

When: Tue, 13-16
Where: Orenstein 103
Instructor: Jonathan Berant
Graders: Ben Bogin (benb969), Mor Geva (mega.mor), Omri Koshorek (ko.omri), all at gmail.
Office hours: Coordinate by e-mail
Forum: Moodle

Summary of home assignment grades is here.
Final date for submitting projects is Sep 5th.
Grades for ex. 3 are up. We will publish a summary of late days soon.
Grades for ex. 5 are up.
Guidelines for the RST default project are here.
Guidelines and pointers for using Google Cloud Platform are here.
Grades for ex. 2 are up.
Grades for ex. 4 are up.
Please check Moodle for an announcement about next week's presentations.
Please check Moodle for an announcement about final projects.
Welcome to the NLP class!

Natural Language Processing (NLP) aims to develop methods for processing, analyzing and understanding natural language. The goal of this class is to provide a thorough overview of modern methods in the field of Natural Language Processing. The class will not assume prior knowledge in NLP, and will mostly focus on methods from structured prediction and deep learning.

Machine learning is a prerequisite for this class. If you want to attend and did not take any machine learning class or something equivalent (MOOCs do not count), you should talk to the instructor. Some assignments will include writing code in Python, while in others you are free to choose any programming language.

Homework assignments: There will be 5 homework assignments that will constitute 50% of the final grade. Assignments should be submitted in groups of three according to the instructions on the assignment. You get 5 late days in total for the semester; once they are used up, each additional late day costs 5 points on the late assignment.
Project: A final project will constitute 50% of the final grade. There will be one or more default projects, where we will define an end task and you will build a model from scratch and run empirical experiments. You will be judged on the soundness of your model, your empirical results, the writing of the final report, and your code.
You can also decide to do a research project, where you choose a research problem and attack it (see example projects from last year below). Research projects will also be judged on the choice of problem.
Projects will be done in groups of three or four and will be presented in the last two classes (10 min. per group). Around September, you will submit a 6-page double-column report that summarizes your findings. Every late day will cause a deduction of 3 points. Research projects will also be judged on the originality and merit of the proposed research. For research projects it is possible, and recommended, to do one project jointly with the advanced machine learning class. Projects done jointly with the ML class are expected to be of high quality, striving towards a publication.

Lecture notes: Michael Collins, Notes on statistical NLP
Textbook: Jurafsky and Martin, Speech and Language Processing: An Introduction to Natural Language Processing
Textbook: Smith, Linguistic structure prediction
Article: Yoav Goldberg's primer on NNs for NLP
Textbook: A Course in Machine Learning

Date	Topic	Reading	Comments
6/3	Introduction Word embeddings	word2vec, GloVe
13/3	Word embeddings	Embeddings as matrix factorization	Assign. 1 zipped code Grades
20/3	Language models neural LMs, FFNNs	Michael Collins' lecture notes, Neural embeddings, Backpropagation
10/4	Recurrent language models RNNs, LSTMs, GRUs more LSTM GRU TensorFlow tutorial	Training RNNs, LSTMs	Assign. 2 Grades
17/4	Tagging Log-linear models	Michael Collins' lecture notes, Michael Collins' HMM notes, Michael Collins' LLM notes, MEMMs, FAQ
24/4	Globally-normalized linear models, Deep learning for tagging	CRFs and label bias, Globally vs. locally normalized models, BiLSTM CRF for tagging	Assign. 3 Grades
1/5	Introduction to parsing CKY	PCFG lecture notes
8/5	Lexicalized PCFGs Syntactic parsing	Lexicalized PCFG lecture notes; Ratnaparkhi, 97; Hall et al., 14; Shift-reduce parsing	Assign. 4 Grades Sol. 1 Sol. 2
15/5	RST Deep syntactic parsing Semantic parsing intro	Neural CRF parsing, Minimal span-based neural parser, RNN grammars
22/5	Semantic parsing: Compositionality CCG Learning Parsing	Clarke et al., 2010, Liang et al., 2011, Artzi and Zettlemoyer, Berant et al., 2013, Berant and Liang, 2015	Assign. 5 Grades
29/5	Sequence to sequence	seq2seq, Attention, Pointer networks, Jia and Liang, 2016, Weak supervision, Guu et al., 2017
5/6	Weakly-supervised sequence to sequence models
12/6	Projects Coreference Relation extraction Reading comprehension Concluding remarks
