Advanced Methods in Natural Language Processing – Spring 2019

When: Tue, 13-16
Where: Dan David 001
Instructor: Jonathan Berant
Graders: Ohad Rubin (iohadrubin), Omri Koshorek (ko.omri), both at gmail.
Office hours: Coordinate by e-mail
Forum: Moodle

News
Overview

Natural Language Processing (NLP) aims to develop methods for processing, analyzing, and understanding natural language. The goal of this class is to provide a thorough overview of modern methods in the field. The class will not assume prior knowledge in NLP.

Prerequisites

Machine learning is a prerequisite for this class. If you want to attend and have not taken a machine learning class or something equivalent (MOOCs do not count), you should talk to the instructor. Some assignments will require writing code in Python; in others you are free to choose any programming language.

Grading
  1. Homework assignments: There will be 5 homework assignments that will constitute 50% of the final grade. Assignments should be submitted in pairs according to the instructions on the assignment. Copying in an assignment will result in a grade of zero for that assignment. You get 5 late days throughout the semester; after those are used, 5 points are deducted per late day per assignment.
  2. Project: A final project will constitute 50% of the final grade. There will be a default project, where we will define an end task and you will build a model from scratch and run empirical experiments. You will be judged on the soundness of your model, your empirical results, the writing of the final report, and your code.
    You can also decide to do a research project, where you choose a research problem and attack it (see example projects from last year below). Research projects will also be judged on the choice of problem and creativity.
    Projects will be done in groups of 2-3 students. An outline of the planned project will be presented in the final class (5 min. per group). You will submit a 6-page double-column report that summarizes your findings around September. Every late day will cause a deduction of 3 points. For research projects it is possible, and recommended, to do one project jointly with the advanced machine learning class. Projects done jointly with the ML class will be expected to be of high quality, striving towards a publication. All students in a group must be enrolled in both the NLP and advanced ML classes to have a joint project.
Recommended reading
Tentative schedule

5/3
  Topic: Introduction; Word embeddings
  Reading: word2vec, GloVe

12/3
  Topic: Word embeddings
  Reading: Embeddings as matrix factorization
  Comments: Assign. 1, Grades

19/3
  Topic: Language models (n-gram language models, perplexity, feed-forward LM, RNN LM)
  Reading: Michael Collins' lecture notes, Neural embeddings, Backpropagation

26/3
  Topic: Language models (vanishing gradients, LSTMs, GRUs); Contextualized word representations
  Reading: Training RNNs, LSTMs, Limits of language modeling, ELMo, BERT, Intro to contextualized word representations

2/4
  Topic: Tagging; Log-linear models
  Reading: Michael Collins' lecture notes, Michael Collins' HMM notes, Michael Collins' LLM notes, MEMMs, FAQ
  Comments: Assign. 2, Grades

30/4
  Topic: Globally-normalized linear models; Deep learning for tagging
  Reading: CRFs and label bias, Globally vs. locally normalized models, BiLSTM-CRF for tagging
  Comments: Assign. 3, Grades

7/5
  Topic: Introduction to parsing; CKY
  Reading: PCFG lecture notes

14/5
  Topic: Lexicalized PCFGs; Syntactic parsing
  Reading: Lexicalized PCFG lecture notes, Ratnaparkhi 97, Hall et al. 14, Shift-reduce parsing
  Comments: Assign. 4, Grades

21/5
  Topic: Deep syntactic parsing; Sequence to sequence
  Reading: seq2seq, Attention, Pointer networks, Jia and Liang 2016

28/5
  Topic: Semantic parsing introduction (compositionality, CCG, inference, learning, decoding)
  Reading: Weak supervision, Guu et al. 2017

4/6
  Topic: Projects

11/6
  Topic: Wrap-ups: semantic parsing inference (slides 1-15), constrained decoding (slides 21-40), subword models, concluding remarks

Research projects from 2018
Research projects from 2017