NLP: The IR Perspective - 0368.4341.01

Eytan Ruppin ( ruppin@math.tau.ac.il ) 
2nd Semester, 2002 - Monday 14-17 
School of Computer Sciences, 
Tel-Aviv University

Textbooks:

Foundations of Statistical Natural Language Processing by Chris Manning and Hinrich Schütze, MIT Press, 1999.
Statistical Language Learning by Eugene Charniak, MIT Press, 1993.

Course syllabus:

This course describes methods for information retrieval, relying mainly on statistical natural language processing. Topics covered include:
1. Basics of statistical NLP: morphology, syntax, semantics, language entropy, Markov chains, hidden Markov models, probabilistic context-free grammars.
2. Text Retrieval: vector space models, latent semantic indexing, HAL, semantic networks, link analysis.
3. Text Classification: Naive Bayes, decision trees, neural networks, and K-nearest neighbors.
4. Text Clustering: hierarchical algorithms, non-hierarchical clustering, K-means, the EM algorithm.
5. Extras: Collocations, semantic disambiguation, text summarization.
6. IR on the Web: search engines, directories, knowledge management.
Last updated February 2002