Tel-Aviv University - Computer Science Colloquium

Sunday, Dec 18, 2005, 11:15-12:15

Room 309
Schreiber Building


Robert Krauthgamer

IBM Research


Unstructured data: practical algorithms with rigorous analysis



The plethora of data available nowadays has tremendous potential, but

analyzing it presents a host of algorithmic challenges.  Current data

sets are mostly unstructured and noisy, requiring relatively

complicated computational tasks, such as sequence alignment and

similarity search.  Furthermore, the data's enormous size might

severely restrict the computational paradigm, e.g., to streaming or

sublinear-time algorithms.


I will present some recent rigorous algorithmic approaches aimed at

explaining and/or predicting success in practice.  In particular, for

problems that are hard in the worst-case, one may design a plausible

model for real-life data, and then exploit this ``additional

structure'' to devise improved algorithms.  My running example will be

near neighbor searching in regimes such as the Euclidean distance and

the edit distance.