Workshop On Search Data Structures for High-Performance Databases

Fall 2018-9

Lecturers: Adam Morrison, Moshik Hershcovitch (IBM Research)

When: Tue, 16:00-18:00

Room: Ornstein 110

Course Overview

Database management systems (DBMSs) rely on index data structures to locate data in database tables efficiently, without having to search every row in the table. Essentially, an index implements a map interface, which maps from keys (such as Andy) to the table row in which a pre-specified column (such as First name) matches the key. Indexes typically also support ordered iteration, which allows efficiently retrieving a range of rows. One popular index choice, for example, is the B-tree.

Traditionally, OLTP (online transactional processing) DBMSs stored both their data and its indexes on disk. However, the overhead of managing disk-resident data has lead to the development of in-memory OLTP DBMSs. Such DBMSs store the entire database in main memory, and thereby avoid the overhead and complexities of working with slow, block-based disk devices. In-memory DBMSs are expected to gain further popularity due to the emergence of persistent memory (PM) technologies (such as 3D XPoint). PM provides the persistency and large storage capacity of disks, but its programming interface and performance are similar to DRAM—low-latency, byte-addressable reads and writes. PM will thus enable in-memory DBMSs to scale beyond the dataset sizes that fit into main memory.

In this workshop, we will explore various indexes, both published in the research literature and novel designs developed at the university. We will especially focus on space-efficiency: indexes consume a significant fraction of the memory used by a DBMS engine (up to 60%), and reducing this space consumption is an important problem. Several projects will be offered, all around implementing and evaluating DBMS indexes. Programming will be done in C++.