Efficient Program with Relations

One of the first things a programmer must commit to in developing any significant piece of software is the representation of the data. In applications where performance or memory consumption is important, this representation is often quite complex: the data may be indexed in multiple ways and use a variety of concrete, interlinked data structures. The current situation, in which programmers either directly write these data structures themselves or use a standard data structure library, leads to two problems:

  1. The particular choice of data representation is based on an expectation of what the most common workloads will be; that is, the programmer has already made cost-benefit trade-offs based on the expected distribution of operations the program will perform on these data structures.
  2. It is difficult for the programmer to check or even express the high-level consistency properties of complex structures, especially when these structures are shared. This also makes software verification in existing programming languages very hard.

In this project we investigate specification languages for describing and reasoning program data at a much higher level. The hope is that this can reduce the inherited complexity of reasoning about programs. In tandem, we will check if the high level specifications can be semi-automatically mapped specifications to efficient data representations. Our approach allows the user to define global invariants and a restricted set of high level operations, and only then to synthesize a representation that both adheres to the invariants and is highly specialized to exactly the set of operations the user requires.

More material can be found in An Introduction to Data Representation Synthesis.