Shape Analysis of Trees
I. Overview:
This directory implements operations on binary trees and on
binary-search trees (sorted binary trees).
The analysis reports an error when the data structure is no longer a
tree (or when the data ordering is violated). This is achieved via
data-structure invariants, which are encoded as TVLA messages. For
more information, search for messages with DSI in comments in action
Set_Sel_T of stat_tree.tvp and stat_sort.tvp. Some of the
restrictions may be relaxed in order to allow shared trees.
II. Programs, inputs, and running the analyses:
Programs to be analyzed are encoded in .tvp files. C code for some of
the programs is contained in corresponding .c files.
This directory contains a degenerate test, non_tree.tvp, binary-tree
traversal algorithms DSW.tvp and LDS.tvp, and binary-search-tree
routines InsertSorted(2|3).tvp and DeleteSorted(2).tvp. LDS.tvp is
the Lindstrom tree traversal algorithm, which is similar to DSW
(Deutsch-Schorr-Waite), but is specialized for binary trees and is
constant-space (it uses no child counters/markers). The C code for
these two algorithms is in DSW.c and LDS.c. InsertSorted(2|3).tvp and
DeleteSorted(2).tvp are routines that perform an insertion and a
deletion, respectively, of an element into (from) a binary-search
tree.
.tvs files contain test inputs to the programs with corresponding
names. It is recommended that analysis be performed with the
properties contained in tree.properties. In particular, finite
differencing for identity formulas (controlled by property
tvla.differencing.differenceIdForms) has to be turned on. Script run
can be invoked to perform all analyses.
Example tvla command:
tvla InsertSorted -props tree.properties
III. Legend for analysis specification files:
Program specifications (.tvp files) include predicate and action
definitions, which are program-independent. Here's a brief file
legend:
1. File types:
pred_xxx.tvp Predicate definitions
cond_xxx.tvp Actions for handling program conditions
stat_xxx.tvp Actions for handling other program statements
2. File Suffixes:
File groups with the following suffixes (e.g. "tree" below refers to
pred_tree.tvp, cond_tree.tvp and stat_tree.tvp) are used to encode the
operations in the right column:
tree Binary tree operations
sort Binary-search tree operations
(require data-fields in tree to be in sorted order)
set Operations on (unbounded) sets
dfs Node marking operations for DFS traversals
Subdirectory shape contains pred/cond/stat files for basic pointer and
boolean variable operations. These are included in files with the
tree and sort suffixes.
3. #define VARSEL, DeleteSorted/DeleteSorted2.tvp, and
InsertSorted/InsertSorted2.tvp:
If macro (#define) VARSEL is defined, as it is in DeleteSorted.tvp and
InsertSorted.tvp, then pred_sort.tvp additionally defines a family of
instrumentation predicates, one for each combination of variable x and
field sel (i.e. left or right):
varSel[x,sel](v) = E(v1) (x(v1) & sel(v1, v)).
This family of predicates is needed to verify that InsertSorted is
partially correct, when the analysis is in partial join mode (property
tvla.joinType=part). Without these predicates, the analysis obtains
structures, in which some tree nodes are unreachable from any variable
(which cannot happen). Interestingly, such structures do not arise
when the analysis is in relational join mode (property
tvla.joinType=rel). InsertSorted2.tvp is the same as
InsertSorted.tvp, except it includes pred_sort.tvp (which does NOT
contain the varSel family of predicates). You will get the
over-approximated results using partial join with InsertSorted2 and
the expected results using relational join. (BTW, InsertSorted in
partial join mode and InsertSorted2 in relational join mode both take
~55 seconds, while (the over-approximating) InsertSorted2 in partial
join mode takes ~6 seconds on a 4Ghz machine.)
The same family of predicates is also needed to verify that
DeleteSorted maintains sortedness of a binary-search tree. However,
this results in a very expensive analysis (~2 hours on a 4Ghz machine
using partial join). In the case of DeleteSorted2, which is a minor
modification of DeleteSorted, the verification succeeds without the
additional predicates (and takes ~8 minutes using partial join).
4. InsertSorted3.tvp, CreateSorted3.tvp, CreateSorted3_freeSet.tvp
InsertSorted3.tvp is an easier to analyze version of InsertSorted.
The insertion happens right after the correct place to insert is
identified, so that the confusion of InsertSorted that required varSel
preds does not arise. This can be analyzed precisely and fast using
partial join.
One difference is that InsertSorted and InsertSorted2 do NOT insert
duplicate elements, while InsertSorted3 does.
CreateSorted3.tvp is basically a loop around InsertSorted3.tvp that
creates a binary-search tree from scratch.
CreateSorted3_freeSet.tvp is the same as CreateSorted3.tvp but obtains
new elements from a free set (that's part of the input), rather than
from malloc. This method is more precise and is recommended.
5. CreateUnsorted3_freeSet.tvp, Create3_freeSet.tvp
CreateUnsorted3_freeSet.tvp is just like CreateSorted3_freeSet.tvp (which
includes InsertSorted3 code) but it makes a non-deterministic choice when
choosing whether to insert into the left or right subtree.
Create3_freeSet.tvp is the same thing but it does not include any ordering
information, so it can be used to construct the abstraction of all trees,
when data ordering is not part of the abstraction.