The BioClust Clustering Algorithm

In this section we present BioClust, an example of a clustering algorithm. Subsequent sections show the results of applying BioClust to synthetic data and to real biological data. The material described here is from [1].

We say that a clique graph H(V,E) has a $\gamma$-cluster structure if the size of each clique in H is at least $\gamma|V|$.
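The definition above can be checked mechanically. The following is a minimal sketch, assuming a clique graph is a disjoint union of cliques and is represented simply as a list of disjoint vertex sets, one per clique. The function name `has_gamma_cluster_structure` and this representation are illustrative choices, not part of [1].

```python
from typing import Hashable, Iterable, List, Set


def has_gamma_cluster_structure(
    cliques: Iterable[Iterable[Hashable]], gamma: float
) -> bool:
    """Return True if every clique has at least gamma * |V| vertices.

    `cliques` lists the vertex sets of the cliques of H (assumed
    representation); |V| is the total number of vertices.
    """
    clique_sets: List[Set[Hashable]] = [set(c) for c in cliques]
    n = sum(len(c) for c in clique_sets)

    # The cliques of a clique graph must be pairwise disjoint.
    all_vertices: Set[Hashable] = set()
    for c in clique_sets:
        all_vertices |= c
    if len(all_vertices) != n:
        raise ValueError("cliques must be pairwise disjoint")

    return all(len(c) >= gamma * n for c in clique_sets)
```

For example, with cliques of sizes 3, 3 and 4 on 10 vertices, the smallest clique holds 3 of the 10 vertices, so the graph has a $\gamma$-cluster structure for $\gamma = 0.25$ but not for $\gamma = 0.4$.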

For a fixed $0<\gamma<1$, we say that a randomized algorithm A reconstructs $\gamma$-cluster structures w.h.p. (with high probability) if for each $0<\delta<1$ there exists $n_0$ such that for each $n\geq n_0$ and for any graph $G\in\Omega(H,p)$, where H is a clique graph with n vertices and a $\gamma$-cluster structure, $\mathrm{Prob}(A(G)\neq H)<\delta$.
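To make the random input in this definition concrete, the sketch below draws a graph G from a clique graph H. The space $\Omega(H,p)$ is defined in the preceding section and is not restated here. This sketch assumes the corrupted clique graph model, in which the edge or non-edge status of every vertex pair of H is flipped independently with probability p. The function names and the edge-set representation are hypothetical.

```python
import itertools
import random
from typing import Iterable, Optional, Set, Tuple

Edge = Tuple[int, int]


def clique_graph_edges(cliques: Iterable[Iterable[int]]) -> Set[Edge]:
    """Edge set of the clique graph H: all pairs inside each clique."""
    edges: Set[Edge] = set()
    for c in cliques:
        edges.update(itertools.combinations(sorted(c), 2))
    return edges


def sample_corrupted_graph(
    cliques: Iterable[Iterable[int]],
    p: float,
    rng: Optional[random.Random] = None,
) -> Set[Edge]:
    """Draw G from H under the ASSUMED model: flip each pair with prob. p."""
    rng = rng or random.Random()
    clique_list = [set(c) for c in cliques]
    vertices = sorted(v for c in clique_list for v in c)
    h_edges = clique_graph_edges(clique_list)

    g_edges: Set[Edge] = set()
    for pair in itertools.combinations(vertices, 2):
        present = pair in h_edges
        if rng.random() < p:  # independent flip of this pair
            present = not present
        if present:
            g_edges.add(pair)
    return g_edges
```

Under this assumption, one could estimate $\mathrm{Prob}(A(G)\neq H)$ for a candidate algorithm A by sampling many such graphs G and counting how often the output of A differs from H.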

First we describe a theoretical algorithm that reconstructs $\gamma$-cluster structures w.h.p., and outline its correctness proof. We then present the practical heuristic, which is based on the theoretical algorithm and preserves most of its properties.

Peer Itsik
2001-02-01