
Practical Implementation

Based on the ideas of the theoretical algorithm presented in the previous sections, a simple and practical heuristic was implemented. All the tests described in the subsequent sections were performed using this practical implementation of the theoretical algorithm.

Let C be a cluster, let $S_{i,j}$ be a similarity matrix, and let $v \in V$ be a gene. We define the affinity of v to cluster C as $\frac{\sum_{u \in C} S_{u,v}}{\vert C\vert}$, that is, the average similarity between v and the members of C. Given an affinity threshold $\tau$, we say that v is a close gene to cluster C if its affinity to C is above $\tau$, and that v is a weak gene in C if its affinity to C is below $\tau$.
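The definitions above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `affinity`, `is_close` and `is_weak`, the nested-list representation of the similarity matrix, and the small example matrix are all assumptions made for illustration. Genes are assumed to be identified by integer indices into the matrix.

```python
from typing import Iterable, Sequence


def affinity(S: Sequence[Sequence[float]], cluster: Iterable[int], v: int) -> float:
    """Average similarity between gene v and the members of the cluster.

    Follows the formula sum_{u in C} S[u][v] / |C| literally, so when v is
    itself a member of the cluster its self-similarity S[v][v] is included.
    """
    members = list(cluster)
    if not members:
        raise ValueError("affinity is undefined for an empty cluster")
    return sum(S[u][v] for u in members) / len(members)


def is_close(S, cluster, v, tau: float) -> bool:
    """True if the affinity of v to the cluster is above the threshold tau."""
    return affinity(S, cluster, v) > tau


def is_weak(S, cluster, v, tau: float) -> bool:
    """True if the affinity of v to the cluster is below the threshold tau."""
    return affinity(S, cluster, v) < tau


# Hypothetical 4-gene similarity matrix (symmetric, made up for this example).
S_example = [
    [1.0, 0.8, 0.6, 0.1],
    [0.8, 1.0, 0.7, 0.2],
    [0.6, 0.7, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
]
tau_example = 0.5

# Gene 2 is similar to both members of the cluster {0, 1}; gene 3 is not.
gene2_is_close = is_close(S_example, [0, 1], 2, tau_example)
gene3_is_close = is_close(S_example, [0, 1], 3, tau_example)

# If gene 3 had been placed in the cluster {0, 1, 3}, it would be a weak member.
gene3_is_weak = is_weak(S_example, [0, 1, 3], 3, tau_example)
```

In this sketch a gene whose affinity equals $\tau$ exactly is neither close nor weak, since the text defines the two terms only for values strictly above and strictly below the threshold.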

Following are the steps of the practical implementation:

The main differences between the practical implementation and the theoretical algorithm are:

Although nothing can be proved about the running time and performance of the practical implementation, the test results described in the next sections show that it performs remarkably well both on simulated data and on real biological data.


Peer Itsik
2001-02-01