Assessing Clustering Quality

When the true clustering is known, the quality of a solution can be measured by comparing the two pair by pair. Let T be the ``true'' solution and S the solution we wish to measure. Denote by n11 the number of pairs of elements that are in the same cluster in both S and T. Denote by n01 the number of pairs that are in the same cluster only in S, and by n10 the number of pairs that are in the same cluster only in T. We define the Minkowski Score to be:

\begin{displaymath}D_M(T,S)=\sqrt{\frac{n_{01}+n_{10}}{n_{11}+n_{10}}}
\end{displaymath}

For the Minkowski Score the optimum is 0, attained when S and T agree on every pair (n01 = n10 = 0); lower scores are ``better''. An alternative is the Jaccard Score:

\begin{displaymath}D_J(T,S)=\frac{n_{11}}{n_{11}+n_{10}+n_{01}}
\end{displaymath}

For the Jaccard Score the optimum is 1, again attained when n01 = n10 = 0; greater scores are ``better''.
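
Both scores can be computed directly from the pair counts. The following is a minimal Python sketch, not the implementation used in this work. It assumes each solution is given as a list of cluster labels, one per element, and the function names pair_counts, minkowski_score and jaccard_score are hypothetical. It enumerates all pairs, so it takes quadratic time in the number of elements.

```python
from itertools import combinations
from math import sqrt

def pair_counts(t_labels, s_labels):
    """Return (n11, n10, n01) for the true labels T and solution labels S."""
    n11 = n10 = n01 = 0
    for i, j in combinations(range(len(t_labels)), 2):
        same_t = t_labels[i] == t_labels[j]
        same_s = s_labels[i] == s_labels[j]
        if same_t and same_s:
            n11 += 1   # same cluster in both T and S
        elif same_t:
            n10 += 1   # same cluster only in T
        elif same_s:
            n01 += 1   # same cluster only in S
    return n11, n10, n01

def minkowski_score(t_labels, s_labels):
    # Undefined (division by zero) if T has no co-clustered pairs.
    n11, n10, n01 = pair_counts(t_labels, s_labels)
    return sqrt((n01 + n10) / (n11 + n10))

def jaccard_score(t_labels, s_labels):
    # Undefined (division by zero) if neither T nor S has co-clustered pairs.
    n11, n10, n01 = pair_counts(t_labels, s_labels)
    return n11 / (n11 + n10 + n01)

# Illustrative (made-up) data: five elements, one misplaced in S.
T = [0, 0, 0, 1, 1]
S = [0, 0, 1, 1, 1]
print(pair_counts(T, S))      # (2, 2, 2)
print(minkowski_score(T, S))  # 1.0
print(jaccard_score(T, S))
```

In this made-up example n11 = n10 = n01 = 2, so the Minkowski Score is sqrt(4/4) = 1 and the Jaccard Score is 2/6. Comparing T with itself gives the optimum values 0 and 1 respectively.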

Peer Itsik
2001-01-31