
Clone Pair Overlap Score

Let $C_{a}$ and $C_{b}$ be two clones viewed as intervals of the same length $l$. Define $C_{\gamma} = C_{a} \cap C_{b}$ and $l_{\gamma} = \vert C_{\gamma} \vert$. The relative positions of $C_{a}$, $C_{b}$ and $C_{\gamma}$ are shown in Figure 9.10. The overlap score uses the hybridization vectors $\overrightarrow{B_{a}}, \overrightarrow{B_{b}}$ to produce a vector of probabilities, one for each possible overlap length $l_{\gamma}$.
  
Figure 9.10: Clone pair overlap score
\includegraphics{lec09_fig/clone_overlapping.eps}

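We first calculate the probability $Pr(\overrightarrow{B_{a}},
\overrightarrow{B_{b}} \vert l_{\gamma} = t)$. Let $C_{u} = C_{a} \backslash C_{b}$ and $C_{v} = C_{b} \backslash C_{a}$, and recall that $A_{i,j}$ is the number of occurrences of probe $j$ in $C_{i}$. Since $C_{a}$ is the disjoint union of $C_{u}$ and $C_{\gamma}$, and $C_{b}$ is the disjoint union of $C_{v}$ and $C_{\gamma}$, we have $A_{a,j} = A_{u,j} + A_{\gamma,j}$ and $A_{b,j} = A_{v,j} + A_{\gamma,j}$. For a single probe $j$ we can thus write the following equation: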

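\begin{eqnarray*}Pr(B_{a,j},B_{b,j} \vert l_{\gamma} = t) &= &
\sum_{K_{u}}\sum_{K_{v}}\sum_{K_{\gamma}}
Pr(B_{a,j} \vert A_{a,j} = K_{u}+K_{\gamma}) \cdot Pr(B_{b,j} \vert A_{b,j} = K_{v}+K_{\gamma}) \\
& & \cdot Pr(A_{u,j} = K_{u}\vert l_{\gamma} = t) \cdot Pr(A_{v,j} = K_{v}\vert l_{\gamma} = t) \\
& & \cdot Pr(A_{\gamma,j} = K_{\gamma}\vert l_{\gamma} = t)
\end{eqnarray*}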


The probabilities inside the summation follow directly from the statistical model. Since hybridization is a virtual certainty when a probe occurs many times inside a clone, we can limit the summation to small values of $K_{i}$ (say $0
\leq K_{i} \leq 5$), which makes the score computation feasible while introducing only a negligible error. Treating each probe as an independent source of information, the conditional probability of the vector pair $(\overrightarrow{B_{a}},
\overrightarrow{B_{b}})$ is:

\begin{displaymath}Pr(\overrightarrow{B_{a}},\overrightarrow{B_{b}} \vert l_{\gamma} = t)
= \prod_{j=1}^{n} Pr(B_{a,j},B_{b,j} \vert l_{\gamma} = t)
\end{displaymath} (8)
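
The per-probe term in this product can be tabulated numerically with the truncated summation described above. The following is only a minimal sketch under assumptions that are not taken from this section: probe occurrences in an interval are taken to be Poisson with a hypothetical rate `lam` per unit length, and the signal model uses a hypothetical false-positive rate `alpha` and a per-occurrence miss rate `beta`. The statistical model actually used in these notes may differ, and all function names are illustrative.

```python
import math


def poisson_pmf(k, mean):
    """Pr(X = k) for X ~ Poisson(mean). ASSUMED occurrence model."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def signal_prob(b, k, alpha, beta):
    """Pr(B = b | A = k) under an ASSUMED noise model.

    A positive signal appears unless the background stays silent
    (probability 1 - alpha) and every one of the k occurrences is missed
    (probability beta each), so it approaches certainty as k grows.
    """
    p_one = 1.0 - (1.0 - alpha) * beta ** k
    return p_one if b == 1 else 1.0 - p_one


def joint_signal_table(overlap, clone_len, lam, alpha, beta, k_max=5):
    """Return table[x][y] ~ Pr(B_a = x, B_b = y | l_gamma = overlap).

    The triple sum over (K_u, K_v, K_gamma) is truncated at k_max.
    """
    table = [[0.0, 0.0], [0.0, 0.0]]
    private_mean = lam * (clone_len - overlap)  # expected count in C_u, C_v
    shared_mean = lam * overlap                 # expected count in C_gamma
    for k_u in range(k_max + 1):
        p_u = poisson_pmf(k_u, private_mean)
        for k_v in range(k_max + 1):
            p_v = poisson_pmf(k_v, private_mean)
            for k_g in range(k_max + 1):
                weight = p_u * p_v * poisson_pmf(k_g, shared_mean)
                for x in (0, 1):
                    for y in (0, 1):
                        table[x][y] += (
                            weight
                            * signal_prob(x, k_u + k_g, alpha, beta)
                            * signal_prob(y, k_v + k_g, alpha, beta)
                        )
    return table
```

Under these assumptions the four table entries sum to slightly less than 1, and the shortfall is the probability mass discarded by the truncation at `k_max`.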

Assuming uniform parameters for the probes, the expression $Pr(B_{a,j},B_{b,j} \vert l_{\gamma} = t)$ inside the product depends on $j$ only through the values of $B_{a,j}$ and $B_{b,j}$. Therefore, we can define $P_{x,y}[t] = Pr(B_{a,j} = x, B_{b,j} = y \vert l_{\gamma} = t)$. In practice, instead of computing $P_{x,y}[t]$ for each $t$ in the interval $[0,l]$, we quantize this interval and perform the computation only for representative values of $t$. Denoting by $S_{x,y}(a,b)$ the set of probes $j$, $1 \leq j \leq n$, such that $B_{a,j} = x$ and $B_{b,j} = y$, we can write:

\begin{displaymath}Pr(\overrightarrow{B_{a}},\overrightarrow{B_{b}} \vert t)
= \prod_{x=0}^{1}\prod_{y=0}^{1}P_{x,y}[t]^{\vert S_{x,y}(a,b)\vert}
\end{displaymath} (9)
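
Because the likelihood depends on the data only through the four counts $\vert S_{x,y}(a,b)\vert$, it can be evaluated in time linear in the number of probes. The sketch below illustrates this in log space, which avoids underflow when $n$ is large. The function names are hypothetical, and `table` stands for any precomputed 2x2 array of $P_{x,y}[t]$ values for one representative $t$.

```python
import math


def pair_counts(b_a, b_b):
    """Return counts[x][y] = |S_{x,y}(a,b)| for two 0/1 hybridization vectors."""
    if len(b_a) != len(b_b):
        raise ValueError("hybridization vectors must have the same length")
    counts = [[0, 0], [0, 0]]
    for x, y in zip(b_a, b_b):
        counts[x][y] += 1
    return counts


def log_likelihood(b_a, b_b, table):
    """Return log Pr(B_a, B_b | t), given table[x][y] = P_{x,y}[t]."""
    counts = pair_counts(b_a, b_b)
    total = 0.0
    for x in (0, 1):
        for y in (0, 1):
            if counts[x][y] == 0:
                continue  # a zero exponent contributes a factor of 1
            if table[x][y] <= 0.0:
                return float("-inf")  # observed an outcome of probability 0
            total += counts[x][y] * math.log(table[x][y])
    return total
```

For example, with `b_a = [1, 0, 1, 1]` and `b_b = [1, 0, 0, 1]` the counts are $\vert S_{0,0}\vert = 1$, $\vert S_{1,0}\vert = 1$ and $\vert S_{1,1}\vert = 2$, so the log-likelihood is $\log P_{0,0}[t] + \log P_{1,0}[t] + 2\log P_{1,1}[t]$.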

Having computed $Pr(\overrightarrow{B_{a}},\overrightarrow{B_{b}}
\vert t)$ for each representative value of $t$, we can use Bayes' theorem to obtain the probability of each overlap length:

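\begin{displaymath}Pr(l_{\gamma} = t_{0} \vert
\overrightarrow{B_{a}},\overrightarrow{B_{b}}) =
\frac{Pr(\overrightarrow{B_{a}},\overrightarrow{B_{b}} \vert l_{\gamma} = t_{0})Pr(l_{\gamma} = t_{0})}
{\sum_{t} Pr(\overrightarrow{B_{a}},\overrightarrow{B_{b}} \vert l_{\gamma} = t)Pr(l_{\gamma}
=t)}
\end{displaymath} (10)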
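
The normalization in the Bayes step can be sketched as follows for a finite set of representative overlap lengths. This is an illustration only: the function name is hypothetical, the inputs are assumed to be log-likelihoods (one per representative $t$), and the uniform default prior is an assumption made here, not something stated in the notes.

```python
import math


def overlap_posterior(log_likelihoods, prior=None):
    """Return Pr(l_gamma = t | B_a, B_b) for each representative t.

    log_likelihoods[i] is log Pr(B_a, B_b | l_gamma = t_i).
    prior[i] is Pr(l_gamma = t_i); a uniform prior is ASSUMED if omitted.
    """
    n = len(log_likelihoods)
    if prior is None:
        prior = [1.0 / n] * n
    log_joint = [
        ll + math.log(p) if p > 0.0 else float("-inf")
        for ll, p in zip(log_likelihoods, prior)
    ]
    peak = max(log_joint)
    if peak == float("-inf"):
        raise ValueError("every overlap length has zero probability")
    # Subtract the maximum before exponentiating so that very negative
    # log-likelihoods do not underflow to zero all at once.
    weights = [math.exp(v - peak) for v in log_joint]
    norm = sum(weights)
    return [w / norm for w in weights]
```

The returned list plays the role of the probability vector over overlap lengths mentioned at the start of this section.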


Peer Itsik
2001-01-09