Spliced alignment algorithm


To solve the problem, we use dynamic programming. We start with some notation. Let $B_1\ast B_2$ denote the concatenation of $B_1$ and $B_2$. The $j$-prefix of $A=a_i\ldots a_j\ldots a_l$ is $A(j)=a_i\ldots a_j$. For a block $B=g_i\ldots g_j$, ${\rm First}(B) = i$ and ${\rm Last}(B) = j$. A chain $\Gamma$ with concatenation $\Gamma ^{*} = B_1\ast\ldots\ast B_k$ ends at ${\rm Last}(B_k)$, and it ends before position $i$ if ${\rm Last}(B_k) < i$.

In addition, we define $P(i,j) =\max \limits _{{\rm all~chains}~\Gamma \atop {\rm ending~before~}i} S(\Gamma ^{*},T(j))$. According to these definitions, the optimal spliced alignment score is $P(n+1,m)$.

When ${\rm First}(B_k)\leq i \leq {\rm Last}(B_k)$, we define the $i$-prefix of $\Gamma ^{*} = B_1\ast\cdots\ast B_k$ as $\Gamma ^{*}(i) = B_1\ast \cdots\ast B_{k-1}\ast B_k(i)$. Under the same condition, we define $BL(i,j,l)=\max \limits _{{\rm all~chains~} \Gamma \atop {\rm ending~at~}l}S(\Gamma ^{*}(i),T(j))$. Now we can calculate $P(i,j)$:
$P(1,j) = j\cdot \Delta (-,t_j)$, where $\Delta (x,y)$ is the score of aligning $x$ and $y$.
$P(i,j)=\max \cases {P(i-1,j) & \cr BL(i-1,j,i-1) & if there exist chains ending at $i-1$ }$
$BL(i,j,l)$ satisfies the following recurrence relation:
$BL(i,j,l)=\max \cases{ BL(i-1,j-1,l)+ \Delta (g_i,t_j) & if $j\geq 1,\ \exists\ \ldots\ (B)=i\leq {\rm Last}(B)=l$ \cr BL(i,j-1,l)+\Delta (-,t_j) & if $j\geq 1$ }$
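
The recurrences can be turned into a short program. The following Python sketch is one possible reading, not necessarily the exact algorithm of these notes. The update of $P$ follows the recurrence for $P(i,j)$ given above. The side condition of the $BL$ recurrence is partly elided, so the in-block update is an assumption: a standard global-alignment step that restarts from $P({\rm First}(B),\cdot)$ at the first position of each block and also allows aligning $g_i$ to an indel. The names `spliced_alignment_score` and `unit_score`, the toy scoring, and the representation of blocks as 1-based inclusive `(first, last)` pairs are hypothetical. The base case sums the indel scores of $t_1\ldots t_j$, which equals $j\cdot \Delta (-,t_j)$ when the indel score is uniform.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Block = Tuple[int, int]  # (First(B), Last(B)), 1-based and inclusive (assumed encoding)


def unit_score(x: str, y: str) -> int:
    """Toy Delta (an assumption): +1 for a match, -1 for a mismatch or an indel '-'."""
    return 1 if x == y and x != "-" else -1


def spliced_alignment_score(
    G: str,
    T: str,
    blocks: Sequence[Block],
    delta: Callable[[str, str], int] = unit_score,
) -> int:
    """Return P(n+1, m): the best score of aligning T with any chain of blocks of G."""
    n, m = len(G), len(T)

    # P(1, j): only the empty chain ends before position 1, so T(j) is all indels.
    P: List[int] = [0] * (m + 1)
    for j in range(1, m + 1):
        P[j] = P[j - 1] + delta("-", T[j - 1])

    # rows[k][j]: best score of T(j) against a chain prefix whose last block is
    # blocks[k], cut at the most recently processed position inside that block.
    rows: Dict[int, List[int]] = {}

    for i in range(1, n + 2):
        # P(i, j) = max(P(i-1, j), best chain ending exactly at i-1).
        for k, (_, last) in enumerate(blocks):
            if last == i - 1:
                finished = rows.pop(k)
                P = [max(p, r) for p, r in zip(P, finished)]
        if i > n:
            break

        g = G[i - 1]
        for k, (first, last) in enumerate(blocks):
            if not first <= i <= last:
                continue
            # At the first position of a block, continue from any chain ending before it.
            prev = P if i == first else rows[k]
            cur = [prev[0] + delta(g, "-")]
            for j in range(1, m + 1):
                cur.append(max(
                    prev[j - 1] + delta(g, T[j - 1]),  # align g_i with t_j
                    prev[j] + delta(g, "-"),           # align g_i with an indel
                    cur[j - 1] + delta("-", T[j - 1]), # align t_j with an indel
                ))
            rows[k] = cur

    return P[m]
```

For example, with $G$ = `AAATTTCCC`, blocks `(1, 3)` and `(7, 9)`, and $T$ = `AAACCC`, the sketch returns 6 under the toy scoring, because the chain of both blocks spells $T$ exactly. Keeping one row per block, rather than one table entry per end position $l$, is a design choice of the sketch that keeps memory proportional to $m$ times the number of blocks open at a given position.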

Itshack Pe'er
1999-02-03