
Forward and Backward Probabilities for a Profile HMM

In the previous section we modeled the problem of aligning a string to a profile. As with general HMMs, the main problem is to assign meaningful values to the transition and emission probabilities of a profile HMM. The Baum-Welch algorithm can be used to train these probabilities, but we must first show how to compute the forward and backward probabilities that the algorithm needs.

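Given a string $X=(x_{1},\ldots,x_{m})$ and a profile HMM with $L$ match states, we define:

Forward probabilities: $f^{M}_{j}(i)$ is the probability of emitting $x_{1},\ldots,x_{i}$ and ending in state $M_{j}$; $f^{I}_{j}(i)$ and $f^{D}_{j}(i)$ are defined in the same way for the states $I_{j}$ and $D_{j}$.

Backward probabilities: $b^{M}_{j}(i)$ is the probability of emitting $x_{i+1},\ldots,x_{m}$ given that the model is in state $M_{j}$ after emitting $x_{i}$; $b^{I}_{j}(i)$ and $b^{D}_{j}(i)$ are defined in the same way.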

Computing the Forward Probabilities:

1.
Initialization:

\begin{displaymath}
f_{\mathrm{begin}}(0) = 1
\end{displaymath} (55)

2.
Recursion:

\begin{displaymath}\begin{split}
f_{j}^{M}(i) = e_{M_{j}}(x_{i}) \, \cdot \, [&f^{M}_{j-1}(i-1)\cdot a_{M_{j-1},M_{j}}+ \\
&f^{I}_{j-1}(i-1)\cdot a_{I_{j-1},M_{j}}+ \\
&f^{D}_{j-1}(i-1)\cdot a_{D_{j-1},M_{j}}]
\end{split} \end{displaymath} (56)


\begin{displaymath}\begin{split}
f^{I}_{j}(i) = e_{I_{j}}(x_{i}) \, \cdot \, [&f^{M}_{j}(i-1)\cdot a_{M_{j},I_{j}}+\\
&f^{I}_{j}(i-1)\cdot a_{I_{j},I_{j}}+\\
&f^{D}_{j}(i-1)\cdot a_{D_{j},I_{j}}]
\end{split} \end{displaymath} (57)


\begin{displaymath}\begin{split}
f^{D}_{j}(i) = \; &f^{M}_{j-1}(i)\cdot a_{M_{j-1},D_{j}}+\\
&f^{I}_{j-1}(i)\cdot a_{I_{j-1},D_{j}}+\\
&f^{D}_{j-1}(i)\cdot a_{D_{j-1},D_{j}}
\end{split} \end{displaymath} (58)
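
The recursion (55)-(58) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the notes' own implementation: the begin state is stored as $M_{0}$, transitions are kept in a dictionary keyed by hypothetical tuples `(s, j, s2, j2)`, the end state is written `("E", L + 1)`, and $P(X)$ is obtained by a final step into the end state that mirrors the backward initialization (59)-(61). The one-column toy model and all its numbers are made up.

```python
def forward(x, L, a, e_match, e_insert):
    """Forward tables f[s][j][i] for s in "M", "I", "D".

    Assumed layout (not from the text): M_0 plays the role of the begin
    state, a maps (s, j, s2, j2) to a_{s_j, s2_j2}, and the end state is
    written ("E", L + 1).
    """
    m = len(x)
    f = {s: [[0.0] * (m + 1) for _ in range(L + 1)] for s in "MID"}
    f["M"][0][0] = 1.0  # initialization (55)

    def trans(s, j, s2, j2):
        # Transitions missing from the dictionary have probability 0.
        return a.get((s, j, s2, j2), 0.0)

    for i in range(m + 1):
        for j in range(L + 1):
            if i > 0 and j > 0:  # (56): M_j emits x_i
                f["M"][j][i] = e_match[j][x[i - 1]] * sum(
                    f[s][j - 1][i - 1] * trans(s, j - 1, "M", j) for s in "MID")
            if i > 0:  # (57): I_j emits x_i
                f["I"][j][i] = e_insert[j][x[i - 1]] * sum(
                    f[s][j][i - 1] * trans(s, j, "I", j) for s in "MID")
            if j > 0:  # (58): D_j is silent, so i does not advance
                f["D"][j][i] = sum(
                    f[s][j - 1][i] * trans(s, j - 1, "D", j) for s in "MID")
    # Termination (assumed): step from the last column into the end state.
    p_x = sum(f[s][L][m] * trans(s, L, "E", L + 1) for s in "MID")
    return f, p_x


# Hypothetical one-column profile (L = 1) over the alphabet {A, B}.
L = 1
a = {("M", 0, "M", 1): 0.8, ("M", 0, "I", 0): 0.1, ("M", 0, "D", 1): 0.1,
     ("I", 0, "M", 1): 0.8, ("I", 0, "I", 0): 0.1, ("I", 0, "D", 1): 0.1}
for s in "MID":
    a[(s, 1, "I", 1)] = 0.1
    a[(s, 1, "E", 2)] = 0.9
e_match = {1: {"A": 0.9, "B": 0.1}}
e_insert = {0: {"A": 0.5, "B": 0.5}, 1: {"A": 0.5, "B": 0.5}}

f, p_x = forward("A", L, a, e_match, e_insert)
print(p_x)
```

For the string "A" the three paths that emit exactly one symbol contribute 0.648, 0.0045 and 0.0045, so the printed value should be close to 0.657. The loop order (outer over $i$, inner over increasing $j$) matters: $f^{D}_{j}(i)$ reads entries of column $j-1$ at the same $i$, which must already be filled.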

Computing the Backward Probabilities:

1.
Initialization:

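\begin{displaymath}
b^{M}_{L}(m) = a_{M_{L},\mathrm{end}}
\end{displaymath} (59)


\begin{displaymath}
b^{I}_{L}(m) = a_{I_{L},\mathrm{end}}
\end{displaymath} (60)


\begin{displaymath}
b^{D}_{L}(m) = a_{D_{L},\mathrm{end}}
\end{displaymath} (61)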

2.
Recursion:

\begin{displaymath}\begin{split}
b_{j}^{M}(i) = \; &b^{M}_{j+1}(i+1)\cdot a_{M_{j},M_{j+1}}\cdot e_{M_{j+1}}(x_{i+1})+ \\
&b^{I}_{j}(i+1)\cdot a_{M_{j},I_{j}}\cdot e_{I_{j}}(x_{i+1})+ \\
&b^{D}_{j+1}(i)\cdot a_{M_{j},D_{j+1}}
\end{split} \end{displaymath} (62)


\begin{displaymath}\begin{split}
b^{I}_{j}(i) = \; &b^{M}_{j+1}(i+1)\cdot a_{I_{j},M_{j+1}}\cdot e_{M_{j+1}}(x_{i+1})+\\
&b^{I}_{j}(i+1)\cdot a_{I_{j},I_{j}}\cdot e_{I_{j}}(x_{i+1})+\\
&b^{D}_{j+1}(i)\cdot a_{I_{j},D_{j+1}}
\end{split} \end{displaymath} (63)


\begin{displaymath}\begin{split}
b^{D}_{j}(i) = \; &b^{M}_{j+1}(i+1)\cdot a_{D_{j},M_{j+1}}\cdot e_{M_{j+1}}(x_{i+1})+\\
&b^{I}_{j}(i+1)\cdot a_{D_{j},I_{j}}\cdot e_{I_{j}}(x_{i+1})+\\
&b^{D}_{j+1}(i)\cdot a_{D_{j},D_{j+1}}
\end{split} \end{displaymath} (64)
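
A matching Python sketch of the backward pass (59)-(64) is given below, again only as an illustration. The data layout is an assumption: the begin state is stored as $M_{0}$, transitions live in a dictionary keyed by hypothetical tuples `(s, j, s2, j2)`, and the end state is `("E", L + 1)`. The handling of the grid boundary (dropping terms that would need $j+1>L$ or $i+1>m$) is also an assumption, since the equations above state only the interior recursion. The toy model is made up.

```python
def backward(x, L, a, e_match, e_insert):
    """Backward tables b[s][j][i] for s in "M", "I", "D" (M_0 = begin)."""
    m = len(x)
    b = {s: [[0.0] * (m + 1) for _ in range(L + 1)] for s in "MID"}

    def trans(s, j, s2, j2):
        # Transitions missing from the dictionary have probability 0.
        return a.get((s, j, s2, j2), 0.0)

    for i in range(m, -1, -1):
        for j in range(L, -1, -1):
            for s in "MID":
                if i == m and j == L:  # initialization (59)-(61)
                    b[s][j][i] = trans(s, L, "E", L + 1)
                    continue
                value = 0.0  # recursion (62)-(64)
                if i < m and j < L:  # move to M_{j+1}, which emits x_{i+1}
                    value += (b["M"][j + 1][i + 1] * trans(s, j, "M", j + 1)
                              * e_match[j + 1][x[i]])
                if i < m:  # move to I_j, which emits x_{i+1}
                    value += (b["I"][j][i + 1] * trans(s, j, "I", j)
                              * e_insert[j][x[i]])
                if j < L:  # move to the silent state D_{j+1}
                    value += b["D"][j + 1][i] * trans(s, j, "D", j + 1)
                b[s][j][i] = value
    return b


# Hypothetical one-column profile (L = 1) over the alphabet {A, B}.
L = 1
a = {("M", 0, "M", 1): 0.8, ("M", 0, "I", 0): 0.1, ("M", 0, "D", 1): 0.1,
     ("I", 0, "M", 1): 0.8, ("I", 0, "I", 0): 0.1, ("I", 0, "D", 1): 0.1}
for s in "MID":
    a[(s, 1, "I", 1)] = 0.1
    a[(s, 1, "E", 2)] = 0.9
e_match = {1: {"A": 0.9, "B": 0.1}}
e_insert = {0: {"A": 0.5, "B": 0.5}, 1: {"A": 0.5, "B": 0.5}}

b = backward("A", L, a, e_match, e_insert)
p_x = b["M"][0][0]  # P(X), read off at the begin state
print(p_x)
```

With this layout the value at the begin state, `b["M"][0][0]`, is $P(X)$; for the string "A" it should again be close to 0.657, the same value a forward pass gives for this toy model. Comparing the two numbers is a cheap consistency check on an implementation.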





The forward and backward variables can then be combined to re-estimate emission and transition probability parameters as follows:

Baum-Welch re-estimation equations for profile HMMs:

1.
Expected emission counts from sequence X:


\begin{displaymath}\begin{split}
E_{M_{k}}(a)=\frac{1}{P(X)}\sum_{i\vert x_{i}=a}{f_{k}^{M}(i)b_{k}^{M}(i)}
\end{split} \end{displaymath} (65)


\begin{displaymath}\begin{split}
E_{I_{k}}(a)=\frac{1}{P(X)}\sum_{i\vert x_{i}=a}{f_{k}^{I}(i)b_{k}^{I}(i)}
\end{split} \end{displaymath} (66)

2.
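Expected transition counts from sequence X (in the subscripts and superscripts below, $X_{k}$ stands for any one of the states $M_{k}$, $I_{k}$ or $D_{k}$):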


\begin{displaymath}\begin{split}
A_{X_{k}M_{k+1}}=\frac{1}{P(X)}\sum_{i}{f^{X}_{k}(i)a_{X_{k}M_{k+1}}e_{M_{k+1}}(x_{i+1})b^{M}_{k+1}(i+1)}
\end{split} \end{displaymath} (67)


\begin{displaymath}\begin{split}
A_{X_{k}I_{k}}=\frac{1}{P(X)}\sum_{i}{f^{X}_{k}(i)a_{X_{k}I_{k}}e_{I_{k}}(x_{i+1})b^{I}_{k}(i+1)}
\end{split} \end{displaymath} (68)


\begin{displaymath}\begin{split}
A_{X_{k}D_{k+1}}=\frac{1}{P(X)}\sum_{i}{f^{X}_{k}(i)a_{X_{k}D_{k+1}}b^{D}_{k+1}(i)}
\end{split} \end{displaymath} (69)
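
The expected counts (65)-(69) can be computed from the two tables as in the following Python sketch. It is a self-contained illustration, so it repeats compact forward and backward passes. The data layout (begin state stored as $M_{0}$, transition dictionary keyed by hypothetical tuples `(s, j, s2, j2)`, end state `("E", L + 1)`), the boundary handling, and the toy model are all assumptions and not part of the notes.

```python
def trans(a, s, j, s2, j2):
    """a_{s_j, s2_j2}; transitions missing from the dict have probability 0."""
    return a.get((s, j, s2, j2), 0.0)


def forward(x, L, a, e_match, e_insert):
    m = len(x)
    f = {s: [[0.0] * (m + 1) for _ in range(L + 1)] for s in "MID"}
    f["M"][0][0] = 1.0  # (55), with M_0 standing in for the begin state
    for i in range(m + 1):
        for j in range(L + 1):
            if i > 0 and j > 0:  # (56)
                f["M"][j][i] = e_match[j][x[i - 1]] * sum(
                    f[s][j - 1][i - 1] * trans(a, s, j - 1, "M", j) for s in "MID")
            if i > 0:  # (57)
                f["I"][j][i] = e_insert[j][x[i - 1]] * sum(
                    f[s][j][i - 1] * trans(a, s, j, "I", j) for s in "MID")
            if j > 0:  # (58)
                f["D"][j][i] = sum(
                    f[s][j - 1][i] * trans(a, s, j - 1, "D", j) for s in "MID")
    return f


def backward(x, L, a, e_match, e_insert):
    m = len(x)
    b = {s: [[0.0] * (m + 1) for _ in range(L + 1)] for s in "MID"}
    for i in range(m, -1, -1):
        for j in range(L, -1, -1):
            for s in "MID":
                if i == m and j == L:  # (59)-(61)
                    b[s][j][i] = trans(a, s, L, "E", L + 1)
                    continue
                v = 0.0  # (62)-(64), dropping terms that fall off the grid
                if i < m and j < L:
                    v += (b["M"][j + 1][i + 1] * trans(a, s, j, "M", j + 1)
                          * e_match[j + 1][x[i]])
                if i < m:
                    v += b["I"][j][i + 1] * trans(a, s, j, "I", j) * e_insert[j][x[i]]
                if j < L:
                    v += b["D"][j + 1][i] * trans(a, s, j, "D", j + 1)
                b[s][j][i] = v
    return b


def expected_counts(x, L, a, e_match, e_insert):
    m = len(x)
    f = forward(x, L, a, e_match, e_insert)
    b = backward(x, L, a, e_match, e_insert)
    p_x = b["M"][0][0]  # P(X), read off at the begin state
    E = {}  # E[(s, k, symbol)], equations (65)-(66)
    A = {}  # A[(s, k, s2, k2)], equations (67)-(69)
    for k in range(L + 1):
        for s in "MI":
            if s == "M" and k == 0:
                continue  # the begin state emits nothing
            for i in range(1, m + 1):
                key = (s, k, x[i - 1])
                E[key] = E.get(key, 0.0) + f[s][k][i] * b[s][k][i] / p_x
        for s in "MID":
            A[(s, k, "I", k)] = sum(  # (68)
                f[s][k][i] * trans(a, s, k, "I", k) * e_insert[k][x[i]]
                * b["I"][k][i + 1] for i in range(m)) / p_x
            if k < L:
                A[(s, k, "M", k + 1)] = sum(  # (67)
                    f[s][k][i] * trans(a, s, k, "M", k + 1)
                    * e_match[k + 1][x[i]] * b["M"][k + 1][i + 1]
                    for i in range(m)) / p_x
                A[(s, k, "D", k + 1)] = sum(  # (69)
                    f[s][k][i] * trans(a, s, k, "D", k + 1) * b["D"][k + 1][i]
                    for i in range(m + 1)) / p_x
    return E, A


# Hypothetical one-column profile (L = 1) over the alphabet {A, B}.
L = 1
a = {("M", 0, "M", 1): 0.8, ("M", 0, "I", 0): 0.1, ("M", 0, "D", 1): 0.1,
     ("I", 0, "M", 1): 0.8, ("I", 0, "I", 0): 0.1, ("I", 0, "D", 1): 0.1}
for s in "MID":
    a[(s, 1, "I", 1)] = 0.1
    a[(s, 1, "E", 2)] = 0.9
e_match = {1: {"A": 0.9, "B": 0.1}}
e_insert = {0: {"A": 0.5, "B": 0.5}, 1: {"A": 0.5, "B": 0.5}}

E, A = expected_counts("AB", L, a, e_match, e_insert)
print(sum(E.values()))
```

Two sanity checks follow from the definitions: every symbol of $X$ is emitted by exactly one match or insert state, so the emission counts should sum to $m$ (here 2), and every path leaves the begin state exactly once, so the three counts out of $M_{0}$ should sum to 1. In the usual Baum-Welch update, the new transition probabilities are obtained by dividing each count by the total over the same source state, and the new emission probabilities by dividing by the total over all symbols of the same state.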


Peer Itsik
2000-12-19