Distance between Proteins - PAM matrices

We have already defined the PAM matrices when we discussed heuristics for sequence alignment (in lecture #3). The PAMn matrix is designed to compare two amino-acid sequences that are n PAM units apart. Its calculation involves raising M, the mutation probability matrix for one PAM unit, to the power of n. To obtain a continuous distance function, we also need to define PAM matrices for non-integer numbers of units.

Let $M=U^{-1} \lambda U$ be the diagonalization of M, where $\lambda$ is a diagonal matrix whose entries are M's eigenvalues, and U is an invertible matrix whose rows are the corresponding (left) eigenvectors; U can be taken orthonormal only in the special case where M is symmetric. Given a real x, the PAMx distance matrix is simply:


\begin{displaymath}PAM_x(i,j)=\log{\frac{M^x(i,j)}{f(i)}}
\end{displaymath}

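where f(i) is the frequency of the i-th amino acid, and $M^x=U^{-1} \lambda ^xU$, with $\lambda^x$ the diagonal matrix obtained by raising each eigenvalue on the diagonal of $\lambda$ to the power x (well defined for real x when the eigenvalues are positive). For integer x this agrees with ordinary repeated multiplication of M.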
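The computation of a PAM matrix for a real number of units can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the lecture's own code: the function names `matrix_power_real` and `pam_scores` are hypothetical, and the 3x3 matrix `M` and frequency vector `f` are made-up toy values standing in for the real 20x20 mutation matrix and amino-acid frequencies. The sketch assumes M is diagonalizable with positive eigenvalues and that every entry of $M^x$ is positive, so that the logarithm is defined.

```python
import numpy as np

def matrix_power_real(M, x):
    """Return M**x for a real exponent x via eigendecomposition.

    Assumes M is diagonalizable. NumPy returns M = V diag(w) V^{-1};
    in the notation of the text, U = V^{-1}, so M = U^{-1} lambda U.
    """
    w, V = np.linalg.eig(M)
    # Raise each eigenvalue to the power x (complex cast keeps this safe
    # if eig returns tiny imaginary parts), then transform back.
    Mx = V @ np.diag(w.astype(complex) ** x) @ np.linalg.inv(V)
    return Mx.real

def pam_scores(M, f, x):
    """Return the matrix with entries log(M^x(i, j) / f(i))."""
    Mx = matrix_power_real(M, x)
    # f[:, None] divides every entry of row i by f(i).
    return np.log(Mx / f[:, None])

# Toy 3-letter alphabet (made-up numbers, not real PAM data).
# Each column sums to 1.
M = np.array([
    [0.990, 0.004, 0.002],
    [0.006, 0.992, 0.003],
    [0.004, 0.004, 0.995],
])
f = np.array([0.3, 0.4, 0.3])  # hypothetical letter frequencies

scores = pam_scores(M, f, 2.5)  # PAM matrix for 2.5 units
```

A quick sanity check on such a sketch is that `matrix_power_real(M, 2.0)` should agree with `M @ M`, and that multiplying `matrix_power_real(M, 0.5)` by itself should recover `M`.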



Peer Itsik
2001-01-01