Distance between Proteins - PAM matrices

We have already defined the PAM matrices when we discussed heuristics for sequence alignment (in lecture #3). The PAMn matrix is designed to compare two amino-acid sequences that are n PAM units apart. Its calculation involves raising M, the mutation probability matrix for one PAM unit, to the power of n. For a continuous distance function, we need to define PAM matrices for non-integer units as well. Let $M=U^{-1} \lambda U$ be the diagonalization of M, where $\lambda$ is a diagonal matrix made up of M's eigenvalues, and U is an invertible matrix whose rows are the corresponding (left) eigenvectors; U need not be orthonormal, since M is not symmetric in general. Given a real x, the PAMx distance matrix is simply:

\begin{displaymath}PAM_x(i,j)=\log{\frac{M^x(i,j)}{f(i)}}
\end{displaymath}

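where f(i) is the frequency of the i-th amino acid, and $M^x=U^{-1}\lambda^xU$, with $\lambda^x$ obtained by raising each diagonal entry (eigenvalue) of $\lambda$ to the power x. For an integer x = n this agrees with multiplying M by itself n times.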
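
The computation above can be sketched numerically as follows. This is only an illustrative sketch, not the lecture's own code: the 3-letter alphabet, the toy matrix `M`, its frequencies `f`, and the function names `fractional_power` and `pam_x` are all assumptions made for the example, and the eigenvalues of M are assumed to be real and positive so that real powers are well defined. NumPy returns right eigenvectors as the columns of a matrix V with $M = V \lambda V^{-1}$, so in the notation of the text $U = V^{-1}$.

```python
import numpy as np

# Toy (hypothetical) 1-PAM mutation matrix on a 3-letter alphabet.
# Convention assumed here: M[i, j] is the probability that letter j is
# replaced by letter i, so every column sums to 1.
M = np.array([
    [0.98, 0.02, 0.01],
    [0.01, 0.97, 0.01],
    [0.01, 0.01, 0.98],
])

# Letter frequencies, chosen as the stationary distribution of the toy M
# (M @ f equals f).
f = np.array([5.0, 3.0, 4.0]) / 12.0


def fractional_power(M, x):
    """Return M**x for real x via diagonalization.

    Assumes M is diagonalizable with real, positive eigenvalues.
    """
    eigvals, V = np.linalg.eig(M)   # M = V @ diag(eigvals) @ inv(V)
    U = np.linalg.inv(V)            # U in the text's notation
    return V @ np.diag(eigvals ** x) @ U


def pam_x(M, f, x):
    """Return the matrix with entries log(M^x(i, j) / f(i))."""
    Mx = fractional_power(M, x)
    # f[:, None] divides every entry of row i by f[i].
    return np.log(Mx / f[:, None])


if __name__ == "__main__":
    print(pam_x(M, f, 2.5))
```

For integer x the result of `fractional_power` can be checked against repeated multiplication, and for x = 0.5 squaring the result should recover M. A real 20 x 20 mutation matrix would replace the toy `M`, with `f` taken from the observed amino-acid frequencies.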

Itshack Pe'er
1999-02-18