Distance between DNA Sequences - Jukes-Cantor Model

According to the model of Jukes and Cantor [8], every base in the DNA sequence has the same chance of mutating, and when it mutates it is replaced by each of the other three nucleotides with equal probability. If the mutation probability during each infinitesimally small period of time $\Delta t$ is $3 \alpha \Delta t$ (that is, $\alpha \Delta t$ for each of the three possible replacements), then the chance that a site holding nucleotide $x$ again holds $x$ after a period of $T$ time units is (recall exercise #1):

\begin{displaymath}P_{x \rightarrow x} = \frac{1}{4} (1 + 3 e^{-4 \alpha T})
\end{displaymath}
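
As a brief reminder of where this expression comes from (a sketch, not necessarily the route taken in the exercise): write $P(t)$ for the probability that the site holds $x$ at time $t$, given that it held $x$ at time 0. In a short interval $\Delta t$ the site leaves $x$ with probability $3 \alpha \Delta t$, and a site currently holding one of the other three nucleotides returns to $x$ with probability $\alpha \Delta t$, so

\begin{displaymath}P(t + \Delta t) = P(t) (1 - 3 \alpha \Delta t) + (1 - P(t)) \alpha \Delta t \quad \Longrightarrow \quad \frac{dP}{dt} = \alpha - 4 \alpha P(t)
\end{displaymath}

Solving this linear equation with the initial condition $P(0) = 1$ gives $P(T) = \frac{1}{4} + \frac{3}{4} e^{-4 \alpha T}$, which agrees with the formula above; as a sanity check, it equals 1 at $T = 0$ and tends to $\frac{1}{4}$ as $T \rightarrow \infty$.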

Given a branch of the tree that spans $T$ time units, the probability that the nucleotides $u$ and $v$ observed at a site at the two endpoints of the branch are different is therefore:

\begin{displaymath}P_{u \neq v} = 1 - P_{x \rightarrow x} = \frac{3}{4} (1 - e^{-4 \alpha T})
\end{displaymath}

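The Jukes-Cantor model defines an additive distance by using the observed fraction of differing sites as an estimate of $P_{u \neq v}$ and solving the formula above for $\alpha T$ itself. The values of $\alpha T$ on the branches will, by definition, add perfectly, because the times elapsed along consecutive branches of a path add up.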
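
The estimate can be sketched in a few lines of Python. Solving $P_{u \neq v} = \frac{3}{4} (1 - e^{-4 \alpha T})$ for $\alpha T$ gives $\alpha T = -\frac{1}{4} \ln (1 - \frac{4}{3} P_{u \neq v})$. The function names below (`p_distance`, `jc_alpha_t`, `expected_difference`) are hypothetical and chosen only for illustration, and the sketch assumes two already aligned sequences of equal length with no gaps.

```python
import math


def p_distance(seq_u: str, seq_v: str) -> float:
    """Fraction of aligned sites at which the two sequences differ."""
    if len(seq_u) != len(seq_v) or not seq_u:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(a != b for a, b in zip(seq_u, seq_v))
    return mismatches / len(seq_u)


def jc_alpha_t(p: float) -> float:
    """Estimate alpha*T from the observed fraction p of differing sites."""
    # The logarithm is only defined for p < 3/4, the saturation level
    # reached when the two sequences are effectively unrelated.
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the estimate to exist")
    return -0.25 * math.log(1.0 - 4.0 * p / 3.0)


def expected_difference(alpha_t: float) -> float:
    """Forward formula: probability that a site differs after alpha*T."""
    return 0.75 * (1.0 - math.exp(-4.0 * alpha_t))
```

For example, `p_distance("ACGTACGT", "ACGTACGA")` is 0.125, and `jc_alpha_t(0.125)` is approximately 0.0456. Feeding `expected_difference(0.1 + 0.2)` back into `jc_alpha_t` recovers 0.3, which illustrates the additivity over two consecutive branches. Note that many texts report $3 \alpha T$, the expected number of substitutions per site, rather than $\alpha T$; the two differ only by the constant factor 3.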

Itshack Pe'er
1999-02-18