
   
Small Parsimony


 \begin{problem}
Small Parsimony.\\
{\bf INPUT:} The topology of a rooted phylog...
...What is the optimal labeling of the internal nodes?
\end{enumerate}\end{problem}

This problem is relatively easy to solve. Since the characters are mutually independent, we can solve it for each character separately. For a single character, we present the following algorithm:

Fitch's algorithm [6]:

Input: A phylogenetic tree T with n nodes, and a single character c with a set A of k possible values. Denote the value of the character for node v by $v_c$.

Step 1: We will assign to each node v a set $S_v \subseteq A$, in the following fashion:

\begin{displaymath}\begin{array}{ll}
\mbox{For each leaf } v: & S_v = \{v_c\}.\\...
...u \cup S_w & \ & \rm {otherwise}
\end{array}\right.
\end{array}\end{displaymath}

To compute $S_v$ we traverse the tree in postorder, starting with the leaves and working our way down to the root, so that the sets of both children are known before their parent is processed (this is in effect a dynamic programming algorithm).
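As an illustration of step 1, here is a minimal Python sketch. It assumes a binary tree and the usual Fitch rule (take the intersection of the children's sets when it is nonempty, otherwise their union). The function name `fitch_step1`, the dictionaries `children` and `leaf_value`, and the 4-leaf tree are hypothetical choices made for this example, not part of the original notes.

```python
def fitch_step1(children, leaf_value, root):
    """Bottom-up (postorder) pass of Fitch's algorithm for one character.

    children:   dict mapping each internal node to its two children (u, w)
    leaf_value: dict mapping each leaf to its observed character value
    Returns (S, changes): the set S[v] for every node, and the number of
    internal nodes whose children's sets had an empty intersection.
    """
    S = {}
    changes = 0
    # Iterative postorder: a node is finished only after both children are.
    stack = [(root, False)]
    while stack:
        v, expanded = stack.pop()
        if v in leaf_value:                # leaf: S_v = {v_c}
            S[v] = {leaf_value[v]}
        elif not expanded:                 # first visit: handle children first
            u, w = children[v]
            stack.append((v, True))
            stack.append((u, False))
            stack.append((w, False))
        else:                              # second visit: both children are done
            u, w = children[v]
            common = S[u] & S[w]
            if common:
                S[v] = common
            else:
                S[v] = S[u] | S[w]
                changes += 1               # one change is needed below v
    return S, changes


# Hypothetical 4-leaf tree: root r has children x and y.
children = {"r": ("x", "y"), "x": ("a", "b"), "y": ("c", "d")}
leaf_value = {"a": "A", "b": "C", "c": "A", "d": "A"}
S, changes = fitch_step1(children, leaf_value, "r")
```

In this small tree only node `x` has children with disjoint sets, so `changes` is 1 and `S["x"]` is the union `{"A", "C"}`.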

Step 2: Given the sets $S_v$, we now determine the value $v_c$ to assign to the character c in each internal node v. This time, we traverse the tree in preorder, i.e., from the root up. For each internal node v, if its parent u satisfies $u_c \in S_v$, set $v_c \leftarrow u_c$; otherwise (and also for the root node, which has no parent), assign an arbitrary $t \in S_v$ to $v_c$.
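Step 2 can be sketched in Python as follows. The sketch assumes the sets $S_v$ are already available as a dictionary `S`, and that no character value is `None` (it is used here as the "no parent" marker). The names `fitch_step2`, `children`, `S`, and the example tree are hypothetical. The arbitrary choice is made with `min` only so that the result is reproducible.

```python
def fitch_step2(children, S, root):
    """Top-down (preorder) pass: pick one value for each node from S[v]."""
    label = {}
    stack = [(root, None)]                 # (node, value chosen for its parent)
    while stack:
        v, parent_value = stack.pop()
        if parent_value is not None and parent_value in S[v]:
            label[v] = parent_value        # reuse parent's value: no change on this edge
        else:
            label[v] = min(S[v])           # arbitrary element of S_v (root or mismatch)
        for child in children.get(v, ()):  # leaves have no entry in `children`
            stack.append((child, label[v]))
    return label


# Sets as step 1 would produce them for a hypothetical 4-leaf tree.
children = {"r": ("x", "y"), "x": ("a", "b"), "y": ("c", "d")}
S = {"a": {"A"}, "b": {"C"}, "c": {"A"}, "d": {"A"},
     "x": {"A", "C"}, "y": {"A"}, "r": {"A"}}
label = fitch_step2(children, S, "r")

# Count the edges whose two endpoints carry different values.
edge_changes = sum(label[p] != label[c] for p, kids in children.items() for c in kids)
```

The sketch also visits the leaves, which is harmless: a leaf's set is a singleton, so the leaf always keeps its observed value.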

The result of this algorithm is a fully labeled tree. The number of changes in this tree equals the number of times $S_u \cap S_w$ was empty in step 1.

Complexity: For each node v we spend O(k) time to compute $S_v$, and again O(k) time to compute $v_c$, for a total of $O(n \cdot k)$ time (step 2 can be performed in only O(n) total time in the average case).

The above algorithm works with a single character. To obtain the optimal score and labeling for the entire data set, simply apply the algorithm once for each character. With m characters, this leads to an overall complexity of $O(m \cdot n \cdot k)$.
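The per-character decomposition can be sketched in Python as below. The sketch computes only the score, assumes a binary tree whose leaves carry sequences of equal length m, and uses recursion for brevity (very deep trees would need an iterative traversal). The names `fitch_score`, `total_score`, `leaf_seqs`, and the example data are hypothetical.

```python
def fitch_score(children, leaf_value, root):
    """Score of one character: the number of empty intersections in step 1."""
    changes = 0

    def visit(v):
        nonlocal changes
        if v in leaf_value:                # leaf: singleton set
            return {leaf_value[v]}
        u, w = children[v]
        Su, Sw = visit(u), visit(w)
        if Su & Sw:
            return Su & Sw
        changes += 1                       # disjoint sets: one change needed
        return Su | Sw

    visit(root)
    return changes


def total_score(children, leaf_seqs, root):
    """Sum the single-character scores over all m characters (columns)."""
    m = len(next(iter(leaf_seqs.values())))
    return sum(
        fitch_score(children, {leaf: seq[i] for leaf, seq in leaf_seqs.items()}, root)
        for i in range(m)
    )


# Hypothetical data: 4 leaves, m = 2 characters per leaf.
children = {"r": ("x", "y"), "x": ("a", "b"), "y": ("c", "d")}
leaf_seqs = {"a": "AG", "b": "CG", "c": "AT", "d": "AT"}
score = total_score(children, leaf_seqs, "r")
```

Here each of the two columns requires one change, so `score` is 2.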


\begin{example}% latex2html id marker 142
In figure \ref{lec08:Fig:Fitch} we hav...
...s empty, which means that the minimum total cost of the tree is 3.
\end{example}


  
Figure 8.4: An example of step 1 of Fitch's algorithm for a 5-species phylogeny. Nodes marked by an asterisk (*) require a change along one of the edges to their children, adding 1 to the parsimony score.
\includegraphics{lec08_figs/fitch.ps}

It is not obvious at first sight why this algorithm works. We will next present a generalization of Fitch's algorithm that is perhaps easier to understand.


Peer Itsik
2001-01-01