Likelihood of a Tree

For the analysis below, we shall use the following terms:

$\begin{definition}{\emph{Labels}, or \emph{states}, are the vectors of $m$ chara... ...etic distance, between the species associated with these nodes. \end{definition}$

As always, we assume that the characters are pairwise independent and that the branching is a Markov process; that is, the probability of a node having a given label is a function only of the state of its parent node and of the branch length $t$ between them. Our model also includes a function that gives this probability: $P_{x \rightarrow y}(t_{vu})$ is the probability that state $x$ transforms into state $y$ within time $t_{vu}$. We further assume that the character frequencies are fixed throughout the evolutionary history and are given as $P(x)$.
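The notes leave the form of $P_{x \rightarrow y}(t)$ open. As one concrete possibility, the sketch below uses the Jukes-Cantor model for nucleotides. This choice is an assumption made here for illustration, not something the notes prescribe, and the names `STATES`, `PRIOR` and `transition_prob` are hypothetical. Under this model $P(x) = 1/4$ for every state, and $t$ is measured in expected substitutions per site.

```python
import math

# Assumed for illustration: four nucleotide states with uniform frequencies P(x).
STATES = "ACGT"
PRIOR = {x: 0.25 for x in STATES}


def transition_prob(x: str, y: str, t: float) -> float:
    """Jukes-Cantor probability that state x becomes state y along a branch of length t."""
    decay = math.exp(-4.0 * t / 3.0)
    if x == y:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay


if __name__ == "__main__":
    # Each printed row is the distribution over end states when starting from "A".
    for t in (0.0, 0.1, 1.0):
        print(t, [round(transition_prob("A", y, t), 4) for y in STATES])
```

At $t = 0$ no change is possible, and as $t$ grows every row approaches the stationary frequencies $P(y)$.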

$\begin{problem}Calculate the Likelihood of a Tree.\\ {\bf INPUT:} \begin{itemiz... ...alculate the likelihood $L$\space of the tree: $L = P(M\vert T)$ . \end{problem}$

First, let us deal with a simple case, where there is only one character identifying each species. Since the labels of the internal nodes are unknown, we need to sum over all possible reconstructions. For example, for the tree illustrated in figure 8.11, we can immediately write down the following formula:

$\begin{displaymath} L = P(M\vert T) = \sum_{r} \sum_{v} P(r) \cdot P_{r \rightar... ... P_{v \rightarrow u}(t_{vu}) \cdot P_{v \rightarrow w}(t_{vw}) \end{displaymath}$ (9)

where r and v are possible labels (character values) for the corresponding nodes.

**Figure 8.11:** A simple tree with branch lengths. The likelihood of this tree is calculated in equation (9).

To expand the formula for multiple characters, we simply need to repeat the above calculation for each character separately, and then multiply the results (recall the assumption that the characters are pairwise independent). The general equation is now:

L	=	$\displaystyle P(M\vert T) = \prod_{\rm {character} \ j} P(M_j\vert T)$
	=	$\displaystyle \prod_{\rm {character} \ j} \ \left\{\ \sum_{\rm {reconstruction} \ R} P(M_j,R\vert T)\ \right\}$
	=	$\displaystyle \prod_{\rm {character} \ j} \ \{\ \sum_{\rm {reconstruction} \ R}... ... \prod_{\rm {edge} \ u \rightarrow v} P_{u \rightarrow v}(t_{uv})\ \right] \ \}$	(10)
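The sum over reconstructions can be written out directly for a very small tree. The sketch below is only an illustration under stated assumptions: the topology (root `r` with a leaf `s` and an internal node `v`, which in turn has leaves `u` and `w`), the branch lengths, and the Jukes-Cantor transition probabilities are all chosen here and are not taken from the figure or the text. It enumerates every labeling of the internal nodes, multiplies the root frequency by the transition probability on each edge, and then multiplies the per-character results.

```python
import itertools
import math

STATES = "ACGT"
PRIOR = {x: 0.25 for x in STATES}  # assumed uniform P(x)


def transition_prob(x, y, t):
    """Jukes-Cantor P_{x->y}(t); the choice of model is an assumption."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if x == y else 0.25 - 0.25 * decay


# Hypothetical rooted tree: edges as (parent, child, branch length).
EDGES = [("r", "s", 0.3), ("r", "v", 0.1), ("v", "u", 0.2), ("v", "w", 0.4)]
INTERNAL = ["r", "v"]
ROOT = "r"


def site_likelihood(leaf_states):
    """P(M_j | T): sum over all labelings R of the internal nodes."""
    total = 0.0
    for labels in itertools.product(STATES, repeat=len(INTERNAL)):
        state = dict(leaf_states)
        state.update(zip(INTERNAL, labels))
        term = PRIOR[state[ROOT]]
        for parent, child, t in EDGES:
            term *= transition_prob(state[parent], state[child], t)
        total += term
    return total


def tree_likelihood(sequences):
    """P(M | T): product of the per-character likelihoods."""
    m = len(next(iter(sequences.values())))
    result = 1.0
    for j in range(m):
        result *= site_likelihood({leaf: seq[j] for leaf, seq in sequences.items()})
    return result


if __name__ == "__main__":
    print(tree_likelihood({"s": "AC", "u": "AC", "w": "GC"}))
```

The loop in `site_likelihood` visits $k^{2}$ labelings here; with more internal nodes the count grows exponentially, which is why a more efficient scheme is needed.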

Note: The trees inferred by maximum likelihood appear from this description to be rooted trees. However, if the model of character substitution is reversible, i.e., $P(x)P_{x \rightarrow y}(t) = P(y)P_{y \rightarrow x}(t)$ , then the tree is actually unrooted: the root can be chosen arbitrarily, without any change in the likelihood of the tree.
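To see why reversibility removes the dependence on the root, consider the smallest case: a root $r$ with two leaves labeled $a$ and $b$ at distances $t_1$ and $t_2$. The derivation below is a sketch that additionally assumes the transition probabilities compose over time, i.e., $\sum_{r} P_{a \rightarrow r}(t_1) P_{r \rightarrow b}(t_2) = P_{a \rightarrow b}(t_1 + t_2)$.

```latex
\begin{align*}
L &= \sum_{r} P(r)\, P_{r \rightarrow a}(t_1)\, P_{r \rightarrow b}(t_2) \\
  &= \sum_{r} P(a)\, P_{a \rightarrow r}(t_1)\, P_{r \rightarrow b}(t_2) && \text{(reversibility)} \\
  &= P(a)\, P_{a \rightarrow b}(t_1 + t_2) && \text{(composition over time)}
\end{align*}
```

The result depends only on the sum $t_1 + t_2$, so sliding the root anywhere along the path between the two leaves leaves $L$ unchanged.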

It now remains to show how this calculation can be performed efficiently. The following dynamic-programming ``pruning'' algorithm was introduced by Felsenstein [3].

This approach is possible because of two properties of the likelihood under the Markov model:

Additivity - $P_{x \rightarrow z}(t+s) = \sum_{y} P_{x \rightarrow y}(t) P_{y \rightarrow z}(s)$

Reversibility - $P(x)P_{x \rightarrow y}(t) = P(y)P_{y \rightarrow x}(t)$
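Both properties can be checked numerically for a specific substitution model. The sketch below does so for the Jukes-Cantor model, which is an assumption chosen here for illustration (the notes do not fix a model); the helper names `additivity_gap` and `reversibility_gap` are hypothetical. Each helper returns the largest deviation between the two sides of the corresponding identity.

```python
import math

STATES = "ACGT"
PRIOR = {x: 0.25 for x in STATES}  # assumed uniform P(x)


def transition_prob(x, y, t):
    """Jukes-Cantor P_{x->y}(t); the choice of model is an assumption."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if x == y else 0.25 - 0.25 * decay


def additivity_gap(t, s):
    """Largest |P_{x->z}(t+s) - sum_y P_{x->y}(t) P_{y->z}(s)| over all x, z."""
    return max(
        abs(
            transition_prob(x, z, t + s)
            - sum(transition_prob(x, y, t) * transition_prob(y, z, s) for y in STATES)
        )
        for x in STATES
        for z in STATES
    )


def reversibility_gap(t):
    """Largest |P(x) P_{x->y}(t) - P(y) P_{y->x}(t)| over all x, y."""
    return max(
        abs(PRIOR[x] * transition_prob(x, y, t) - PRIOR[y] * transition_prob(y, x, t))
        for x in STATES
        for y in STATES
    )


if __name__ == "__main__":
    # Both gaps should be zero up to floating-point rounding.
    print(additivity_gap(0.2, 0.5), reversibility_gap(0.7))
```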

Calculating the likelihood of a tree using Dynamic Programming:
For a character j, denote:

$\begin{displaymath}C_j(x,v) = P (\mbox{\ subtree whose root is\ }v \ \vert\ v_j = x \ )\end{displaymath}$

$C_j(x,v)$ is the conditional likelihood of $v$'s subtree, i.e., the probability of everything that is observed at character position $j$ from node $v$ down to the leaves, given that $v$ has the label $x$ at this position.

Initialization:
For each leaf v and state x:

$\begin{displaymath}C_j(x,v) = \left\{ \begin{array}{ll} 1 & v_j = x \\ 0 & \rm {otherwise} \end{array}\right. \end{displaymath}$ (11)
Recursion:
Traverse the tree in postorder; for an internal node v with children u and w, compute for each possible state x:

$\begin{displaymath}C_j(x,v) = \left[ \ \sum_y C_j(y,u) \cdot P_{x \rightarrow y}... ...[ \ \sum_y C_j(y,w) \cdot P_{x \rightarrow y}(t_{vw})\ \right] \end{displaymath}$ (12)
The final solution is:

$\begin{displaymath}L = \prod_{j=1}^{m} \ \left[ \ \sum_x C_j(x,root) \cdot P(x)\ \right] \end{displaymath}$ (13)
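The initialization, recursion and final sum can be put together in a few lines. The following is a minimal sketch, not a reference implementation: the nested-tuple tree encoding, the example tree `TREE`, and the Jukes-Cantor transition probabilities are assumptions introduced here for illustration. The function `conditional` returns the table of $C_j(x,v)$ for one node, and `tree_likelihood` applies the final formula at the root for every character.

```python
import math

STATES = "ACGT"
PRIOR = {x: 0.25 for x in STATES}  # assumed uniform P(x)


def transition_prob(x, y, t):
    """Jukes-Cantor P_{x->y}(t); the choice of model is an assumption."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if x == y else 0.25 - 0.25 * decay


# Hypothetical tree encoding: a leaf is its name (str); an internal node is a
# tuple (left_subtree, t_left, right_subtree, t_right).
TREE = ("s", 0.3, ("u", 0.2, "w", 0.4), 0.1)


def conditional(node, leaf_states):
    """Return {x: C_j(x, node)}, computed by a postorder traversal."""
    if isinstance(node, str):  # initialization at a leaf
        return {x: 1.0 if leaf_states[node] == x else 0.0 for x in STATES}
    left, t_left, right, t_right = node
    c_left = conditional(left, leaf_states)
    c_right = conditional(right, leaf_states)
    # Recursion: one bracketed sum per child, multiplied together.
    return {
        x: sum(c_left[y] * transition_prob(x, y, t_left) for y in STATES)
        * sum(c_right[y] * transition_prob(x, y, t_right) for y in STATES)
        for x in STATES
    }


def tree_likelihood(tree, sequences):
    """L = prod_j sum_x C_j(x, root) * P(x)."""
    m = len(next(iter(sequences.values())))
    likelihood = 1.0
    for j in range(m):
        column = {leaf: seq[j] for leaf, seq in sequences.items()}
        c_root = conditional(tree, column)
        likelihood *= sum(c_root[x] * PRIOR[x] for x in STATES)
    return likelihood


if __name__ == "__main__":
    print(tree_likelihood(TREE, {"s": "AC", "u": "AC", "w": "GC"}))
```

For long sequences the product over characters can underflow in floating point, so a practical version would likely accumulate the sum of log-likelihoods instead.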

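Complexity: For $n$ species, $m$ characters, and $k$ possible states for each character, each node requires, for every character and every state $x$, a sum over $k$ states $y$ per child. We therefore perform $O(m \cdot k^2)$ work at each of the $O(n)$ nodes, so the running time of the algorithm is $O(n \cdot m \cdot k^2)$ .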
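As a rough illustration of these counts (the numbers are chosen here and do not come from the notes), take a rooted binary tree with $n = 10$ species, $m = 1000$ characters and $k = 4$ states. Such a tree has $n - 1 = 9$ internal nodes, and the recursion in equation (12) costs about $2k^2$ multiply-add steps per internal node and character. A direct sum over all reconstructions instead has $k^{n-1}$ terms per character.

```latex
\begin{align*}
\text{pruning:} \quad & (n-1) \cdot m \cdot 2k^2 = 9 \cdot 1000 \cdot 32 = 288{,}000 \ \text{multiply-adds} \\
\text{direct sum:} \quad & m \cdot k^{\,n-1} = 1000 \cdot 4^{9} = 262{,}144{,}000 \ \text{terms}
\end{align*}
```

Each term of the direct sum is itself a product over all $2n - 2 = 18$ edges, so the gap in actual work is larger still, and it widens exponentially as $n$ grows.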

Peer Itsik
2001-01-01