Next: Character Based Methods Up: Preface: Phylogenetics and Phylogenetic Previous: What is Phylogenetics?

Phylogenetic Trees

The most convenient way to present phylogenetic information is a phylogenetic tree. In such a tree, every node represents a species. Nodes are labeled either with species names or with the values (also called states) of their characters, and the edges represent the genetic connections between them. Note that leaf nodes and internal nodes usually differ in kind: the leaves represent real species, whereas the internal nodes in most cases represent hypothetical evolutionary ancestors of the species in the data.

Phylogenetic trees take several forms: they can be rooted or unrooted, binary or general, and may or may not show edge lengths. A rooted tree is one in which a single node is designated as the root, which fixes the direction of ancestral relationships. An unrooted tree has no designated root and therefore induces no hierarchy; since its edges are undirected, the distance between nodes should be symmetric. Rooting an unrooted tree involves inserting a new node to serve as the root. This can be done by introducing an outgroup, a species known to be distant from all the species of interest; the proposed root is then the direct predecessor of the outgroup. Figures 8.1 and 8.2 show a rooted tree and its unrooted counterpart, respectively.
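The outgroup rooting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree is given as a list of undirected edges, the outgroup is a leaf, and the function name root_with_outgroup_sketch, the label "root", and the example species labels are all hypothetical, not taken from the lecture.

```python
from collections import defaultdict

def root_with_outgroup_sketch(edges, outgroup, root_label="root"):
    """Root an unrooted tree by inserting a new node on the outgroup's edge.

    edges: list of (u, v) pairs describing the undirected tree.
    outgroup: label of a leaf assumed to be distant from all other species.
    root_label: label for the new root (assumed not to be in use already).
    Returns a dict mapping every node to the sorted list of its children.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    if len(neighbors[outgroup]) != 1:
        raise ValueError("the outgroup must be a leaf of the tree")

    # Split the edge (outgroup, attach) by placing the new root on it,
    # so the root becomes the direct predecessor of the outgroup.
    (attach,) = neighbors[outgroup]
    neighbors[outgroup] = {root_label}
    neighbors[attach].discard(outgroup)
    neighbors[attach].add(root_label)
    neighbors[root_label] = {outgroup, attach}

    # Orient every edge away from the root with a depth-first walk.
    children = {}
    stack = [(root_label, None)]
    while stack:
        node, parent = stack.pop()
        kids = sorted(n for n in neighbors[node] if n != parent)
        children[node] = kids
        stack.extend((kid, node) for kid in kids)
    return children

# Hypothetical unrooted tree: leaves A, B, C, outgroup O, internal nodes x, y.
unrooted_edges = [("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"), ("O", "y")]
rooted = root_with_outgroup_sketch(unrooted_edges, "O")
print(rooted["root"])  # the outgroup and the rest of the tree hang off the root
```

Once the root is in place, every edge acquires a direction (parent to child), which is exactly the hierarchy an unrooted tree lacks.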

A binary, or bifurcating, tree is one in which each node has at most two subnodes (children); in an unrooted tree this means each node has at most three neighbors. It is sometimes useful to allow more than two subnodes (multifurcation), but the discussion in this lecture is limited to binary trees.
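As a small illustration, the two degree conditions can be checked directly. The sketch below assumes a rooted tree stored as a mapping from each node to its list of children, and an unrooted tree stored as a mapping from each node to its set of neighbors; both representations and all names are hypothetical choices made for this example.

```python
def is_binary_rooted(children):
    """True if every node of the rooted tree has at most two children."""
    return all(len(kids) <= 2 for kids in children.values())

def is_binary_unrooted(neighbors):
    """True if every node of the unrooted tree has at most three neighbors."""
    return all(len(adjacent) <= 3 for adjacent in neighbors.values())

# A bifurcating rooted tree and a multifurcating one (three children at the root).
bifurcating = {"root": ["x", "C"], "x": ["A", "B"], "A": [], "B": [], "C": []}
multifurcating = {"root": ["A", "B", "C"], "A": [], "B": [], "C": []}

print(is_binary_rooted(bifurcating))     # True
print(is_binary_rooted(multifurcating))  # False
```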

A tree can show edge lengths, which indicate the genetic distance between the connected nodes. We sometimes assume the existence of a molecular clock, that is, a constant pace of evolutionary change. Under this assumption we could, in theory, produce a distance-preserving phylogenetic tree laid out along a time axis, assigning to each node the time at which it "occurred" in the history of evolution. In such a "perfect" tree, the length of each edge is the difference in time between the parent node and the child node.
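To make the time-axis picture concrete, the following sketch derives edge lengths from node times and then checks that all leaves end up equally far from the root, which is what one would expect when all leaves are present-day species. It assumes each node's time is given as an age (time before the present, in arbitrary units); the ages, node names, and function names are invented for illustration.

```python
def edge_lengths_from_ages(children, age):
    """Edge length = age of the parent minus age of the child."""
    lengths = {}
    for parent, kids in children.items():
        for kid in kids:
            length = age[parent] - age[kid]
            if length < 0:
                raise ValueError(f"{kid} is older than its parent {parent}")
            lengths[(parent, kid)] = length
    return lengths

def root_to_leaf_distances(children, lengths, root):
    """Sum the edge lengths along the path from the root to every leaf."""
    distances = {}
    stack = [(root, 0.0)]
    while stack:
        node, dist = stack.pop()
        kids = children.get(node, [])
        if not kids:
            distances[node] = dist
        for kid in kids:
            stack.append((kid, dist + lengths[(node, kid)]))
    return distances

# Hypothetical rooted tree with present-day leaves (age 0).
children = {"root": ["x", "C"], "x": ["A", "B"], "A": [], "B": [], "C": []}
age = {"root": 5.0, "x": 2.0, "A": 0.0, "B": 0.0, "C": 0.0}

lengths = edge_lengths_from_ages(children, age)
print(lengths[("root", "x")])  # 3.0
print(root_to_leaf_distances(children, lengths, "root"))  # every leaf is 5.0 from the root
```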

There are two types of data used for building phylogenetic trees:

Distance-based: The input is a matrix of pairwise distances between the species (e.g., derived from the alignment score between them or from the fraction of residues they agree on).
Character-based: The input is the state of each character in every species (e.g., the base at a specific position in the DNA), and each character is examined separately.
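The difference between the two kinds of input can be illustrated on a toy alignment. The sketch below makes several assumptions that are not in the text: the sequences are already aligned and of equal length, the distance is taken to be the fraction of positions at which two sequences differ (one minus the fraction they agree on), and the species names and sequences are invented.

```python
def mismatch_fraction(a, b):
    """Fraction of aligned positions at which sequences a and b differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def distance_matrix(seqs):
    """Distance-based input: one number for every pair of species."""
    names = sorted(seqs)
    return {(p, q): mismatch_fraction(seqs[p], seqs[q])
            for p in names for q in names}

def character_columns(seqs):
    """Character-based input: one column of states per aligned position."""
    names = sorted(seqs)
    length = len(seqs[names[0]])
    return [{name: seqs[name][i] for name in names} for i in range(length)]

# Hypothetical aligned DNA fragments for three species.
seqs = {"A": "ACGT", "B": "ACGA", "C": "TCGA"}

dist = distance_matrix(seqs)
print(dist[("A", "B")], dist[("A", "C")])  # 0.25 0.5
print(character_columns(seqs)[0])          # states of the first character
```

The distance matrix summarizes each pair of species in a single number, while the character columns keep the state of every position, which is what character-based methods work with.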

Peer Itsik