Stemmatology

Data formats for trees

There are a number of data formats for representing phylogenetic and other kinds of trees. Perhaps the most common one, discussed here as an example, is Newick. It uses nested brackets in a fairly intuitive way: each pair of brackets encloses the comma-separated descendants of one internal node, and the whole string ends with a semicolon. Representations at various levels of detail are possible. For example, the two strings

    (A, (B, (C, D)));
    (A:1.0,(B:0.6,(C:0.2,D:0.1):0.6):0.2);

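encode the same bifurcating tree topology with four labelled leaf nodes, the first without and the second with edge lengths. In the second string, the number after each colon is the length of the edge connecting that node to its parent. The resulting trees are shown in Fig. 1.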

Illustration

 

Fig. 1: Left: A tree without edge lengths. Right: A tree with edges drawn proportional to their lengths. The edge lengths are also shown as numbers. (The figures were drawn using the program FigTree.)
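
The bracket structure maps directly onto a recursive data structure. The following is a minimal sketch in Python of how the two example strings could be read, not a full Newick implementation: it assumes unquoted labels and no comments, and the names Node, parse_newick and topology are chosen here for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One tree node; both the label and the edge length are optional."""
    name: str = ""
    length: Optional[float] = None          # length of the edge to the parent
    children: List["Node"] = field(default_factory=list)


def parse_newick(text: str) -> Node:
    """Parse a simple Newick string (assumes no quoted labels or comments)."""
    s = "".join(text.split())               # drop all whitespace
    if not s.endswith(";"):
        raise ValueError("a Newick string must end with ';'")
    root, pos = _parse_subtree(s, 0)
    if pos != len(s) - 1:
        raise ValueError(f"unexpected character {s[pos]!r} at position {pos}")
    return root


def _parse_subtree(s: str, pos: int) -> Tuple[Node, int]:
    node = Node()
    if s[pos] == "(":
        # Internal node: comma-separated subtrees inside one pair of brackets.
        while True:
            child, pos = _parse_subtree(s, pos + 1)   # step over '(' or ','
            node.children.append(child)
            if s[pos] == ")":
                pos += 1
                break
            if s[pos] != ",":
                raise ValueError(f"expected ',' or ')' at position {pos}")
    # Optional label, followed by an optional ':length'.
    start = pos
    while s[pos] not in "():,;":
        pos += 1
    node.name = s[start:pos]
    if s[pos] == ":":
        start = pos = pos + 1
        while s[pos] not in "(),;":
            pos += 1
        node.length = float(s[start:pos])
    return node, pos


def topology(node: Node):
    """Nested tuples of leaf names; edge lengths are ignored, child order is kept."""
    if not node.children:
        return node.name
    return tuple(topology(child) for child in node.children)


plain = parse_newick("(A, (B, (C, D)));")
timed = parse_newick("(A:1.0,(B:0.6,(C:0.2,D:0.1):0.6):0.2);")
print(topology(plain) == topology(timed))   # True
```

Comparing the nested tuples is order-sensitive, which is sufficient here because both example strings list the nodes in the same order.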

 

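A Newick string is always written starting from one node, which appears as the root. If the tree is unrooted, an arbitrary node is chosen as this starting point, so any node may be specified as the root and the same unrooted tree can be written in several equivalent ways. Files in this format commonly use the extension .nwk. The format is used, for instance, in Phylip, MrBayes and PAUP*.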
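
That the choice of root does not change the underlying unrooted tree can be checked by comparing the bipartitions of the leaf set induced by the internal edges. The following is a rough Python sketch under simplifying assumptions: labels are unquoted and contain no spaces, internal nodes carry no labels, and edge lengths are simply discarded. The function names clades and unrooted_splits are illustrative only.

```python
import re


def clades(newick: str) -> list:
    """Leaf set below every bracket pair, in closing order (the root comes last)."""
    text = re.sub(r":[^(),;]+", "", newick)           # discard edge lengths
    stack = [[]]                                      # child leaf sets per open bracket
    found = []
    for token in re.findall(r"[(),;]|[^(),;\s]+", text):
        if token == "(":
            stack.append([])
        elif token == ")":
            clade = frozenset().union(*stack.pop())
            found.append(clade)
            stack[-1].append(clade)
        elif token not in ",;":                       # a leaf label
            stack[-1].append(frozenset([token]))
    return found


def unrooted_splits(newick: str) -> set:
    """Non-trivial bipartitions of the leaves; independent of the chosen root."""
    found = clades(newick)
    leaves = found[-1]
    return {
        frozenset({clade, leaves - clade})
        for clade in found
        if 2 <= len(clade) <= len(leaves) - 2
    }


# The same unrooted tree, written from three different starting nodes.
same = [unrooted_splits(s) for s in
        ("(A,(B,(C,D)));", "(((A,B),C),D);", "(A,B,(C,D));")]
print(same[0] == same[1] == same[2])                         # True
print(unrooted_splits("((A,C),(B,D));") == same[0])          # False
```

All three strings yield the single split separating A and B from C and D, whereas the last string groups the leaves differently and therefore describes another tree.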

 

TR, PR

 
