Project Admins: Martin Turjak

Probability of Synapomorphy - R script

Measuring Syn on Phylogenetic Trees

An R script written by Martin Turjak that implements a newly developed method for evaluating synapomorphy on phylogenetic trees. The method is described in detail in our manuscript "A method for measuring support for synapomorphy using character state distributions on phylogenetic trees" (accepted for publication in Cladistics). All details will be made available here at the time of publication.

Introduction

In our manuscript we propose a method that quantifies the pattern of character state distribution over a cladogram and as such is free of any transformational model. We define a fully synapomorphic character state as one that is shared by all of the clade's terminal taxa and at the same time completely absent from all terminal taxa outside that clade. The extent to which this condition is met quantifies the probability of synapomorphy. This probability is calculated as a combination of relative character state frequencies within and outside the clade, based on the hierarchical structure of the cladogram. It corresponds to the probability of randomly selecting two terminals that share the same character state within the clade and at the same time differ from all outside terminals.
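As a rough illustration of the condition defined above, the following base R sketch checks whether a character state is fully synapomorphic for a clade and reports its relative frequency within and outside that clade. It is not the published probability calculation, which also takes the hierarchical structure of the cladogram into account. The function names, terminal names, and states are hypothetical, and the clade is assumed to be a proper subset of the terminals.

```r
# Hypothetical helpers for illustration only; not part of Syn.R.
# states: named character vector, one character state per terminal taxon
# clade:  character vector naming the terminals that belong to the clade
# state:  the character state to evaluate
state_frequencies <- function(states, clade, state) {
  inside <- names(states) %in% clade
  c(within  = mean(states[inside]  == state),   # share of clade terminals with the state
    outside = mean(states[!inside] == state))   # share of outside terminals with the state
}

# Fully synapomorphic: present in every clade terminal, absent everywhere else.
is_fully_synapomorphic <- function(states, clade, state) {
  f <- state_frequencies(states, clade, state)
  unname(f["within"] == 1 && f["outside"] == 0)
}

# Made-up example: state "1" occurs in clade (A, B) but also in outside terminal E.
states <- c(A = "1", B = "1", C = "0", D = "0", E = "1")
state_frequencies(states, c("A", "B"), "1")
is_fully_synapomorphic(states, c("A", "B"), "1")

# Without terminal E the condition is satisfied.
is_fully_synapomorphic(states[c("A", "B", "C", "D")], c("A", "B"), "1")
```

In this example the state is present in all clade terminals, but its presence in one outside terminal violates the condition. Dropping that terminal makes the state fully synapomorphic for the clade.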

The Method - Documentation

Setup and use. - Download the R script from the link below. Set the working directory to the directory containing the script file and run the script with the following command in the R console:

     > source("Syn.R")

The script uses functions from the ape package, so make sure ape is installed before running the Syn script. If everything loads correctly, you will see a welcome message and be asked to enter the tree file:

     Please enter your tree file:
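If the script fails to load instead, a missing ape installation is a likely cause. A minimal way to check, using only base R, might look like this (has_package is a hypothetical helper name, not part of Syn.R):

```r
# Hypothetical helper: TRUE if the named package can be loaded.
# requireNamespace() is base R and does not attach the package.
has_package <- function(pkg) requireNamespace(pkg, quietly = TRUE)

if (!has_package("ape")) {
  # Only prints a hint; installation is left to the user.
  message('ape is not installed; run install.packages("ape") and source the script again.')
}
```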

Enter the path (relative or absolute) to your tree file (in Newick format). The next steps use the sample files provided with the script, assuming they have been saved directly in the working directory. If the tree file loads successfully you will see this message:

     > tree_1.nwk
     Tree "tree_1.nwk" has been loaded successfuly.
     Please enter your data file:

Enter the path to your data file (see the sample files for the correct formatting).

     > data_1.txt
     Data file "data_1.txt" has been loaded successfuly.
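If a tree file does not load, it can help to parse it with ape directly, outside the script. The sketch below uses a made-up Newick string so that it is self-contained; for a real file, ape::read.tree("tree_1.nwk") would be used instead. How Syn.R itself reads the tree is not shown here and may differ.

```r
# Sketch only: parse a Newick tree with ape and inspect its terminals.
# The tree string is invented for illustration.
if (requireNamespace("ape", quietly = TRUE)) {
  tree <- ape::read.tree(text = "((A,B),(C,D));")
  print(tree$tip.label)   # names of the terminal taxa
  print(ape::Ntip(tree))  # number of terminal taxa
}
```

The terminal names listed this way can then be compared by eye against the taxon names in the data file.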

The results of the calculation for the first apomorphic character state will be plotted as numeric values on the nodes of the tree. The data and tree stay loaded. To plot another character state from the same dataset, use this command (Nu = integer index of the character state):

     > Syn(Nu)

To display the full list of character states and their indices, use:

     > ChStates
       Index ChState
     1     1       0
     2     2       1

To check which data are currently loaded, use:

     > curData()
     Tree: tree_1.nwk
     Data: data_1.txt

The full documentation and method description will be made available shortly.

Download R Script and Example Files

Script: Syn.R (sourceforge.net)

The script is free to use under a GNU GPLv3 license.

The code can be viewed here: Code: Syn.R (sourceforge.net)

Example files:

Used in manuscript:
  • Niphargus tree: niph_tree.nwk (sourceforge.net) (see Fišer et al., 2008)
  • Niphargus data: niph_data.txt (sourceforge.net) (see Fišer et al., 2008)
[character state 12 marked with "e" is the elegans-valachicus morphotype used in manuscript]

[the sarco_tree.nwk file contains 2 trees - uncomment the one you want to load]

References

  • The Niphargus example:

    • Fišer, C., Sket, B., Trontelj, P. 2008. A phylogenetic perspective on 160 years of troubled taxonomy of Niphargus (Crustacea: Amphipoda). Zool. Scr. 37, 665-680.
  • The R Project:

    • R Development Core Team 2008. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
  • Our manuscript:

    • Turjak, M., Trontelj, P. 2012. A method for measuring support for synapomorphy using character state distributions on phylogenetic trees. Cladistics, DOI: 10.1111/j.1096-0031.2012.00403.x

Contact

martin.turjak@gmail.com

