Name | Modified | Size | Downloads / Week
---|---|---|---
GenNonH_MAy2013 | 2013-06-08 | |
GenNonH_share.zip | 2013-06-08 | 5.7 MB |
README | 2012-07-26 | 4.2 kB |
Totals: 3 Items | | 5.7 MB | 0
This is the README for the GenNon-h package for discrete-time data generation.
http://genome.crg.es/cgi-bin/phylo_mod_sel/AlgGenNonH.pl
----------------------------------------------------

This is free software; it can be redistributed and/or modified under the terms of the GNU General Public License.

Dependencies
----------------------------------------------------

We use version 1.47.0 of the Boost library (http://www.boost.org/) for some code involving maths and random number generation (downloaded from http://sourceforge.net/projects/boost/files/boost/1.47.0/).

// Use, modification and distribution are subject to the Boost Software License,
// Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
// http://www.boost.org/LICENSE_1_0.txt).

Overview
----------------------------------------------------

GenNon-h: simulates a multiple sequence alignment (MSA) on a given tree with assigned branch lengths under the discrete-time models.

To compile this code you need a development environment with the GNU gcc compiler.
GenNon-h is compiled with the command 'make'.

Source Code
----------------------------------------------------

include                    A piece of the Boost library.
alignment.cpp              Managing alignments and vectors of counts.
em.cpp                     The EM algorithm, KL-divergence, likelihoods ...
GenNon-h.cpp               The main function.
matrix.cpp                 Matrix creation and destruction.
miscelania.cpp, funs.cpp   Utility functions.
model.cpp                  Keeps the model-dependent functions in a single place.
model_gmm.cpp              The GMM functions.
model_jc.cpp               The JC69 functions.
model_k80.cpp              The K80 functions.
model_k81.cpp              The K81 functions.
model_ssm.cpp              The Strand-Symmetric functions.
Newickform.cpp             Used for reading the tree (adopted from the code of Yu-Wei Wu).
parameters.cpp             Data structure for the parameters.
random.cpp                 The random generation functions, for use in parameter sampling and alignment simulation.
permutation.cpp            To create the DLC matrices.
read_fasta.cpp             Reading the fasta files.
sampling.cpp               Functions used for randomly sampling parameters.
seqUtil.cpp                Used for reading the tree (adopted from the code of Yu-Wei Wu).
state.cpp                  Data structure for the states.
state_list.cpp             Data structure for a list of all states in a tree (on the leaves and the hidden nodes).
tree.cpp                   Data structure for trees (as a list of edges).

GenNon-h
----------------------------------------------------

Command:
  GenNon-h <tree file> <output file> <length> <model>

Simulates fasta alignments with random parameters for a given model and tree topology. The branch lengths in <tree file> are used. The parameters used for the simulations are saved in a file with the same name as the fasta file and the suffix ".dat".

  <tree file>    Tree in Newick format.
  <output file>  Output fasta file for the simulated alignment. WARNING! Overwrites existing files.
  <length>       Length of the alignment.
  <model>        The model: jc, k80, k81, ssm, gmm.

Sample commands:
  ./GenNon-h test2.tree data.fa 10000 k81
  ./GenNon-h star.tree data.fa 5000 jc

Tree format example:
test2.tree:
((human:0.01,ape:0.2,hamster:0.3):0.5,bird:0.4,amoeba:0.7)

Output to the screen:
Model: Kimura 81
Tree:
nodes:   7
nleaves: 5
nedges:  6
Edges:
(5, 6) 0.5
(5, 3) 0.4
(5, 4) 0.7
(6, 0) 0.01
(6, 1) 0.2
(6, 2) 0.3
Labels of the leaves:
human ape hamster bird amoeba
(node labels starting from 0)

The nodes are labeled in the following order: first the leaves, followed by the internal nodes found by a top-down search starting from the node labeled as the root (the node of highest depth). The left-to-right order of the nodes is the one given in the Newick string.

Output files:
  Fasta MSA                              name-of_the_fasta_file.fa
  Parameters used for the simulations    name-of_the_fasta_file.dat

.dat file details:
  Line 1: # of leaves, # of edges
  Line 2: equilibrium frequencies of a node chosen as the root

Note: The order in which the output matrices are listed is in accordance with the order outlined above (the matrices assigned to the leaf edges followed by the top-down listing).
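
Node labeling sketch
----------------------------------------------------

The node-labeling rule described under "Output to the screen" can be illustrated with a small stand-alone C++ sketch. It is not the code in Newickform.cpp or tree.cpp: the names ParsedNode, Edge, LabeledTree, NewickParser and label_tree are hypothetical, and reading "top-down search" as a breadth-first traversal is an assumption. That reading reproduces the edge list shown for test2.tree; for trees with several levels of internal nodes the program itself may order them differently.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <queue>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical types for illustration; these are not GenNon-h's own classes.
struct ParsedNode {
    std::string name;           // leaf name (empty for unnamed internal nodes)
    double length = 0.0;        // length of the branch leading to this node
    std::vector<int> children;  // indices into the parser's node pool
};
struct Edge { int parent; int child; double length; };
struct LabeledTree {
    std::vector<std::string> leaf_names;  // leaf_names[i] belongs to node label i
    std::vector<Edge> edges;              // listed top-down, starting at the root
    int n_nodes = 0;
};

// Minimal recursive-descent parser for strings such as "((a:1,b:2):3,c:4)".
class NewickParser {
public:
    explicit NewickParser(const std::string& text) : s_(text) {}
    // Parses the whole string and returns the pool index of the root.
    int parse() {
        const int root = parse_subtree();
        skip_ws();
        if (pos_ < s_.size() && s_[pos_] == ';') ++pos_;  // trailing ';' is optional here
        skip_ws();
        if (pos_ != s_.size()) throw std::runtime_error("unexpected trailing characters");
        return root;
    }
    const std::vector<ParsedNode>& nodes() const { return pool_; }

private:
    static bool is_delim(char c) {
        return c == '(' || c == ')' || c == ',' || c == ':' || c == ';' ||
               std::isspace(static_cast<unsigned char>(c)) != 0;
    }
    void skip_ws() {
        while (pos_ < s_.size() && std::isspace(static_cast<unsigned char>(s_[pos_]))) ++pos_;
    }
    int parse_subtree() {
        ParsedNode node;
        skip_ws();
        if (pos_ < s_.size() && s_[pos_] == '(') {  // internal node: read its children
            ++pos_;
            for (;;) {
                node.children.push_back(parse_subtree());
                skip_ws();
                if (pos_ >= s_.size()) throw std::runtime_error("unbalanced parentheses");
                const char c = s_[pos_++];
                if (c == ')') break;
                if (c != ',') throw std::runtime_error("expected ',' or ')'");
            }
            skip_ws();
        }
        const std::size_t start = pos_;  // optional node name
        while (pos_ < s_.size() && !is_delim(s_[pos_])) ++pos_;
        node.name = s_.substr(start, pos_ - start);
        skip_ws();
        if (pos_ < s_.size() && s_[pos_] == ':') {  // optional branch length
            std::size_t used = 0;
            node.length = std::stod(s_.substr(pos_ + 1), &used);
            pos_ += 1 + used;
        }
        pool_.push_back(node);  // children are stored before their parent (post-order)
        return static_cast<int>(pool_.size()) - 1;
    }
    std::string s_;
    std::size_t pos_ = 0;
    std::vector<ParsedNode> pool_;
};

// Labels leaves 0..nleaves-1 left to right, then internal nodes breadth-first
// from the root (an assumed reading of "top-down"), and lists the edges.
LabeledTree label_tree(const std::string& newick) {
    NewickParser parser(newick);
    const int root = parser.parse();
    const std::vector<ParsedNode>& pool = parser.nodes();
    LabeledTree out;
    out.n_nodes = static_cast<int>(pool.size());
    std::vector<int> label(pool.size(), -1);
    // Post-order storage means a linear scan meets the leaves left to right.
    for (std::size_t i = 0; i < pool.size(); ++i) {
        if (pool[i].children.empty()) {
            label[i] = static_cast<int>(out.leaf_names.size());
            out.leaf_names.push_back(pool[i].name);
        }
    }
    int next_label = static_cast<int>(out.leaf_names.size());
    std::vector<int> internal_order;
    std::queue<int> pending;
    pending.push(root);
    while (!pending.empty()) {
        const int v = pending.front();
        pending.pop();
        if (pool[v].children.empty()) continue;  // leaves are already labeled
        label[v] = next_label++;
        internal_order.push_back(v);
        for (int c : pool[v].children) pending.push(c);
    }
    for (int v : internal_order) {
        for (int c : pool[v].children) {
            assert(label[v] >= 0 && label[c] >= 0);  // every node was labeled above
            out.edges.push_back(Edge{label[v], label[c], pool[c].length});
        }
    }
    return out;
}
```

Calling label_tree on the test2.tree string gives 7 nodes, 5 leaves (human ape hamster bird amoeba as labels 0 to 4) and 6 edges in the order (5, 6), (5, 3), (5, 4), (6, 0), (6, 1), (6, 2), which agrees with the screen output listed above.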