| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| readme.RHMM_snp_allocate | 2018-04-06 | 4.1 kB | |
| RecHMM_v1.01.R | 2014-11-11 | 18.6 kB | |
| RecHMM.R | 2014-10-30 | 18.6 kB | |
| RHMM_snp_allocate_old.R | 2014-10-24 | 7.1 kB | |
| RHMM_snp_allocate.R | 2014-10-24 | 7.5 kB | |
| RecHMM.documents.pdf | 2014-04-04 | 101.3 kB | |
| Totals: 6 Items | | 157.1 kB | 0 |
*** IMPORTANT ***
A newer and more powerful version of RecHMM is now available as a module in the EToKi package.
It is available at: https://github.com/zheminzhou/EToKi
-----------
RHMM_snp_allocate.R is a program for assigning SNPs to branches in a known phylogeny, using the Viterbi algorithm.
USAGE:
Rscript RHMM_snp_allocate.R <file.newick> <file.SNP> <proportion of sites for the phylogeny> <file.REC> <rec.diversity>
<file.newick> The known phylogeny in NEWICK format
<file.SNP> A list of all SNPs in the analysed genomes
<proportion of sites for the phylogeny> Most bioinformaticians use only polymorphic sites to build the phylogeny. This magnifies the branch lengths, because all non-polymorphic sites are ignored.
For example, if you used 10,000 polymorphic sites from a core genome of 1,000,000 bp to build the tree, this parameter is 10000/1000000 = 0.01
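The value can be computed directly from the two counts. The following is a minimal Python sketch; the function name `phylogeny_site_proportion` is illustrative only and is not part of RHMM_snp_allocate.R.

```python
def phylogeny_site_proportion(n_sites_in_tree, core_genome_length):
    """Fraction of core-genome sites that were used to build the tree.

    Hypothetical helper for illustration; not part of RHMM_snp_allocate.R.
    """
    if core_genome_length <= 0:
        raise ValueError("core genome length must be positive")
    if not 0 < n_sites_in_tree <= core_genome_length:
        raise ValueError("number of tree sites must be in (0, core genome length]")
    return n_sites_in_tree / core_genome_length


# Example from the text: 10,000 polymorphic sites in a 1,000,000 bp core genome.
proportion = phylogeny_site_proportion(10_000, 1_000_000)
print(proportion)  # 0.01
```

The result is the number to pass as the third command-line argument.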
The following parameters are optional:
<file.REC> Recombination can alter the local branch lengths. You can take this into account by supplying a file that lists the recombinant regions.
<rec.diversity> The average nucleotide diversity in recombinant regions. You can get this value as 'nu' in RecHMM output.
------------------------------
INPUTS:
<file.newick>
A minimal example:
((1:0.01,2:0.01):0.01,3:0.02);
<file.SNP>
Format:
#Site <genome 1> <genome 2> ...
<site coordinate> <base> <base> ...
...
Example:
#Site 1 2 3
3 A T T
10 A A G
103 C G G
...
Note: The genome names in the first line must match the tip names in <file.newick>. All columns are separated by tabs ('\t').
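Before running the script, it can help to check that the SNP table is well formed and that its genome names agree with the tree tips. The Python sketch below is one way to do this, not the check the script itself performs. The helper names `newick_tips` and `read_snp_table` are hypothetical, and the tip extraction assumes a simple Newick string without quoted labels or comments.

```python
import io
import re


def newick_tips(newick):
    """Tip labels of a simple Newick string (assumes no quoted labels or comments)."""
    # A tip label directly follows "(" or ","; internal-node labels follow ")".
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def read_snp_table(handle):
    """Parse the tab-separated SNP table into (genome_names, {site: bases})."""
    header = handle.readline().rstrip("\n").split("\t")
    if header[0] != "#Site":
        raise ValueError("first column of the header must be '#Site'")
    genomes = header[1:]
    snps = {}
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(header):
            raise ValueError(f"wrong number of columns: {line!r}")
        snps[int(fields[0])] = fields[1:]
    return genomes, snps


# The example tree and SNP table from this README.
tree = "((1:0.01,2:0.01):0.01,3:0.02);"
snp_text = "#Site\t1\t2\t3\n3\tA\tT\tT\n10\tA\tA\tG\n103\tC\tG\tG\n"

genomes, snps = read_snp_table(io.StringIO(snp_text))
# Names present in only one of the two files; an empty set means they agree.
mismatched = set(genomes) ^ set(newick_tips(tree))
print(mismatched)
```

For real files, replace `io.StringIO(snp_text)` with an open file handle and read the tree string from <file.newick>.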
<file.REC>
Format:
<start_site> <end_site> <not used> <not used> <branch name>
Example:
2577 2609 0.688236769030142 33 Br_1
7971 7990 0.916113340873472 20 Br_1
16493 16516 0.872056771815245 24 Br_2
19101 19152 0.997786721510221 52 Br_2
21976 22059 0.975300372805246 84 Br_2
23790 23987 0.964951921855543 87 Br_2
29670 29726 0.997294595078072 54 Br_4
30157 30250 0.999985008675643 94 Br_4
This format is compatible with the outputs of RecHMM.
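To inspect a file in this layout, the regions can be grouped by branch. The Python sketch below assumes the columns are separated by whitespace and that only columns 1, 2 and 5 matter, as described in the format line above. `read_rec_regions` is a hypothetical helper, not part of RecHMM or RHMM_snp_allocate.R.

```python
import io
from collections import defaultdict


def read_rec_regions(handle):
    """Return {branch name: [(start_site, end_site), ...]} from a <file.REC>-style table."""
    regions = defaultdict(list)
    for line in handle:
        fields = line.split()  # assumption: whitespace-separated columns
        if not fields:
            continue  # skip blank lines
        if len(fields) < 5:
            raise ValueError(f"expected 5 columns: {line!r}")
        start, end, branch = int(fields[0]), int(fields[1]), fields[4]
        if start > end:
            raise ValueError(f"start site is after end site: {line!r}")
        regions[branch].append((start, end))
    return dict(regions)


# First three rows of the example above.
rec_text = """2577 2609 0.688236769030142 33 Br_1
7971 7990 0.916113340873472 20 Br_1
16493 16516 0.872056771815245 24 Br_2
"""
regions = read_rec_regions(io.StringIO(rec_text))
for branch, spans in regions.items():
    print(branch, spans)
```

Columns 3 and 4 are read but ignored, which matches their "not used" status in the format description.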
--------------------------
OUTPUTS:
Two files are generated:
<file.newick>.annote.nex The same tree as the input, with each branch labelled with a serial ID, brID="Br_xxxx". This file can be opened in FigTree.
<file.newick>.events The assignments of SNPs to branches.
Format for <file.newick>.events:
<Site of SNP> <brID> <bipartition information> <nucleotide change>
Note:
To convert this file into input for RecHMM, simply extract the first two columns. A homoplasy appears as multiple rows with the same site.
Example:
48 Br_108 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAABBBBBBBBB T->A
48 Br_110 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB T->G
62 Br_15 BBBBBBBBBBBBBBBBBBBBBBBABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB G->A
62 Br_91 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAABAAAAAAA G->A
62 Br_92 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAAAAAAAAAAAABBBAAAAAA A->G
62 Br_108 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAABBBBBBBBB G->C
62 Br_109 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAAAAAAAAAAAAABAAAAAAAAABBBBBBBBBBBBABABBBBBB A->G
62 Br_122 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB G->A
62 Br_127 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBABABABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB A->G
62 Br_128 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBABABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB G->A
62 Br_52 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB G->A
66 Br_15 BBBBBBBBBBBBBBBBBBBBBBBABBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB G->A
66 Br_91 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAABAAAAAAA G->A
66 Br_92 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAAAAAAAAAAAABBBAAAAAA A->G
66 Br_94 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBAABBBBBBBBBBB G->T
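The conversion described in the note above (keep the first two columns) and the identification of homoplasies (sites that occur in more than one row) can be sketched in Python as follows. The function names are hypothetical, the input rows are toy data with shortened bipartition strings, and writing the output as tab-separated text is an assumption about what RecHMM accepts.

```python
import io
from collections import defaultdict


def events_to_site_branch_pairs(handle):
    """Keep only <Site of SNP> and <brID> from a .events table."""
    pairs = []
    for line in handle:
        fields = line.split()
        if len(fields) >= 2:
            pairs.append((int(fields[0]), fields[1]))
    return pairs


def homoplasic_sites(pairs):
    """Sites that occur in more than one row, with the branches involved."""
    by_site = defaultdict(list)
    for site, branch in pairs:
        by_site[site].append(branch)
    return {site: branches for site, branches in by_site.items() if len(branches) > 1}


# Toy rows for illustration; the bipartition strings are shortened.
events_text = (
    "48\tBr_108\tBBAAB\tT->A\n"
    "48\tBr_110\tBAABB\tT->G\n"
    "70\tBr_15\tBBBAB\tG->A\n"
)
pairs = events_to_site_branch_pairs(io.StringIO(events_text))

# Two-column output (assumed tab-separated).
for site, branch in pairs:
    print(f"{site}\t{branch}")

# Site 48 was assigned to two branches, so it is reported as a homoplasy.
print(homoplasic_sites(pairs))
```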