MimicrEE2 can simulate de novo mutations. Mutations are introduced only at sites specified in the base population; MimicrEE2 will not generate novel sites!
To illustrate simulations with mutations we introduce novel mutations into a monomorphic base population.
In this example we perform simulations with MimicrEE2 for 100 generations and use a mutation rate of 0.001. We monitor the evolution of the haplotypes.
We store the following base population in the file monomorphic.mimhap:
2L 1 G A/C AA AA AA AA AA AA AA AA AA AA
2L 2 G A/T AA AA AA AA AA AA AA AA AA AA
2L 3 G A/G AA AA AA AA AA AA AA AA AA AA
2L 4 G A/C AA AA AA AA AA AA AA AA AA AA
2L 5 G A/T AA AA AA AA AA AA AA AA AA AA
2L 6 G A/G AA AA AA AA AA AA AA AA AA AA
2L 7 G A/C AA AA AA AA AA AA AA AA AA AA
2L 8 G A/T AA AA AA AA AA AA AA AA AA AA
2L 9 G A/G AA AA AA AA AA AA AA AA AA AA
2L 10 G A/C AA AA AA AA AA AA AA AA AA AA
Note that all sites are monomorphic for the allele A (ancestral). Mutations will then generate the derived allele (C, T or G, depending on the site).
Note that mutations will be introduced on chromosome 2L at sites 1 to 10. NO mutations will ever be introduced at unspecified sites such as positions 11, 12, ...
Note that a mutation of a derived allele restores the ancestral allele; mutations thus merely flip the state, ancestral->derived or derived->ancestral.
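The three notes above can be illustrated with a minimal Python sketch. It is not part of MimicrEE2; the names Site, parse_site and flip are hypothetical, and the sketch assumes that fields are whitespace-separated and that the fourth column lists the ancestral allele first and the derived allele second, as in the example file.

```python
from typing import NamedTuple

class Site(NamedTuple):
    chrom: str
    pos: int
    ref: str
    ancestral: str
    derived: str
    alleles: str  # one character per haplotype, in file order

def parse_site(line: str) -> Site:
    """Parse one haplotype line (assumed layout: chrom pos ref anc/der genotypes...)."""
    fields = line.split()
    chrom, pos, ref, pair = fields[:4]
    ancestral, derived = pair.split("/")
    # Each remaining field is a diploid genotype such as "AA"; joining them
    # yields one character per haplotype.
    return Site(chrom, int(pos), ref, ancestral, derived, "".join(fields[4:]))

def flip(site: Site, index: int) -> Site:
    """Mutate haplotype `index`: ancestral becomes derived and vice versa."""
    old = site.alleles[index]
    new = site.derived if old == site.ancestral else site.ancestral
    updated = site.alleles[:index] + new + site.alleles[index + 1:]
    return site._replace(alleles=updated)

site = parse_site("2L 1 G A/C AA AA AA AA AA AA AA AA AA AA")
mutated = flip(site, 19)     # ancestral A -> derived C on the last haplotype
restored = flip(mutated, 19) # derived C -> ancestral A again
print(mutated.alleles)
print(restored == site)
```

Applying flip twice to the same haplotype returns the original site, which mirrors the statement that mutations only toggle between the two specified alleles.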
In this example we perform neutral forward simulations for 100 generations. The haplotypes of the evolved populations will be stored at generations 10, 20, 30, 40, 50 and 100. We use a mutation rate of 0.001; with 10 sites and 20 haplotypes we expect 0.001 x 10 x 20 = 0.2 novel mutations in the population per generation (or 2 mutations every 10 generations).
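The expectation quoted above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are hypothetical, and it assumes the mutation rate applies per site, per haplotype and per generation, with mutation events independent of each other.

```python
# Values taken from the example: 10 diploid individuals, hence 20 haplotypes.
mutation_rate = 0.001
n_sites = 10
n_haplotypes = 20

def expected_mutations(rate, sites, haplotypes, generations=1):
    """Expected number of mutation events (rate x sites x haplotypes x generations)."""
    return rate * sites * haplotypes * generations

def prob_no_mutation(rate, sites, haplotypes, generations=1):
    """Probability that no mutation occurs at all, assuming independent events."""
    return (1.0 - rate) ** (sites * haplotypes * generations)

per_generation = expected_mutations(mutation_rate, n_sites, n_haplotypes)
per_ten = expected_mutations(mutation_rate, n_sites, n_haplotypes, generations=10)
print(f"expected per generation:     {per_generation:.1f}")
print(f"expected per 10 generations: {per_ten:.1f}")
print(f"P(no mutation in 10 generations): "
      f"{prob_no_mutation(mutation_rate, n_sites, n_haplotypes, 10):.3f}")
```

Because the expectation is small, individual runs can deviate noticeably from it; a snapshot with no new mutation at all is not unusual.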
mkdir output
java -jar mim2.jar w --haplotypes-g0 monomorphic.mimhap --mutation-rate 0.001 --output-dir output --snapshots 10,20,30,40,50,100
haplotypes after 10 generations
After 10 generations, 2 novel mutations are expected on average.
The haplotypes may be displayed with the command gzip -cd output/haplotypes.r1.g10.mimhap.gz
We observe an A->C mutation at site 1 and an A->T mutation at site 5.
2L 1 G A/C AA AA AA AA AA AA AA AA AA CC
2L 2 G A/T AA AA AA AA AA AA AA AA AA AA
2L 3 G A/G AA AA AA AA AA AA AA AA AA AA
2L 4 G A/C AA AA AA AA AA AA AA AA AA AA
2L 5 G A/T AA AA TA AA AA AT AA TA AA AA
2L 6 G A/G AA AA AA AA AA AA AA AA AA AA
2L 7 G A/C AA AA AA AA AA AA AA AA AA AA
2L 8 G A/T AA AA AA AA AA AA AA AA AA AA
2L 9 G A/G AA AA AA AA AA AA AA AA AA AA
2L 10 G A/C AA AA AA AA AA AA AA AA AA AA
haplotypes after 100 generations
After 100 generations the two mutations at sites 1 and 5 have been lost (drift is strong at N=10), but novel mutations have appeared at sites 2 and 8.
2L 1 G A/C AA AA AA AA AA AA AA AA AA AA
2L 2 G A/T AA TT AA TT AT AT TT TT TA TT
2L 3 G A/G AA AA AA AA AA AA AA AA AA AA
2L 4 G A/C AA AA AA AA AA AA AA AA AA AA
2L 5 G A/T AA AA AA AA AA AA AA AA AA AA
2L 6 G A/G AA AA AA AA AA AA AA AA AA AA
2L 7 G A/C AA AA AA AA AA AA AA AA AA AA
2L 8 G A/T TA AA TA AA TA AA AA AA AA AA
2L 9 G A/G AA AA AA AA AA AA AA AA AA AA
2L 10 G A/C AA AA AA AA AA AA AA AA AA AA
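Reading such snapshots by eye gets tedious, so the following Python sketch counts the derived allele at each site. It is an illustration only: derived_counts is a hypothetical helper, the input is a three-row excerpt of the generation-100 haplotypes shown above, and the sketch assumes whitespace-separated fields with the derived allele listed second in the fourth column.

```python
# Excerpt (sites 1, 2 and 8) of the generation-100 snapshot shown above.
snapshot = """\
2L 1 G A/C AA AA AA AA AA AA AA AA AA AA
2L 2 G A/T AA TT AA TT AT AT TT TT TA TT
2L 8 G A/T TA AA TA AA TA AA AA AA AA AA
"""

def derived_counts(text):
    """Map (chrom, pos) to (derived allele count, number of haplotypes)."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip empty or malformed lines
        derived = fields[3].split("/")[1]
        haplotypes = "".join(fields[4:])  # one character per haplotype
        key = (fields[0], int(fields[1]))
        counts[key] = (haplotypes.count(derived), len(haplotypes))
    return counts

for (chrom, pos), (n_derived, n_total) in derived_counts(snapshot).items():
    print(f"{chrom}:{pos} {n_derived}/{n_total}")
```

For this excerpt the derived allele T is carried by 13 of 20 haplotypes at site 2 and by 3 of 20 at site 8, while site 1 is back to the ancestral state.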
In the following walkthrough we demonstrate simulations with mutations and selection.
We use the monomorphic haplotypes from the previous example.
We introduce a selected locus at site 6 (stored in the file sellocus.txt). The allele G has a selective benefit of 0.2 over the allele A:
[s]
2L 6 A/G 0.2 0.5
Note that the beneficial allele G at site 2L:6 does not exist in the base population. It can only appear during the simulation through a de novo mutation.
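To make the effect of such an entry concrete, here is a hedged Python sketch of one common parameterisation of single-locus selection. It is an assumption, not taken from the MimicrEE2 documentation: it treats the two numeric columns as the selection coefficient s = 0.2 and the dominance coefficient h = 0.5, and assigns relative fitness 1, 1 + h*s and 1 + s to the genotypes AA, AG and GG. This is consistent with the statement that G has a benefit of 0.2 over A, but the exact definition used by MimicrEE2 should be checked in its manual. The function genotype_fitness is hypothetical.

```python
def genotype_fitness(genotype: str, beneficial: str = "G",
                     s: float = 0.2, h: float = 0.5) -> float:
    """Relative fitness under the assumed model: 1, 1 + h*s, 1 + s."""
    copies = genotype.count(beneficial)
    if copies == 0:
        return 1.0            # homozygous for the ancestral allele
    if copies == 1:
        return 1.0 + h * s    # heterozygote
    return 1.0 + s            # homozygous for the beneficial allele

for genotype in ("AA", "AG", "GA", "GG"):
    print(genotype, genotype_fitness(genotype))
```

With h = 0.5 the heterozygote lies exactly halfway between the two homozygotes, i.e. the locus is additive (codominant) under this assumed model.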
Next we perform forward simulations for 100 generations using a mutation rate of 0.002 and the selected locus specified above. The haplotypes will be stored every 10 generations in the folder output.
mkdir output
java -jar mim2.jar w --haplotypes-g0 monomorphic.mimhap --mutation-rate 0.002 --output-dir output --snapshots 10,20,30,40,50,60,70,80,90,100 --fitness sellocus.txt
haplotypes at generation 50
By generation 50 the beneficial allele at 2L:6 has not yet appeared. However, a derived allele at 2L:4 has already drifted close to fixation (19 of 20 haplotypes carry C).
2L 1 G A/C CC AC CA AA CC AC AC CC CA CC
2L 2 G A/T AA AA AA AA AA AA AA AA AA AA
2L 3 G A/G AA AA AA AA AA AA AA AA AA AA
2L 4 G A/C CC CC CC CA CC CC CC CC CC CC
2L 5 G A/T AA AA AA AA AA AA AA AA AA AA
2L 6 G A/G AA AA AA AA AA AA AA AA AA AA
2L 7 G A/C CC AC CA AA CC AA AC CC CA AA
2L 8 G A/T AA AA AA AA AA AA AA AA AA AA
2L 9 G A/G AA AA AA AA AA AA AA AA AA AA
2L 10 G A/C AA AA AA AA AA AA AA AA AA AA
haplotypes at generation 90
By generation 90 the beneficial allele at 2L:6 has finally appeared in the population and is carried by 14 of 20 haplotypes.
2L 1 G A/C CC CC CC CC CC CC CC CC CC CC
2L 2 G A/T AA AA AA AA AA AA AA AA AA AA
2L 3 G A/G AA AA AA AG AA AA AA AA AA AA
2L 4 G A/C CC CC CC CC CC CC CC CC CC CC
2L 5 G A/T TA AA AA AA AA TA AA AA AA AA
2L 6 G A/G GG GG AG AA GG GG GA AG GA GG
2L 7 G A/C CC CC CC CC CC CC CC CC CC CC
2L 8 G A/T AA AA TA TT AA AA AA AA AT AA
2L 9 G A/G AA AA AA AA AA AA AA AA AA AA
2L 10 G A/C AA AA AA AA AA AA AA AA AA AA
haplotypes at generation 100
By generation 100 the beneficial allele at 2L:6 is fixed.
2L 1 G A/C CC CC CC CC CC CC CC CC CC CC
2L 2 G A/T AA AA AA AA AA AA AA AA AA AA
2L 3 G A/G AA AA AA AA AA AA AA AA AA AA
2L 4 G A/C CC CC CC CC CC CC CC CC CC CC
2L 5 G A/T TA AA AA TT TA AA AT TT AT AA
2L 6 G A/G GG GG GG GG GG GG GG GG GG GG
2L 7 G A/C CC AA AA CC CA AC CC CC CC CA
2L 8 G A/T AA AA AA AA AA AA AA AA AA AA
2L 9 G A/G AA AA AA AA AA AA AA AA AA AA
2L 10 G A/C AA AA AA AA AA AA AA AA AA AA
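The rise of the beneficial allele can also be followed across all stored snapshots instead of inspecting each file by hand. The Python sketch below is one possible way to do this. The helpers allele_frequency and trajectory are hypothetical; the file name pattern haplotypes.r1.g<generation>.mimhap.gz is taken from the gzip command shown in the first example, and whitespace-separated fields are assumed.

```python
import gzip
from pathlib import Path

def allele_frequency(path, chrom, pos, allele):
    """Frequency of `allele` at chrom:pos in one gzipped haplotype snapshot."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) > 4 and fields[0] == chrom and int(fields[1]) == pos:
                haplotypes = "".join(fields[4:])  # one character per haplotype
                return haplotypes.count(allele) / len(haplotypes)
    raise ValueError(f"site {chrom}:{pos} not found in {path}")

def trajectory(output_dir, generations, chrom="2L", pos=6, allele="G", replicate=1):
    """Map generation -> allele frequency for every snapshot file that exists."""
    result = {}
    for gen in generations:
        # Assumed naming, following the example: haplotypes.r1.g10.mimhap.gz
        path = Path(output_dir) / f"haplotypes.r{replicate}.g{gen}.mimhap.gz"
        if path.exists():
            result[gen] = allele_frequency(path, chrom, pos, allele)
    return result

# Snapshots were requested every 10 generations; missing files are skipped.
for gen, freq in trajectory("output", range(10, 101, 10)).items():
    print(f"generation {gen}: frequency of G at 2L:6 = {freq:.2f}")
```

For the haplotypes shown above, this would report a frequency of 0 at generation 50, 0.7 at generation 90 and 1 at generation 100.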