invade Wiki

Brought to you by: rokofler

Drift

Attachments

drift.png (21413 bytes)

Simulation of drift
- Conclusion
Material and Methods
- Commands used for the simulations
- R-code for visualizing the output

Simulation of drift

This validation was inspired by the work of Le Rouzic and Capy: http://www.genetics.org/content/169/2/1033
To test if drift is simulated correctly we exploit a basic population genetic insight: the probability of fixation of a neutral singleton in a diploid organism is:

pfix=1/2Ne

For different population sizes we thus expect the following number of fixed loci:

for N=250 we expect 20 fixed loci (10000/2*250)
for N=500 we expect 10 fixed loci (10000/2*500)
for N=1000 we expect 5 fixed loci (10000/2*1000)

To test if our simulations reproduce this fundamental property we randomly distributed 10.000 neutral TE insertions that do NOT transpose in a population. Each TE insertion had a starting frequency of 1/2N. We followed these loci for 20.000 generations and recorded the number of fixed insertions at the end. 100 replicates were used for each population size.

We obtain the following results

dashed line: theoretical expectations, ie. number of fixed loci expected for the given population size (top panel)
histogram: observed number of fixed loci in 100 replicates for each population size

Conclusion

Invade accurately models genetic drift, including drift at low population frequencies (1/2N)

Material and Methods

Commands used for the simulations

genome="mb:10,10,10,10,10"
cluster="1,1,1,1,1"
rec="cm_mb:4,4,4,4,4"
java -jar invade.jar cluster --min-w 0.1 --genome $genome --rr $rec --cluster $cluster --u 0.0 --N 250 --gen 20000 --basepop seg:10000 --rep 100 --silent --steps 10000 --cluster-insertions 1 --simid "N=250"  > drift-n250.txt &
java -jar invade.jar cluster --min-w 0.1 --genome $genome --rr $rec --cluster $cluster --u 0.0 --N 500 --gen 20000 --basepop seg:10000 --rep 100 --silent --steps 10000 --cluster-insertions 1 --simid "N=500"  > drift-n500.txt &
java -jar invade.jar cluster --min-w 0.1 --genome $genome --rr $rec --cluster $cluster --u 0.0 --N 1000 --gen 20000 --basepop seg:10000 --rep 100 --silent --steps 10000 --cluster-insertions 1 --simid "N=1000"  > drift-n1000.txt &
# at the end
cat drift-n*|awk '$2==20000' > results.txt

R-code for visualizing the output

library(ggplot2)
library(RColorBrewer)
library(plyr)
library(gridExtra)
theme_set(theme_bw())



t<-read.table("results.txt")
names(t)<-c("rep","gen","c1","c2","c3","tes","c4","fix","c5","c6","c7","s2","c8","c9","c10","c11","c12","c13","s3","s4","s5","sid")
t$rep<-as.factor(t$rep)
t$sid <- factor(t$sid, levels=c("N=250", "N=500", "N=1000"))

gl<-ggplot(data=t,aes(x=fix))+geom_histogram(binwidth = 1)+facet_grid(.~sid)

postscript(file="/Users/robertkofler/analysis/tesim-analysis/analysis/2019-panic/graphs.eps",width=5,height=5)
plot(gl)
dev.off()

Wiki: Home