Menu

Drift

Robert Kofler
Attachments
drift.png (21413 bytes)

Simulation of drift

This validation was inspired by the work of Le Rouzic and Capy: http://www.genetics.org/content/169/2/1033
To test if drift is simulated correctly we exploit a basic population genetic insight: the probability of fixation of a neutral singleton in a diploid organism is:

pfix=1/2Ne

For different population sizes we thus expect the following number of fixed loci:

  • for N=250 we expect 20 fixed loci (10000/2*250)
  • for N=500 we expect 10 fixed loci (10000/2*500)
  • for N=1000 we expect 5 fixed loci (10000/2*1000)

To test if our simulations reproduce this fundamental property we randomly distributed 10.000 neutral TE insertions that do NOT transpose in a population. Each TE insertion had a starting frequency of 1/2N. We followed these loci for 20.000 generations and recorded the number of fixed insertions at the end. 100 replicates were used for each population size.

We obtain the following results

  • dashed line: theoretical expectations, ie. number of fixed loci expected for the given population size (top panel)
  • histogram: observed number of fixed loci in 100 replicates for each population size

Conclusion

Invade accurately models genetic drift, including drift at low population frequencies (1/2N)

Material and Methods

Commands used for the simulations

genome="mb:10,10,10,10,10"
cluster="1,1,1,1,1"
rec="cm_mb:4,4,4,4,4"
java -jar invade.jar cluster --min-w 0.1 --genome $genome --rr $rec --cluster $cluster --u 0.0 --N 250 --gen 20000 --basepop seg:10000 --rep 100 --silent --steps 10000 --cluster-insertions 1 --simid "N=250"  > drift-n250.txt &
java -jar invade.jar cluster --min-w 0.1 --genome $genome --rr $rec --cluster $cluster --u 0.0 --N 500 --gen 20000 --basepop seg:10000 --rep 100 --silent --steps 10000 --cluster-insertions 1 --simid "N=500"  > drift-n500.txt &
java -jar invade.jar cluster --min-w 0.1 --genome $genome --rr $rec --cluster $cluster --u 0.0 --N 1000 --gen 20000 --basepop seg:10000 --rep 100 --silent --steps 10000 --cluster-insertions 1 --simid "N=1000"  > drift-n1000.txt &
# at the end
cat drift-n*|awk '$2==20000' > results.txt

R-code for visualizing the output

library(ggplot2)
library(RColorBrewer)
library(plyr)
library(gridExtra)
theme_set(theme_bw())



t<-read.table("results.txt")
names(t)<-c("rep","gen","c1","c2","c3","tes","c4","fix","c5","c6","c7","s2","c8","c9","c10","c11","c12","c13","s3","s4","s5","sid")
t$rep<-as.factor(t$rep)
t$sid <- factor(t$sid, levels=c("N=250", "N=500", "N=1000"))

gl<-ggplot(data=t,aes(x=fix))+geom_histogram(binwidth = 1)+facet_grid(.~sid)

postscript(file="/Users/robertkofler/analysis/tesim-analysis/analysis/2019-panic/graphs.eps",width=5,height=5)
plot(gl)
dev.off()

Related

Wiki: Home

MongoDB Logo MongoDB