This validation was inspired by the work of Le Rouzic and Capy: http://www.genetics.org/content/169/2/1033
To test if drift is simulated correctly we exploit a basic population genetic insight: the probability of fixation of a neutral singleton in a diploid organism is:
pfix=1/2Ne
For different population sizes we thus expect the following number of fixed loci:
To test if our simulations reproduce this fundamental property we randomly distributed 10.000 neutral TE insertions that do NOT transpose in a population. Each TE insertion had a starting frequency of 1/2N. We followed these loci for 20.000 generations and recorded the number of fixed insertions at the end. 100 replicates were used for each population size.
We obtain the following results

Invade accurately models genetic drift, including drift at low population frequencies (1/2N)
genome="mb:10,10,10,10,10"
cluster="1,1,1,1,1"
rec="cm_mb:4,4,4,4,4"
java -jar invade.jar cluster --min-w 0.1 --genome $genome --rr $rec --cluster $cluster --u 0.0 --N 250 --gen 20000 --basepop seg:10000 --rep 100 --silent --steps 10000 --cluster-insertions 1 --simid "N=250" > drift-n250.txt &
java -jar invade.jar cluster --min-w 0.1 --genome $genome --rr $rec --cluster $cluster --u 0.0 --N 500 --gen 20000 --basepop seg:10000 --rep 100 --silent --steps 10000 --cluster-insertions 1 --simid "N=500" > drift-n500.txt &
java -jar invade.jar cluster --min-w 0.1 --genome $genome --rr $rec --cluster $cluster --u 0.0 --N 1000 --gen 20000 --basepop seg:10000 --rep 100 --silent --steps 10000 --cluster-insertions 1 --simid "N=1000" > drift-n1000.txt &
# at the end
cat drift-n*|awk '$2==20000' > results.txt
library(ggplot2)
library(RColorBrewer)
library(plyr)
library(gridExtra)
theme_set(theme_bw())
t<-read.table("results.txt")
names(t)<-c("rep","gen","c1","c2","c3","tes","c4","fix","c5","c6","c7","s2","c8","c9","c10","c11","c12","c13","s3","s4","s5","sid")
t$rep<-as.factor(t$rep)
t$sid <- factor(t$sid, levels=c("N=250", "N=500", "N=1000"))
gl<-ggplot(data=t,aes(x=fix))+geom_histogram(binwidth = 1)+facet_grid(.~sid)
postscript(file="/Users/robertkofler/analysis/tesim-analysis/analysis/2019-panic/graphs.eps",width=5,height=5)
plot(gl)
dev.off()