MimicrEE2 allows to provide the recombination rate for windows having an arbitrary size and genomic location. The recombination rate may be provided as the lambda value of a Poisson distribution, as the recombination fraction or as centi Morgan per Megabasepair. We validated all three approaches.
The recombination rate may be provided as the mean (lambda) of a Poisson distribution. MimicrEE2 uses Poisson distributed random numbers as the number of crossovers in a given window (see [InputFiles]).
We tested whether MimicrEE2 correctly generates Poisson distributed random numbers.
Using the MimicrEE2 core-library we generated 30.000 Poisson distributed random numbers for each of four lambda values (0.5, 1, 4, 10) and compared the obtained distribution of random numbers to the expected one.
In this graph the grey histogram shows the distribution of random numbers generated by MimicrEE2 and the black line the expected distribution. This graph demonstrates that MimicrEE2 correctly generates Poisson distributed random numbers.
MimicrEE2 allows to provide the recombination rate as "recombination fraction" (see [InputFiles]). Assuming a marker at each end of a given window, the recombination fraction is the fraction of gametes with recombined markers. Due to double crossovers the recombination fraction is smaller than the fraction of gametes with at least one cross over.
We thus need a mapping function to translate the recombination fraction into the lambda value of a Poisson distribution.
In MimicrEE2 we use Haldane's (1919) mapping function
m = -0.5*ln(1-2RF)
where RF is the recombination fraction and m the distance (which will be used as the lambda-value of our Poisson distribution).
We tested whether this mapping function provides the correct number of crossovers.
Using the MimicrEE2 core library we first translated the RF into m (using Haldane's mapping function) for 19 RF values ranging from 0.025 to 0.475 (these are the expected recombination fractions).
We than used m as the lambda of a Poisson distribution and generated 10000 Poisson distribute random numbers for each of the 19 RFs. The Poisson random numbers represent the crossovers of a given window.
Finally we computed the observed recombination fraction as the fraction of uneven crossovers (e.g. 1, 3, 5...; zero is excluded).
In the above graph we compare the expected and the observed recombination fractions. A dot represents the mean of 10000 random values and the dashed line the diagonal. This graphic shows that MimicrEE2 accurately reproduces the expected recombination fraction.
MimicrEE2 also allows to provide the recombination rate in centi-Morgan per megabasepair (see [InputFiles]). Haldane's map function is again used to compute the mean value of a Poisson distribution.
We validated this feature by simulating a genomic region of size 1Mb consisting of 8 recombination-windows having various size: 400kb, 250kb, 150kb, 100kb, 50kb, 25kb, 15kb, 10kb. The sum of these 8 windows is exactly one Mb.
An identical recombination rate in cM/Mb was provided to all 8 windows (expected recombination rate). Since the total size of the genomic region is 1Mb the recombination rate in cM/Mb is equivalent to the recombination fraction (in Morgan). To obtain the observed recombination fraction we summed the number of crossover events for the 8 windows and again computed the fraction of unequal crossovers (e.g. 1, 3, 5...; zero is excluded).
In the above graph we compare the expected recombination rate (cM/Mb) and the observed recombination fraction. A dot represents the mean of 10000 random values and the dashed line the diagonal.
This graphic shows that MimicrEE2 accurately reproduces the expected recombination fraction when the recombination rate is provided in centi-Morgan per megabasepair.
The following Java code was used to generate the raw data for validating the recombination rate implemented in MimicrEE2. The entire code is available in the Code section of MimicrEE2 (revision 54 was used).
:::Java
package mim2.test.recombinationTest;
import mimcore.data.misc.Tuple;
import mimcore.data.recombination.RecombinationWindow;
import mimcore.io.recombination.RRRRecFraction;
import mimcore.io.recombination.RRRcMpMb;
import mimcore.io.recombination.RecombinationRateReader;
import java.util.ArrayList;
import java.util.Random;
/**
* Created by robertkofler on 08/11/2017.
*/
public class RecombinationTester {
public static void centiMorgan()
{
double[] rfs={2.5, 5.0, 7.5, 10.0, 12.5, 15.0, 17.5, 20.0, 22.5, 25.0, 27.5, 30.0, 32.5, 35.0, 37.5, 40.0 , 42.5, 45.,47.5};
int n=10000;
Random r=new Random();
ArrayList<Tuple<Double,Integer>> tups=new ArrayList<Tuple<Double,Integer>>();
for(double rf: rfs)
{
RecombinationWindow rw1=new RecombinationWindow(null,0,4000000,RRRcMpMb.haldanetransform(rf,400000));
RecombinationWindow rw2=new RecombinationWindow(null,0,2500000,RRRcMpMb.haldanetransform(rf,250000));
RecombinationWindow rw3=new RecombinationWindow(null,0,1500000,RRRcMpMb.haldanetransform(rf,150000));
RecombinationWindow rw4=new RecombinationWindow(null,0,1000000,RRRcMpMb.haldanetransform(rf,100000));
RecombinationWindow rw5=new RecombinationWindow(null,0,500000,RRRcMpMb.haldanetransform(rf,50000));
RecombinationWindow rw6=new RecombinationWindow(null,0,250000,RRRcMpMb.haldanetransform(rf,25000));
RecombinationWindow rw7=new RecombinationWindow(null,0,150000,RRRcMpMb.haldanetransform(rf,15000));
RecombinationWindow rw8=new RecombinationWindow(null,0,100000,RRRcMpMb.haldanetransform(rf,10000));
for(int i=0; i<n; i++){
int recs=rw1.getRecombinationEvents(r)+
rw2.getRecombinationEvents(r)+
rw3.getRecombinationEvents(r)+
rw4.getRecombinationEvents(r)+
rw5.getRecombinationEvents(r)+
rw6.getRecombinationEvents(r)+
rw7.getRecombinationEvents(r)+
rw8.getRecombinationEvents(r);
tups.add(new Tuple<Double,Integer>(rf,recs));
}
}
RecombinationWriter w=new RecombinationWriter("/Users/robertkofler/dev/mimicree2/theoreyvalidation/recombination/centiMorgan.txt");
w.write(tups);
}
public static void haldanesMapFunction()
{
double[] rfs={0.025,0.05,0.075,0.1,0.125,0.15,0.175,0.2,0.225,0.25,0.275,0.3,0.325,0.35,0.375,0.4,0.425,0.45,0.475};
int n=10000;
Random r=new Random();
ArrayList<Tuple<Double,Integer>> tups=new ArrayList<Tuple<Double,Integer>>();
for(double rf: rfs)
{
double lambda = RRRRecFraction.haldane1919mapFunction(rf);
RecombinationWindow rw=new RecombinationWindow(null,0,1,lambda);
for(int i=0; i<n; i++){
int recs=rw.getRecombinationEvents(r);
tups.add(new Tuple<Double,Integer>(rf,recs));
}
}
RecombinationWriter w=new RecombinationWriter("/Users/robertkofler/dev/mimicree2/theoreyvalidation/recombination/haldane1919.txt");
w.write(tups);
}
public static void generatePoisson()
{
double[] poissonvalues={0.5,1,4,10};
int n=30000;
Random r=new Random();
ArrayList<Tuple<Double,Integer>> tups=new ArrayList<Tuple<Double,Integer>>();
for(double d: poissonvalues)
{
RecombinationWindow rw=new RecombinationWindow(null,0,1,d);
for(int i=0; i<n; i++){
int recs=rw.getRecombinationEvents(r);
tups.add(new Tuple<Double,Integer>(d,recs));
}
}
RecombinationWriter w=new RecombinationWriter("/Users/robertkofler/dev/mimicree2/theoreyvalidation/recombination/poisson.txt");
w.write(tups);
}
}
The following R code was used to generate the plot shown above.
:::R
library(ggplot2)
theme_set(theme_bw())
# compute the expected distribution of random numbers using dpois()
n=30000
xv<-seq(0,20,1)
yv<-dpois(xv,10)
exp<-data.frame(lambda=rep(0.5,21),x=xv, exp=dpois(xv,0.5)*n)
exp<-rbind(exp,data.frame(lambda=rep(1,21),x=xv, exp=dpois(xv,1)*n))
exp<-rbind(exp,data.frame(lambda=rep(4,21),x=xv,exp=dpois(xv,4)*n))
exp<-rbind(exp,data.frame(lambda=rep(10,21),x=xv,exp=dpois(xv,10)*n))
exp$lambda<-as.factor(exp$lambda)
# load the random numbers generated by MimcrEE2
md<-read.table("/Users/robertkofler/dev/mimicree2/theoreyvalidation/recombination/poisson.txt")
names(md)<-c("lambda","value")
md$lambda<-as.factor(md$lambda)
g<-ggplot()+geom_histogram(data=md,aes(value),binwidth = 1,fill="grey")+facet_wrap(~lambda)+geom_line(data=exp,aes(x=x,y=exp))+xlim(-1,20)+xlab("number of cross overs")
pdf("/Users/robertkofler/dev/mimicree2/theoreyvalidation/recombination/poisson.pdf",width=6,height=6)
plot(g)
dev.off()
The following R code was used to generate the plot shown above.
:::R
mod<-function(x,m)
{
# modulo operator
t1<-floor(x/m)
return(x-t1*m)
}
is_recombined<-function(m)
{
# is the number of cross overs uneven
# (double cross overs don't lead to recombination of the markers at
# the end of the window)
x<-mod(m,2)
return(x>0)
}
library(ggplot2)
theme_set(theme_bw())
## load the number of cross overs generated by MimicrEE2
md<-read.table("/Users/robertkofler/dev/mimicree2/theoreyvalidation/recombination/haldane1919.txt")
names(md)<-c("rf_exp","crossovers")
md$is_recombined<-is_recombined(md$crossovers)
mda<-aggregate(is_recombined~rf_exp,data=md,FUN=sum)
mda$rf_obs<-mda$is_recombined/10000
lin<-data.frame(x=c(0,0.5),y=c(0,0.5))
g<-ggplot()+geom_point(data=mda,aes(x=rf_exp,y=rf_obs))+geom_line(data=lin,aes(x=x,y=y),linetype=2,col="grey")+xlab("expected fraction of recombinants")+ylab("observed fraction of recombinants")
pdf("/Users/robertkofler/dev/mimicree2/theoreyvalidation/recombination/haldane1919.pdf",width=5,height=5)
plot(g)
dev.off()
The following R code was used to generate the plot shown above.
:::R
mod<-function(x,m)
{
# modulo operator
t1<-floor(x/m)
return(x-t1*m)
}
is_recombined<-function(m)
{
# is the number of cross overs uneven
# (double cross overs don't lead to recombination of the markers at
# the end of the window)
x<-mod(m,2)
return(x>0)
}
library(ggplot2)
theme_set(theme_bw())
## load the number of cross overs generated by MimicrEE2
md<-read.table("/Users/robertkofler/dev/mimicree2/theoreyvalidation/recombination/centiMorgan.txt")
names(md)<-c("rf_exp","crossovers")
md$is_recombined<-is_recombined(md$crossovers)
mda<-aggregate(is_recombined~rf_exp,data=md,FUN=sum)
mda$rf_obs<-mda$is_recombined/10000
lin<-data.frame(x=c(0,50),y=c(0,0.5))
g<-ggplot()+geom_point(data=mda,aes(x=rf_exp,y=rf_obs))+geom_line(data=lin,aes(x=x,y=y),linetype=2,col="grey")+xlab("expected recombination in cM/Mb")+ylab("observed fraction of recombinants")
pdf("/Users/robertkofler/dev/mimicree2/theoreyvalidation/recombination/centiMorgan.pdf",width=5,height=5)
plot(g)
dev.off()