Menu

BioExample

Robert Kofler Christos Vlachos
Attachments
bioexample.png (28074 bytes)

Introduction

In this section we simulate a hypothetical experiment where a Drosophila melanogaster population evolves under starvation. Resistance to starvation is a quantitative trait and in order to emulate a realistic adaptation scenario we choose four targets of selection that confer higher resistance: kekkon-1 in chromosome arm 2L, CG5151 in chromosome arm 3L, pnt in chromosome 3R and Tre1 in chromosome X.
We obtained 205 haplotypes of the DGRP lines (Drosophila Genome Reference Panel; Freeze 2.0), and the recombination rate of Drosophila melanogaster.
We used a female to male ratio of 50:50, a hemizygous X-chromosome in males, no recombination in males and simulated truncating selection for 10 replicates and 40 generations, with 80% of the most starvation resistant individuals surviving truncation. Finally, we identified the selected loci with PoPoolation2 directly using the output of MimicrEE2 .

Materials and Methods

Input files

All input files used for the simulation of the experiment explained above can be found here.
PoPoolation2 scripts can be found here.

Simulations with MimicrEE2

java -jar mim2.jar qt --haplotypes-g0 mimicree_format_dgrp_snps_X_chromosome --recombination-rate comeron_sex_specific  --effect-size effect_size_1locus_X --heritability 1 --snapshots 40 --replicate-runs 10 --output-sync starvation_resistance.sync --threads 8 --selection-regime selection_regime_08 --sex sex-info

cmh-test from Popoolation2

First, we have to unzip the output sync file from MimicrEE2:

gzip -dc starvation_resistance.sync > starvation_resistance.unz.sync

and then perform the cmh test:

perl popoolation2-code/cmh-test.pl --input starvation_resistance.unz.sync --min-count 2 --min-coverage 2 --max-coverage 5000 --population 1-2,3-4,5-6,7-8,9-10,11-12,13-14,15-16,17-18,19-20 --output starvation_resistance.cmh

Plot results

The following R function can be used to produce a Manhattan plot to illustrate the results of the simulations:

##set current working directory
##the function takes three arguments as input: the file containg the results from the cmh test, the file that specifies the targets of selection and the p-value above which results are plotted.

setwd("path/to/inputfiles")

test.cmh<- read.table("starvation_resistance.cmh")
test.snp<- read.table("effect_size_1locus_X")

repl=10
cl.to.rm=3+repl*2
test.cmh<-test.cmh[-c(4:cl.to.rm,25)]


make.mhp<- function(cmh.file, snp.file,minp){

    cmh.file<-subset(cmh.file,V24>=minp)

    chrm<- cmh.file[,1]
    pos<-  cmh.file[,2]
    pval<- cmh.file[,4]

    chosen_snps<-paste(snp.file[,1],snp.file[,2],sep="")
    chosen_matr<- matrix(NA, ncol=4, nrow=length(chosen_snps))


    for(i in 1:length(chosen_snps)){

      chosen_matr[i,1]<-(as.character(cmh.file[which(paste(chrm,pos, sep="")==chosen_snps[i]),1]))
      chosen_matr[i,2]<-(as.character(cmh.file[which(paste(chrm,pos, sep="")==chosen_snps[i]),2]))
      chosen_matr[i,3]<-(as.character(cmh.file[which(paste(chrm,pos, sep="")==chosen_snps[i]),4]))
    }


    chosen_matr[,4]<-"chosen"


    '%!in%' <- function(x,y)!('%in%'(x,y))
    non_chosen_matr<- matrix(NA, ncol=4, nrow=nrow(cmh.file))
    for(i in 1:nrow(cmh.file)){

     chrmpos<- paste(cmh.file[i,1],cmh.file[i,2],sep="")
     if(chrmpos %!in% chosen_snps){
       non_chosen_matr[i,1]<-as.character(cmh.file[i,1])
       non_chosen_matr[i,2]<-cmh.file[i,2]
       non_chosen_matr[i,3]<-cmh.file[i,4]
     }
    }
    non_chosen_matr[,4]<-"non_chosen"
    to.rm<- which(is.na(non_chosen_matr[,1]))   


    non_chosen_matr<- non_chosen_matr[-to.rm,]



    file_for_mhp_non_chosen<-data.frame(chrom=non_chosen_matr[,1], position=as.numeric(non_chosen_matr[,2]), pvalues=as.numeric(non_chosen_matr[,3]), snps=non_chosen_matr[,4])  
    file_for_mhp_chosen<-data.frame(chrom=chosen_matr[,1], position=as.numeric(chosen_matr[,2]),pvalues=as.numeric(chosen_matr[,3]),snps=chosen_matr[,4])  

    file_for_mhp<- rbind(file_for_mhp_non_chosen,file_for_mhp_chosen)  
    print(is.numeric(file_for_mhp[,2]))
    print(is.numeric(file_for_mhp[,3]))


    p<-ggplot(file_for_mhp, aes(x=position, y=pvalues, color=snps, size=snps)) + geom_point()+ scale_color_manual(values=c("grey", "firebrick")) +
      scale_x_continuous(breaks=c(0,5000000,10000000,15000000,20000000,25000000),
                         labels=c("0","5","10","15","20","25"))+
      scale_y_continuous(breaks=c(0,500,1000,1500), labels=c("0","500","1000","1500"))+
      #ylim(30, 1500)+
      theme_bw() + 
      labs(x="chromosome coordinates (Mb)", y=expression(-log[10](P))) + scale_size_manual(values=c(0.3,1))+
      facet_grid(.~chrom, scales="free_x", space="free_x")+
      theme(axis.text.x = element_text(size=12), axis.text.y=element_text(size=12),axis.title.x = element_text(size=12), axis.title.y=element_text(size=12),legend.position="none")
    return(p)
}


mhp<-make.mhp(test.cmh,test.snp,0)

png(file="/path/to/folder/filename.png",width=600,height=400)
print(mhp)
dev.off()

Results


Related

Wiki: Home

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.