Gowinda is a multi-threaded Java application that allows an unbiased analysis of gene set enrichment for Genome Wide Association (GWA) studies. Classical analysis of gene set (e.g., Gene Ontology) enrichment assumes that all genes are sampled independently of each other and with the same probability. These assumptions are violated in GWA studies because (i) longer genes typically contain more SNPs and therefore have a higher probability of being sampled, and (ii) overlapping genes are sampled in clusters. Gowinda has been specifically designed to test for enrichment of gene sets in GWA studies. We show that classical Gene Ontology (GO) tests on GWA data can yield a substantial number of false positive GO terms. The permutation tests implemented in Gowinda eliminate these biases while maintaining sufficient power to detect enrichment of GO terms.
This is best illustrated by an example. The following work (PNAS) had to be retracted because overlapping genes were not considered during analysis of gene set enrichment.
http://www.pnas.org/content/111/26/9657
(thanks to Leon French for the link).
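The permutation idea described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not Gowinda's actual implementation: the function names (`genes_hit`, `permutation_p_value`), the gene coordinates, and the SNP positions are all hypothetical. Random SNP sets of the same size as the candidate set are drawn from all SNPs, each set is mapped to genes, and the empirical p-value is the fraction of random sets that hit at least as many genes of the gene set as the candidates do. Because sampling happens at the SNP level, long genes and overlapping genes are hit in the random sets just as often as they would be by chance in the real data.

```python
import random

def genes_hit(snps, genes):
    """Return the ids of all genes whose interval contains at least one SNP."""
    return {gid for gid, (start, end) in genes.items()
            if any(start <= pos <= end for pos in snps)}

def permutation_p_value(candidate_snps, all_snps, genes, gene_set,
                        n_simulations=1000, seed=0):
    """Empirical p-value for enrichment of gene_set among candidate SNPs."""
    rng = random.Random(seed)
    # Number of genes from the gene set that the candidate SNPs fall into.
    observed = len(genes_hit(candidate_snps, genes) & gene_set)
    at_least_as_extreme = 0
    for _ in range(n_simulations):
        # Draw a random SNP set of the same size from the full SNP list.
        sample = rng.sample(all_snps, len(candidate_snps))
        if len(genes_hit(sample, genes) & gene_set) >= observed:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_simulations

# Hypothetical toy data: gene "A" is long, "B" overlaps "A", "C" is short.
genes = {"A": (1, 1000), "B": (900, 1100), "C": (2001, 2050)}
all_snps = list(range(5, 2100, 5))   # every SNP position in the toy genome
candidate_snps = [950, 2010]         # 950 lies in both A and B
gene_set = {"A", "B"}                # e.g. the genes of one GO category

observed, p_value = permutation_p_value(candidate_snps, all_snps,
                                        genes, gene_set)
print(observed, p_value)
```

In this toy data a single SNP at position 950 hits both overlapping genes at once, which is exactly the kind of clustering that a gene-level test would mistake for two independent hits.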
Gowinda can be downloaded from the main page.
The Java code can be obtained with subversion (see the "Code" tab).
Gowinda is written in Java, thus no installation is required (see Tutorial for usage).
Gowinda data sets for humans (+interesting results): http://fromthedata.blogspot.ca/2014/02/axon-guidance-and-snps.html
Gowinda builds heavily upon the work of FuncAssociate2 and GoMiner, so please also cite the appropriate tool when using the corresponding GO association files.
PoPoolation: A pipeline for analyzing pooled next generation sequencing data for single populations. PoPoolation currently calculates Tajima’s Pi, Watterson’s Theta and Tajima’s D for reference sequences using a sliding window approach.
PoPoolation2: Analyzes the population frequencies of SNPs from two or more populations. It may be used to identify differentiation between populations or to analyze data from genome wide association studies.
PoPoolation TE: A quick and simple pipeline for the analysis of transposable element insertion frequencies in populations from pooled next generation sequencing data. PoPoolation TE identifies TE insertions that are present in the reference genome as well as novel TE insertions and estimates their population frequencies. This also allows a comparison of TE insertion frequencies between different populations.
PoPoolation DB: A user-friendly web-based database for the retrieval of natural variation in Drosophila melanogaster.
You may enjoy some background music while using Gowinda