<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Recent changes to Manual</title><link>https://sourceforge.net/p/gowinda/wiki/Manual/</link><description>Recent changes to Manual</description><atom:link href="https://sourceforge.net/p/gowinda/wiki/Manual/feed" rel="self"/><language>en</language><lastBuildDate>Sun, 27 Aug 2017 16:50:33 -0000</lastBuildDate><atom:link href="https://sourceforge.net/p/gowinda/wiki/Manual/feed" rel="self" type="application/rss+xml"/><item><title>Manual modified by Robert Kofler</title><link>https://sourceforge.net/p/gowinda/wiki/Manual/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Robert Kofler</dc:creator><pubDate>Sun, 27 Aug 2017 16:50:33 -0000</pubDate><guid>https://sourceforge.netbd5b08c420f337cefc116095d9f7b2691a138083</guid></item><item><title>Manual modified by Robert Kofler</title><link>https://sourceforge.net/p/gowinda/wiki/Manual/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v1
+++ v2
@@ -1,24 +1,4 @@
-  * Introduction
-  * Requirements
-    * Annotation (.gtf)
-    * Gene set
-    * Total SNP file
-    * Candidate SNP file
-  * Gowinda
-    * Parameters
-      * SNP to gene mapping
-      * simulations
-    * Algorithm
-      * Analysis mode SNP
-      * Analysis mode gene
-      * Calculate the p-value of gene set enrichment
-      * FDR correction
-    * Output
-  * Scripts
-    * Gff2Gtf .py
-    * Gominer2FuncAssociate .py
-  * Use a custom pathway database
-  * Links
+[TOC]

 # Introduction

@@ -189,12 +169,13 @@
 ### Calculate the p-value of gene set enrichment

 After the simulations have been completed, Gowinda derives an empirical null distribution of gene abundance for every gene set. An example of such an empirical null distribution for a single gene set and 1 million simulations may look like in the following example: 
-
-![](http://gowinda.googlecode.com/files/gowinda_edist.png)
+[[img src=gowinda_edist.png]]
+

 Let the number of simulations be `S`, the gene set for which the p-value should be calculated be `g`, the number of genes found for the given gene set within a single simulation be cgs and the number of candidate genes be cgcand than the p-value of enrichment (pg) for the given gene set can be calculated according to the following equation: 

-![](http://gowinda.googlecode.com/files/equation_pvalue.png)
+[[img src=equation_pvalue.png]]
+

 In the example shown above cgcand would be 18 genes, `S` would be 1 million simulations and the sum of cgs larger or equal than 18 (cgcand) about 47.000. Thus the p-value pg for `intracellular protein transport` would be 47.000/1.000.000 = 0.047 

@@ -206,7 +187,8 @@

 In detail, Gowinda implements the following algorithm for calculating the FDR: Let `G` be the number of gene sets and `S` the number of simulations. Than an observed (Robs) and an expected (Rexp) count of gene sets having p-values smaller or equal than a threshold (`P` = the p-value which should be FDR corrected) can be computed. The FDR can subsequently simply be calculated by dividing Rexp by Robs. 

-![](http://gowinda.googlecode.com/files/fdr_equations.png)
+[[img src=fdr_equations.png]]
+

 ## Output

&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Robert Kofler</dc:creator><pubDate>Sun, 27 Aug 2017 16:49:43 -0000</pubDate><guid>https://sourceforge.net7f40770a26e2a05f21601d4be4afa23594d5603b</guid></item><item><title>Manual modified by Anonymous</title><link>https://sourceforge.net/p/gowinda/wiki/Manual/</link><description>&lt;div class="markdown_content"&gt;&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;Requirements&lt;ul&gt;
&lt;li&gt;Annotation (.gtf)&lt;/li&gt;
&lt;li&gt;Gene set&lt;/li&gt;
&lt;li&gt;Total SNP file&lt;/li&gt;
&lt;li&gt;Candidate SNP file&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Gowinda&lt;ul&gt;
&lt;li&gt;Parameters&lt;/li&gt;
&lt;li&gt;SNP to gene mapping&lt;/li&gt;
&lt;li&gt;simulations&lt;/li&gt;
&lt;li&gt;Algorithm&lt;/li&gt;
&lt;li&gt;Analysis mode SNP&lt;/li&gt;
&lt;li&gt;Analysis mode gene&lt;/li&gt;
&lt;li&gt;Calculate the p-value of gene set enrichment&lt;/li&gt;
&lt;li&gt;FDR correction&lt;/li&gt;
&lt;li&gt;Output&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Scripts&lt;ul&gt;
&lt;li&gt;Gff2Gtf .py&lt;/li&gt;
&lt;li&gt;Gominer2FuncAssociate .py&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Use a custom pathway database&lt;/li&gt;
&lt;li&gt;Links&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id="introduction"&gt;Introduction&lt;/h1&gt;
&lt;p&gt;Gowinda is a multi-threaded Java application that allows an unbiased analysis of gene set enrichment for Genome Wide Association Studies. Classical analysis of gene set (e.g.: Gene Ontology) enrichment assumes that all genes are sampled independently from each other with the same probability. These assumptions are violated in Genome Wide Association (GWA) studies since (i) longer genes typically have more SNPs resulting in a higher probability of being sampled and (ii) overlapping genes are sampled in clusters. Gowinda has been specifically designed to test for enrichment of gene sets in GWA studies. We show that Gene Ontology (GO) tests on GWA data could result in a substantial number of false positive GO terms. Permutation tests implemented in Gowinda eliminate these biases, but maintain sufficient power to detect enrichment of GO terms. &lt;/p&gt;
&lt;p&gt;For a validation of Gowinda please see: &lt;a href="http://code.google.com/p/gowinda/wiki/Validation" rel="nofollow"&gt;http://code.google.com/p/gowinda/wiki/Validation&lt;/a&gt;. In the validation we show that Gowinda yields highly reliable results (by comparison with GoMiner) and efficiently corrects for the gene length bias while still identifying significantly overrepresented GO categories. We also demonstrate that the gene length bias has a tremendous influence on the GO analysis, potentially causing a substantial number of false positive GO categories. &lt;/p&gt;
&lt;h1 id="requirements"&gt;Requirements&lt;/h1&gt;
&lt;p&gt;Gowinda requires the following software: &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Java 6 or higher &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Furthermore, the following input files are required: &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a file containing the annotation of the genome in .gtf &lt;/li&gt;
&lt;li&gt;a gene set file, containing for every gene set (e.g.: GO category) a list of the associated gene IDs &lt;/li&gt;
&lt;li&gt;a file containing the total set of SNPs &lt;/li&gt;
&lt;li&gt;a file containing the candidate SNPs &lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="annotation-gtf"&gt;Annotation (.gtf)&lt;/h2&gt;
&lt;p&gt;The annotation must be in the &lt;code&gt;.gtf&lt;/code&gt; format: &lt;a href="http://mblab.wustl.edu/GTF22.html" rel="nofollow"&gt;http://mblab.wustl.edu/GTF22.html&lt;/a&gt;. Gowinda, however, only requires the attribute &lt;code&gt;gene_id&lt;/code&gt;; the attribute &lt;code&gt;transcript_id&lt;/code&gt; is optional. The following is an example of a minimal annotation file: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="mi"&gt;2L&lt;/span&gt;      &lt;span class="n"&gt;FlyBase&lt;/span&gt; &lt;span class="n"&gt;exon&lt;/span&gt;    &lt;span class="mi"&gt;8193&lt;/span&gt;    &lt;span class="mi"&gt;9484&lt;/span&gt;    &lt;span class="p"&gt;.&lt;/span&gt;       &lt;span class="o"&gt;+&lt;/span&gt;       &lt;span class="p"&gt;.&lt;/span&gt;       &lt;span class="n"&gt;gene_id&lt;/span&gt; &lt;span class="s"&gt;"CG11023"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;      &lt;span class="n"&gt;FlyBase&lt;/span&gt; &lt;span class="n"&gt;exon&lt;/span&gt;    &lt;span class="mi"&gt;8193&lt;/span&gt;    &lt;span class="mi"&gt;8589&lt;/span&gt;    &lt;span class="p"&gt;.&lt;/span&gt;       &lt;span class="o"&gt;+&lt;/span&gt;       &lt;span class="p"&gt;.&lt;/span&gt;       &lt;span class="n"&gt;gene_id&lt;/span&gt; &lt;span class="s"&gt;"CG11023"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;      &lt;span class="n"&gt;FlyBase&lt;/span&gt; &lt;span class="n"&gt;exon&lt;/span&gt;    &lt;span class="mi"&gt;8668&lt;/span&gt;    &lt;span class="mi"&gt;9484&lt;/span&gt;    &lt;span class="p"&gt;.&lt;/span&gt;       &lt;span class="o"&gt;+&lt;/span&gt;       &lt;span class="p"&gt;.&lt;/span&gt;       &lt;span class="n"&gt;gene_id&lt;/span&gt; &lt;span class="s"&gt;"CG11023"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;      &lt;span class="n"&gt;FlyBase&lt;/span&gt; &lt;span class="n"&gt;exon&lt;/span&gt;    &lt;span class="mi"&gt;9839&lt;/span&gt;    &lt;span class="mi"&gt;11344&lt;/span&gt;   &lt;span class="p"&gt;.&lt;/span&gt;       &lt;span class="o"&gt;-&lt;/span&gt;       &lt;span class="p"&gt;.&lt;/span&gt;       &lt;span class="n"&gt;gene_id&lt;/span&gt; &lt;span class="s"&gt;"CG2671"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;      &lt;span class="n"&gt;FlyBase&lt;/span&gt; &lt;span class="n"&gt;exon&lt;/span&gt;    &lt;span class="mi"&gt;11410&lt;/span&gt;   &lt;span class="mi"&gt;11518&lt;/span&gt;   &lt;span class="p"&gt;.&lt;/span&gt;       &lt;span class="o"&gt;-&lt;/span&gt;       &lt;span class="p"&gt;.&lt;/span&gt;       &lt;span class="n"&gt;gene_id&lt;/span&gt; &lt;span class="s"&gt;"CG2671"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Furthermore, the feature column (column 3) must contain either &lt;code&gt;exon&lt;/code&gt; or &lt;code&gt;CDS&lt;/code&gt;; all other features are ignored. Column 2 (source), column 6 (score), and column 8 (offset, called frame in the GTF specification) have no influence on the results. &lt;/p&gt;
&lt;p&gt;The entries do not need to be unique: Gowinda internally reduces multiple copies of the same entry to a single one. &lt;/p&gt;
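&lt;p&gt;The rules above can be sketched in a few lines of Python. This is only an illustration and not Gowinda's actual (Java) implementation: the function name &lt;code&gt;read_gtf_entries&lt;/code&gt;, the tuple layout, and the decision to silently skip malformed lines are assumptions made for this example. The sketch keeps only &lt;code&gt;exon&lt;/code&gt; and &lt;code&gt;CDS&lt;/code&gt; features, extracts the &lt;code&gt;gene_id&lt;/code&gt; attribute, and collapses duplicate entries: &lt;/p&gt;

```python
import re

# Hypothetical helper, not part of Gowinda: it only mirrors the rules
# described in the manual (exon/CDS features, gene_id attribute, duplicates).
GENE_ID = re.compile(r'gene_id "([^"]+)"')
ACCEPTED_FEATURES = {"exon", "CDS"}


def read_gtf_entries(lines):
    """Return the unique (chrom, feature, start, end, strand, gene_id) entries."""
    entries = set()  # a set keeps a single copy of repeated entries
    for line in lines:
        line = line.rstrip("\n")
        if line == "" or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # assumption: skip lines that are not 9-column GTF
        chrom, _source, feature, start, end, _score, strand, _frame, attrs = cols
        if feature not in ACCEPTED_FEATURES:
            continue  # all other features are ignored
        match = GENE_ID.search(attrs)
        if match is None:
            continue  # gene_id is the only required attribute
        entries.add((chrom, feature, int(start), int(end), strand, match.group(1)))
    return entries
```

&lt;p&gt;Source, score and frame are read but never used, which reflects the statement that these columns have no influence on the results. &lt;/p&gt;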
&lt;p&gt;Gowinda does not support the &lt;code&gt;.gff&lt;/code&gt; format, as it is less transparent than the &lt;code&gt;.gtf&lt;/code&gt; format (e.g.: in order to obtain the gene ID for a given exon it is necessary to traverse the hierarchy exon -&amp;gt; mRNA -&amp;gt; gene) and has several sources of inconsistency (exons without a parent mRNA, exons with several parents). We do, however, provide a script for converting a &lt;code&gt;.gff&lt;/code&gt; file into a &lt;code&gt;.gtf&lt;/code&gt; file: &lt;/p&gt;
&lt;p&gt;&lt;a href="http://gowinda.googlecode.com/files/Gff2Gtf.py" rel="nofollow"&gt;http://gowinda.googlecode.com/files/Gff2Gtf.py&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The user may thus ensure that the annotation is correctly converted before providing it to Gowinda. &lt;/p&gt;
&lt;h2 id="gene-set"&gt;Gene set&lt;/h2&gt;
&lt;p&gt;The gene set file for Gene Ontology terms may be directly obtained from the download section of FuncAssociate2 (&lt;a href="http://llama.mshri.on.ca/funcassociate/download_go_associations" rel="nofollow"&gt;http://llama.mshri.on.ca/funcassociate/download_go_associations&lt;/a&gt;) or indirectly from GoMiner (see tutorial: &lt;a href="http://code.google.com/p/gowinda/wiki/Tutorial" rel="nofollow"&gt;http://code.google.com/p/gowinda/wiki/Tutorial&lt;/a&gt;) &lt;/p&gt;
&lt;p&gt;The following is an example of a gene set file for Gene Ontology: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000002&lt;/span&gt;      &lt;span class="n"&gt;mitochondrial&lt;/span&gt; &lt;span class="n"&gt;genome&lt;/span&gt; &lt;span class="n"&gt;maintenance&lt;/span&gt;        &lt;span class="n"&gt;CG11077&lt;/span&gt; &lt;span class="n"&gt;CG33650&lt;/span&gt; &lt;span class="n"&gt;CG4337&lt;/span&gt; &lt;span class="n"&gt;CG5924&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000003&lt;/span&gt;      &lt;span class="n"&gt;reproduction&lt;/span&gt;    &lt;span class="n"&gt;CG10112&lt;/span&gt; &lt;span class="n"&gt;CG10128&lt;/span&gt; &lt;span class="n"&gt;CG1262&lt;/span&gt; &lt;span class="n"&gt;CG13873&lt;/span&gt; &lt;span class="n"&gt;CG14034&lt;/span&gt; &lt;span class="n"&gt;CG15117&lt;/span&gt; &lt;span class="n"&gt;CG15616&lt;/span&gt; &lt;span class="n"&gt;CG1656&lt;/span&gt; &lt;span class="n"&gt;CG17011&lt;/span&gt; &lt;span class="n"&gt;CG17097&lt;/span&gt; &lt;span class="n"&gt;CG17673&lt;/span&gt; &lt;span class="n"&gt;CG17799&lt;/span&gt; &lt;span class="n"&gt;CG17843&lt;/span&gt; &lt;span class="n"&gt;CG1803&lt;/span&gt; &lt;span class="n"&gt;CG2665&lt;/span&gt; &lt;span class="n"&gt;CG2668&lt;/span&gt; &lt;span class="n"&gt;CG2852&lt;/span&gt; &lt;span class="n"&gt;CG30450&lt;/span&gt; &lt;span class="n"&gt;CG30473&lt;/span&gt; &lt;span class="n"&gt;CG31680&lt;/span&gt; &lt;span class="n"&gt;CG31704&lt;/span&gt; &lt;span class="n"&gt;CG31872&lt;/span&gt; &lt;span class="n"&gt;CG31883&lt;/span&gt; &lt;span class="n"&gt;CG31941&lt;/span&gt; &lt;span class="n"&gt;CG32203&lt;/span&gt; &lt;span class="n"&gt;CG32498&lt;/span&gt; &lt;span class="n"&gt;CG32667&lt;/span&gt; &lt;span class="n"&gt;CG33943&lt;/span&gt; &lt;span class="n"&gt;CG34033&lt;/span&gt; &lt;span class="n"&gt;CG34034&lt;/span&gt; &lt;span class="n"&gt;CG34102&lt;/span&gt; &lt;span class="n"&gt;CG3662&lt;/span&gt; &lt;span class="n"&gt;CG3801&lt;/span&gt; &lt;span class="n"&gt;CG42461&lt;/span&gt; &lt;span class="n"&gt;CG42462&lt;/span&gt; &lt;span class="n"&gt;CG42466&lt;/span&gt; &lt;span class="n"&gt;CG42468&lt;/span&gt; &lt;span class="n"&gt;CG42469&lt;/span&gt; &lt;span class="n"&gt;CG42472&lt;/span&gt; &lt;span class="n"&gt;CG42474&lt;/span&gt; &lt;span class="n"&gt;CG42475&lt;/span&gt; &lt;span class="n"&gt;CG42477&lt;/span&gt; &lt;span class="n"&gt;CG42478&lt;/span&gt; &lt;span class="n"&gt;CG42479&lt;/span&gt; &lt;span class="n"&gt;CG42480&lt;/span&gt; &lt;span class="n"&gt;CG42482&lt;/span&gt; &lt;span class="n"&gt;CG42483&lt;/span&gt; &lt;span class="n"&gt;CG42485&lt;/span&gt; &lt;span class="n"&gt;CG42564&lt;/span&gt; &lt;span class="n"&gt;CG42602&lt;/span&gt; &lt;span class="n"&gt;CG42603&lt;/span&gt; &lt;span class="n"&gt;CG42604&lt;/span&gt; &lt;span class="n"&gt;CG42605&lt;/span&gt; &lt;span class="n"&gt;CG42606&lt;/span&gt; &lt;span class="n"&gt;CG42607&lt;/span&gt; &lt;span class="n"&gt;CG42608&lt;/span&gt; &lt;span class="n"&gt;CG42609&lt;/span&gt; &lt;span class="n"&gt;CG4546&lt;/span&gt; &lt;span class="n"&gt;CG4706&lt;/span&gt; &lt;span class="n"&gt;CG4847&lt;/span&gt; &lt;span class="n"&gt;CG4986&lt;/span&gt; &lt;span class="n"&gt;CG6555&lt;/span&gt; &lt;span class="n"&gt;CG6690&lt;/span&gt; &lt;span class="n"&gt;CG6917&lt;/span&gt; &lt;span class="n"&gt;CG7157&lt;/span&gt; &lt;span class="n"&gt;CG8137&lt;/span&gt; &lt;span class="n"&gt;CG8462&lt;/span&gt; &lt;span class="n"&gt;CG8622&lt;/span&gt; &lt;span class="n"&gt;CG8626&lt;/span&gt; &lt;span class="n"&gt;CG8982&lt;/span&gt; &lt;span class="n"&gt;CG9024&lt;/span&gt; &lt;span class="n"&gt;CG9029&lt;/span&gt; &lt;span class="n"&gt;CG9074&lt;/span&gt; &lt;span class="n"&gt;CG9111&lt;/span&gt; &lt;span class="n"&gt;CG9334&lt;/span&gt; &lt;span class="n"&gt;CG9997&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000009&lt;/span&gt;      &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;mannosyltransferase&lt;/span&gt; &lt;span class="n"&gt;activity&lt;/span&gt;  &lt;span class="n"&gt;CG8412&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000010&lt;/span&gt;      &lt;span class="n"&gt;trans&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;hexaprenyltranstransferase&lt;/span&gt; &lt;span class="n"&gt;activity&lt;/span&gt;       &lt;span class="n"&gt;CG10585&lt;/span&gt; &lt;span class="n"&gt;CG31005&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000012&lt;/span&gt;      &lt;span class="n"&gt;single&lt;/span&gt; &lt;span class="n"&gt;strand&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt; &lt;span class="n"&gt;repair&lt;/span&gt;      &lt;span class="n"&gt;CG4208&lt;/span&gt; &lt;span class="n"&gt;CG5316&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000014&lt;/span&gt;      &lt;span class="n"&gt;single&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;stranded&lt;/span&gt; &lt;span class="n"&gt;DNA&lt;/span&gt; &lt;span class="n"&gt;specific&lt;/span&gt; &lt;span class="n"&gt;endodeoxyribonuclease&lt;/span&gt; &lt;span class="n"&gt;activity&lt;/span&gt;     &lt;span class="n"&gt;CG10215&lt;/span&gt; &lt;span class="n"&gt;CG10670&lt;/span&gt; &lt;span class="n"&gt;CG10890&lt;/span&gt; &lt;span class="n"&gt;CG2990&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000015&lt;/span&gt;      &lt;span class="n"&gt;phosphopyruvate&lt;/span&gt; &lt;span class="n"&gt;hydratase&lt;/span&gt; &lt;span class="n"&gt;complex&lt;/span&gt;       &lt;span class="n"&gt;CG17654&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000017&lt;/span&gt;      &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;glucoside&lt;/span&gt; &lt;span class="n"&gt;transport&lt;/span&gt;       &lt;span class="n"&gt;CG30035&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The file consists of three tab-delimited columns: &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;column 1: the GO category &lt;/li&gt;
&lt;li&gt;column 2: the description of the GO category. Note that spaces are allowed but no tabs &lt;/li&gt;
&lt;li&gt;column 3: a space separated list of gene ids for the given GO category &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; the &lt;code&gt;gene_id&lt;/code&gt;s of the annotation need to be identical to the &lt;code&gt;gene_id&lt;/code&gt;s of the GO association file. Case is ignored, as Gowinda internally converts all &lt;code&gt;gene_id&lt;/code&gt;s to lower-case. However, a problem may arise because many &lt;code&gt;gene_id&lt;/code&gt;s have synonyms. One strategy to deal with this problem is to use the GO association file from GoMiner (see tutorial &lt;a href="http://code.google.com/p/gowinda/wiki/Tutorial" rel="nofollow"&gt;http://code.google.com/p/gowinda/wiki/Tutorial&lt;/a&gt;). &lt;/p&gt;
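&lt;p&gt;As a rough sketch of the three-column format and of the case-insensitive ID matching described in the note, the following Python snippet reads a gene set file and lists the gene IDs that have no counterpart in the annotation. The function names &lt;code&gt;read_gene_sets&lt;/code&gt; and &lt;code&gt;ids_missing_from_annotation&lt;/code&gt; are hypothetical and not part of Gowinda; raising an error on malformed lines is likewise an assumption of this example: &lt;/p&gt;

```python
def read_gene_sets(lines):
    """Map each gene set ID to (description, set of lower-cased gene IDs)."""
    gene_sets = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.strip() == "":
            continue
        cols = line.split("\t")
        if len(cols) != 3:
            # assumption: treat anything but three tab-delimited columns as an error
            raise ValueError("expected 3 tab-delimited columns: " + line)
        set_id, description, genes = cols
        # column 3 is a space-separated list; IDs are compared in lower-case
        gene_sets[set_id] = (description, {g.lower() for g in genes.split()})
    return gene_sets


def ids_missing_from_annotation(gene_sets, annotation_gene_ids):
    """Return gene IDs used in the gene sets but absent from the annotation."""
    annotated = {g.lower() for g in annotation_gene_ids}
    used = set()
    for _description, genes in gene_sets.values():
        used.update(genes)
    return used - annotated
```

&lt;p&gt;A non-empty result from the second function hints at synonym problems between the two files; such IDs have to be reconciled by other means, for example with the GoMiner association file mentioned above. &lt;/p&gt;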
&lt;h2 id="total-snp-file"&gt;Total SNP file&lt;/h2&gt;
&lt;p&gt;This file must contain all the SNPs used for the GWAS, in a simple tab-delimited file format. The simplest form looks as follows: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="mi"&gt;2L&lt;/span&gt;  &lt;span class="mi"&gt;117081&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;  &lt;span class="mi"&gt;117082&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;  &lt;span class="mi"&gt;144234&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;  &lt;span class="mi"&gt;252591&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;  &lt;span class="mi"&gt;283388&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;  &lt;span class="mi"&gt;318365&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;  &lt;span class="mi"&gt;320282&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;  &lt;span class="mi"&gt;378118&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;  &lt;span class="mi"&gt;378119&lt;/span&gt;
&lt;span class="mi"&gt;2L&lt;/span&gt;  &lt;span class="mi"&gt;476447&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;column 1: the chromosome &lt;/li&gt;
&lt;li&gt;column 2: the position &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Gowinda ignores all additional columns after column 2; it is thus also possible to provide, for example, a '.mpileup' file (&lt;a href="http://samtools.sourceforge.net"&gt;http://samtools.sourceforge.net/&lt;/a&gt;), as shown in the following example: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;R&lt;/span&gt;  &lt;span class="mi"&gt;2299&lt;/span&gt;    &lt;span class="n"&gt;N&lt;/span&gt;   &lt;span class="mi"&gt;4&lt;/span&gt;   &lt;span class="n"&gt;TTT&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="n"&gt;FT&lt;/span&gt;  &lt;span class="n"&gt;AAAA&lt;/span&gt;    &lt;span class="mi"&gt;4&lt;/span&gt;   &lt;span class="n"&gt;TTT&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="n"&gt;FT&lt;/span&gt;  &lt;span class="n"&gt;AAAA&lt;/span&gt;
&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;R&lt;/span&gt;  &lt;span class="mi"&gt;2300&lt;/span&gt;    &lt;span class="n"&gt;N&lt;/span&gt;   &lt;span class="mi"&gt;5&lt;/span&gt;   &lt;span class="n"&gt;AAAA&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="n"&gt;FA&lt;/span&gt; &lt;span class="n"&gt;AAAAA&lt;/span&gt;   &lt;span class="mi"&gt;5&lt;/span&gt;   &lt;span class="n"&gt;AAAA&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="n"&gt;FA&lt;/span&gt; &lt;span class="n"&gt;AAAAA&lt;/span&gt;
&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;R&lt;/span&gt;  &lt;span class="mi"&gt;2301&lt;/span&gt;    &lt;span class="n"&gt;N&lt;/span&gt;   &lt;span class="mi"&gt;6&lt;/span&gt;   &lt;span class="n"&gt;TTTTT&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="n"&gt;FT&lt;/span&gt;    &lt;span class="n"&gt;AAAAAA&lt;/span&gt;  &lt;span class="mi"&gt;6&lt;/span&gt;   &lt;span class="n"&gt;TTTTT&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="n"&gt;FT&lt;/span&gt;    &lt;span class="n"&gt;AAAAAA&lt;/span&gt;
&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;R&lt;/span&gt;  &lt;span class="mi"&gt;2302&lt;/span&gt;    &lt;span class="n"&gt;N&lt;/span&gt;   &lt;span class="mi"&gt;7&lt;/span&gt;   &lt;span class="n"&gt;TTTTTT&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="n"&gt;FT&lt;/span&gt;   &lt;span class="n"&gt;AAAAAAA&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;   &lt;span class="n"&gt;TTTTTT&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="n"&gt;FT&lt;/span&gt;   &lt;span class="n"&gt;AAAAAAA&lt;/span&gt;
&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;R&lt;/span&gt;  &lt;span class="mi"&gt;2303&lt;/span&gt;    &lt;span class="n"&gt;N&lt;/span&gt;   &lt;span class="mi"&gt;8&lt;/span&gt;   &lt;span class="n"&gt;TTTTTTT&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="n"&gt;FT&lt;/span&gt;  &lt;span class="n"&gt;AAAAAAAA&lt;/span&gt;    &lt;span class="mi"&gt;8&lt;/span&gt;   &lt;span class="n"&gt;TTTTTTT&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="n"&gt;FT&lt;/span&gt;  &lt;span class="n"&gt;AAAAAAAA&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Gowinda also ignores entries starting with a '#'; thus a '.vcf' file (&lt;a href="http://vcftools.sourceforge.net/specs.html"&gt;http://vcftools.sourceforge.net/specs.html&lt;/a&gt;) may also be provided, as shown in the following example: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="cp"&gt;##fileformat=VCFv4.0&lt;/span&gt;
&lt;span class="cp"&gt;##fileDate=20090805&lt;/span&gt;
&lt;span class="cp"&gt;##...&lt;/span&gt;
&lt;span class="cp"&gt;##...&lt;/span&gt;
&lt;span class="cp"&gt;#CHROM POS     ID        REF ALT    QUAL FILTER INFO                              FORMAT      NA00001        NA00002        NA00003&lt;/span&gt;
&lt;span class="mi"&gt;20&lt;/span&gt;     &lt;span class="mi"&gt;14370&lt;/span&gt;   &lt;span class="n"&gt;rs6054257&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;      &lt;span class="n"&gt;A&lt;/span&gt;       &lt;span class="mi"&gt;29&lt;/span&gt;   &lt;span class="n"&gt;PASS&lt;/span&gt;   &lt;span class="n"&gt;NS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="n"&gt;DP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="n"&gt;AF&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="n"&gt;DB&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="n"&gt;H2&lt;/span&gt;           &lt;span class="n"&gt;GT&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;GQ&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;DP&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;HQ&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;48&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;51&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;51&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;48&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;51&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;51&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span 
class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;43&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.,.&lt;/span&gt;
&lt;span class="mi"&gt;20&lt;/span&gt;     &lt;span class="mi"&gt;17330&lt;/span&gt;   &lt;span class="p"&gt;.&lt;/span&gt;         &lt;span class="n"&gt;T&lt;/span&gt;      &lt;span class="n"&gt;A&lt;/span&gt;       &lt;span class="mi"&gt;3&lt;/span&gt;    &lt;span class="n"&gt;q10&lt;/span&gt;    &lt;span class="n"&gt;NS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="n"&gt;DP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="n"&gt;AF&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.017&lt;/span&gt;               &lt;span class="n"&gt;GT&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;GQ&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;DP&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;HQ&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;49&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;58&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;65&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;   &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;41&lt;/span&gt;&lt;span 
class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="mi"&gt;20&lt;/span&gt;     &lt;span class="mi"&gt;1110696&lt;/span&gt; &lt;span class="n"&gt;rs6040355&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;      &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;     &lt;span class="mi"&gt;67&lt;/span&gt;   &lt;span class="n"&gt;PASS&lt;/span&gt;   &lt;span class="n"&gt;NS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="n"&gt;DP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="n"&gt;AF&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.333&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;0.667&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="n"&gt;AA&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="n"&gt;DB&lt;/span&gt; &lt;span class="n"&gt;GT&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;GQ&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;DP&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;HQ&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;23&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;27&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span 
class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;   &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;35&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="candidate-snp-file"&gt;Candidate SNP file&lt;/h2&gt;
&lt;p&gt;The same format applies as for the total SNP file, except that the candidate SNPs must be a subset of the total SNPs. &lt;/p&gt;
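&lt;p&gt;The subset requirement can be checked before running Gowinda. The following Python sketch is an illustration only: the function names &lt;code&gt;read_snps&lt;/code&gt; and &lt;code&gt;candidates_not_in_total&lt;/code&gt; are hypothetical, and rejecting lines with fewer than two columns is an assumption of this example. It reads only the first two tab-delimited columns and skips lines starting with '#', as described for the total SNP file: &lt;/p&gt;

```python
def read_snps(lines):
    """Return the set of (chromosome, position) pairs found in a SNP file."""
    snps = set()
    for line in lines:
        line = line.rstrip("\n")
        if line == "" or line.startswith("#"):
            continue  # header and comment lines (e.g. in a .vcf) are skipped
        cols = line.split("\t")
        if len(cols) == 1:
            # assumption: a line without a position column is an error
            raise ValueError("expected chromosome and position: " + line)
        # only columns 1 and 2 are used; further columns are ignored
        snps.add((cols[0], int(cols[1])))
    return snps


def candidates_not_in_total(total_lines, candidate_lines):
    """Return candidate SNPs that do not occur in the total SNP set."""
    return read_snps(candidate_lines) - read_snps(total_lines)
```

&lt;p&gt;An empty result means that every candidate SNP is contained in the total SNP file. &lt;/p&gt;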
&lt;h1 id="gowinda"&gt;Gowinda&lt;/h1&gt;
&lt;p&gt;For an example of how to use Gowinda with a sample dataset please see the tutorial: &lt;a href="http://code.google.com/p/gowinda/wiki/Tutorial" rel="nofollow"&gt;http://code.google.com/p/gowinda/wiki/Tutorial&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="parameters"&gt;Parameters&lt;/h2&gt;
&lt;p&gt;Gowinda has the following input parameters: &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;--annotation-file&lt;/code&gt;: a file containing the annotation for the species of interest. Only the &lt;code&gt;.gtf&lt;/code&gt; format is accepted (see above). Mandatory parameter &lt;/li&gt;
&lt;li&gt;&lt;code&gt;--gene-set-file&lt;/code&gt;: a file containing for every gene set (e.g.: Gene Ontology term) the associated genes; For the file format see above; Mandatory parameter &lt;/li&gt;
&lt;li&gt;&lt;code&gt;--snp-file&lt;/code&gt;: a file containing the total set of SNPs that were used for the GWAS. For the file format see above; Mandatory parameter &lt;/li&gt;
&lt;li&gt;&lt;code&gt;--candidate-snp-file&lt;/code&gt;: a file containing the candidate SNPs that show some association with the trait of interest. For the file format see above; Mandatory parameter &lt;/li&gt;
&lt;li&gt;&lt;code&gt;--output-file&lt;/code&gt;: where to store the output. Mandatory parameter &lt;/li&gt;
&lt;li&gt;&lt;code&gt;--mode&lt;/code&gt;: As a major feature, Gowinda offers two analysis modes, either &lt;code&gt;snp&lt;/code&gt; or &lt;code&gt;gene&lt;/code&gt;; for a description see below; Optional parameter; default=&lt;code&gt;gene&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--gene-definition&lt;/code&gt;: As another major feature, Gowinda allows adjusting the SNP to gene mapping, i.e. deciding which genes are associated with a given SNP. For example, the user may decide that only SNPs located in an exon are associated with the corresponding gene. See the detailed description below; possible arguments: &lt;code&gt;exon&lt;/code&gt;, &lt;code&gt;cds&lt;/code&gt;, &lt;code&gt;utr&lt;/code&gt;, &lt;code&gt;gene&lt;/code&gt;, &lt;code&gt;upstreamDDDD&lt;/code&gt;, &lt;code&gt;downstreamDDDD&lt;/code&gt;, &lt;code&gt;updownstreamDDDD&lt;/code&gt;; Mandatory parameter &lt;/li&gt;
&lt;li&gt;&lt;code&gt;--simulations&lt;/code&gt;: the number of simulations that should be performed. For more information on the number of simulations see below &lt;/li&gt;
&lt;li&gt;&lt;code&gt;--min-genes&lt;/code&gt;: filter for GO categories having at least &lt;code&gt;--min-genes&lt;/code&gt; number of genes; This parameter is for example useful to remove small GO categories having only one associated gene; Optional parameter; default=1 &lt;/li&gt;
&lt;li&gt;&lt;code&gt;--min-significance&lt;/code&gt;: only report GO categories having after FDR correction &lt;code&gt;p-value &amp;lt;= --min-significance&lt;/code&gt;; Optional parameter; default=1.0 &lt;/li&gt;
&lt;li&gt;&lt;code&gt;--detailed-log&lt;/code&gt;: switch to the detailed log mode. The IDs of genes present in the GO association file but not present in the annotation will be displayed. The progress of the simulations will also be shown in steps of 10,000; Optional parameter &lt;/li&gt;
&lt;li&gt;&lt;code&gt;--threads&lt;/code&gt;: the simulations of Gowinda utilize multi-threading. Adjust the number of threads to use. Optional parameter; default=1 &lt;/li&gt;
&lt;li&gt;&lt;code&gt;--help&lt;/code&gt;: show the help for Gowinda; Optional parameter &lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="snp-to-gene-mapping"&gt;SNP to gene mapping&lt;/h3&gt;
&lt;p&gt;SNP to gene mapping refers to the assignment of genes to SNPs. For example, it is possible to associate a gene with a SNP only when one of its exons overlaps the SNP. As another example, when regulatory regions should be considered, a SNP may be considered associated with a gene when its position is less than 500 bp upstream of the gene. Gowinda only loads the &lt;code&gt;.gtf&lt;/code&gt; features &lt;code&gt;exon&lt;/code&gt; and &lt;code&gt;cds&lt;/code&gt;; all other features are ignored. The following features are thus internally computed from the features &lt;code&gt;exon&lt;/code&gt; and &lt;code&gt;cds&lt;/code&gt;: &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;exon&lt;/code&gt;: SNPs within exons are associated with genes &lt;/li&gt;
&lt;li&gt;&lt;code&gt;cds&lt;/code&gt;: SNPs within CDS are associated with genes &lt;/li&gt;
&lt;li&gt;&lt;code&gt;utr&lt;/code&gt;: SNPs within 5'-UTR and 3'-UTR are associated with genes. Calculated as &lt;code&gt;exon - cds&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;gene&lt;/code&gt;: SNPs within exons or introns are associated with genes. Internally the distance from the start position of the first exon to the end position of the last exon is computed &lt;/li&gt;
&lt;li&gt;&lt;code&gt;upstreamDDDD&lt;/code&gt;: in addition to exons and introns, the DDDD bases upstream of the start of a gene are also considered for mapping a SNP to a gene. DDDD must be replaced with an arbitrary number. This method requires the strand information. &lt;/li&gt;
&lt;li&gt;&lt;code&gt;downstreamDDDD&lt;/code&gt;: in addition to exons and introns, the DDDD bases downstream of the end of a gene are also considered for mapping a SNP to a gene. DDDD must be replaced with an arbitrary number. This method requires the strand information. &lt;/li&gt;
&lt;li&gt;&lt;code&gt;updownstreamDDDD&lt;/code&gt;: in addition to exons and introns, the DDDD bases upstream of the start of a gene and the DDDD bases downstream of the end of a gene are also considered for mapping a SNP to a gene. DDDD must be replaced with an arbitrary number. This method does NOT require strand information. &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; We found examples (in the annotation of &lt;em&gt;D. melanogaster&lt;/em&gt;) where different exons from the same gene are located on opposite strands. In this case Gowinda uses a majority vote, i.e. the strand supported by the most exons. &lt;/p&gt;
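&lt;p&gt;The interval logic described above can be sketched in Python as follows. This is only an illustration and not Gowinda's actual implementation: the function names &lt;code&gt;gene_interval&lt;/code&gt; and &lt;code&gt;snp_maps_to_gene&lt;/code&gt;, the 1-based inclusive coordinates, the clamping at position 1 and the handling of ties in the majority vote are all assumptions made for the example. &lt;/p&gt;

```python
from collections import Counter


def gene_interval(exons, definition, flank=0):
    """Return the (start, end) interval used to map SNPs to one gene.

    exons: list of (start, end, strand) tuples, 1-based inclusive (assumed).
    definition: 'gene', 'upstream', 'downstream' or 'updownstream'.
    flank: the DDDD value, i.e. the number of flanking bases to add.
    """
    # 'gene': start of the first exon to the end of the last exon
    start = min(e[0] for e in exons)
    end = max(e[1] for e in exons)
    # majority vote on the strand, as described in the note above
    strand = Counter(e[2] for e in exons).most_common(1)[0][0]

    if definition == "gene":
        return start, end
    if definition == "updownstream":
        # both flanks are extended, so the strand is irrelevant
        return max(1, start - flank), end + flank
    if definition not in ("upstream", "downstream"):
        raise ValueError("unsupported gene definition: " + definition)
    # upstream of a '+' gene and downstream of a '-' gene both extend
    # the interval towards smaller coordinates
    extend_left = (definition == "upstream") == (strand == "+")
    if extend_left:
        return max(1, start - flank), end  # clamping at 1 is an assumption
    return start, end + flank


def snp_maps_to_gene(position, interval):
    """True if the SNP position lies inside the inclusive interval."""
    return position in range(interval[0], interval[1] + 1)


# toy gene with one exon annotated on the opposite strand
exons = [(1000, 1200, "+"), (1500, 1800, "+"), (2100, 2300, "-")]
print(gene_interval(exons, "gene"))             # (1000, 2300)
print(gene_interval(exons, "upstream", 500))    # (500, 2300)
print(gene_interval(exons, "downstream", 500))  # (1000, 2800)
```

&lt;p&gt;In this toy example two of the three exons are on the plus strand, so the gene is treated as a plus-strand gene and the upstream region is placed before position 1000. &lt;/p&gt;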
&lt;h3 id="simulations"&gt;Simulations&lt;/h3&gt;
&lt;p&gt;The number of simulations directly determines the minimum achievable p-value. For example, with 1,000,000 simulations the minimum achievable p-value is 1.0e-6. Increase the number of simulations if you need higher accuracy. &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: increasing the number of simulations will not result in a higher number of significant GO categories! It will only increase the accuracy. &lt;/p&gt;
&lt;p&gt;We recommend starting with 100,000 simulations, which should take about 4 minutes (with 8 CPUs). When higher accuracy is required, the number of simulations may be increased to 1 million (~30 minutes) or 10 million (~5h). &lt;/p&gt;
&lt;h2 id="algorithm"&gt;Algorithm&lt;/h2&gt;
&lt;p&gt;Gowinda does not reproduce the exact pattern of linkage disequilibrium (LD) between SNPs but offers two complementary test strategies that make two extreme assumptions about LD. The first strategy (&lt;code&gt;--mode snp&lt;/code&gt;) assumes linkage equilibrium between SNPs and the second strategy (&lt;code&gt;--mode gene&lt;/code&gt;) assumes complete linkage disequilibrium between SNPs within a gene, thus basically treating every gene as a large haplotype. &lt;/p&gt;
&lt;h3 id="analysis-mode-snp"&gt;Analysis mode SNP&lt;/h3&gt;
&lt;p&gt;This strategy is based on the extreme assumption that all SNPs are independent, i.e. in linkage equilibrium. In this mode of operation, Gowinda randomly samples the same number of SNPs as there are candidate SNPs. Subsequently the genes corresponding to the sampled SNPs are identified. Finally the significance of enrichment is estimated from the empirical null distribution and the FDR is calculated (see below). The number of SNPs is thus constant (=number of candidate SNPs) in each simulation. However, the number of genes associated with the randomly sampled SNPs may vary between the simulations. Furthermore, when using this strategy, genes will be counted multiple times according to the number of candidate/random SNPs. This is based on the rationale that, when SNPs are in linkage equilibrium, every candidate SNP constitutes an independent observation. However, any remaining linkage between variants will bias this strategy, that is, the significance of enrichment will be overestimated. &lt;/p&gt;
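&lt;p&gt;A single simulation in &lt;code&gt;--mode snp&lt;/code&gt; might be sketched as below. The code is a simplified Python illustration, not Gowinda's source: the function name &lt;code&gt;simulate_snp_mode&lt;/code&gt;, the two lookup dictionaries and the choice to sample without replacement are assumptions. &lt;/p&gt;

```python
import random
from collections import Counter


def simulate_snp_mode(total_snps, n_candidates, snp_to_genes, gene_to_sets, rng):
    """One simulation in SNP mode: draw as many random SNPs as there are
    candidate SNPs and count, per gene set, the genes they hit.
    A gene hit by several sampled SNPs is counted several times."""
    counts = Counter()
    # sampling without replacement is an assumption of this sketch
    for snp in rng.sample(total_snps, n_candidates):
        for gene in snp_to_genes.get(snp, ()):
            for gene_set in gene_to_sets.get(gene, ()):
                counts[gene_set] += 1
    return counts


# toy data (hypothetical identifiers); SNP s4 is intergenic
snp_to_genes = {"s1": ["geneA"], "s2": ["geneA"], "s3": ["geneB"], "s4": []}
gene_to_sets = {"geneA": ["set1"], "geneB": ["set1", "set2"]}
total_snps = sorted(snp_to_genes)

rng = random.Random(42)  # fixed seed so the sketch is reproducible
null_counts = [
    simulate_snp_mode(total_snps, 2, snp_to_genes, gene_to_sets, rng)
    for _ in range(1000)
]
print(len(null_counts))  # 1000
```

&lt;p&gt;Each entry of &lt;code&gt;null_counts&lt;/code&gt; holds the per-gene-set counts of one simulation; together they form the empirical null distribution. &lt;/p&gt;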
&lt;h3 id="analysis-mode-gene"&gt;Analysis mode gene&lt;/h3&gt;
&lt;p&gt;This strategy is based on the assumption that all SNPs within a gene are completely linked. In this mode of operation, Gowinda first computes the number of genes that correspond to the candidate SNPs (=candidate genes). Subsequently, Gowinda randomly samples SNPs until the corresponding number of genes equals the number of candidate genes. Finally the significance of overrepresentation is estimated from the empirical null distribution and the FDR is calculated (see below). In this strategy the number of randomly sampled genes is constant in each simulation whereas the number of randomly sampled SNPs may vary. This strategy basically treats genes as large haplotype blocks. However, if SNPs are not in complete linkage, this analysis strategy may underestimate the significance of enrichment. &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: This mode assumes complete linkage of SNPs within genes and does not account for LD between genes. &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: It is however possible to extend the size of the haplotype block of genes by using the option &lt;code&gt;--gene-definition updownstreamDDDD&lt;/code&gt; (see above). &lt;/p&gt;
&lt;p&gt;We recommend this mode as the default. &lt;/p&gt;
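&lt;p&gt;The sampling rule of &lt;code&gt;--mode gene&lt;/code&gt; can be illustrated with the following Python sketch. It is not taken from Gowinda: the function names, the dictionaries and the way a SNP overlapping several genes is handled (genes are added one by one until the target is reached) are assumptions. &lt;/p&gt;

```python
import random
from collections import Counter


def simulate_gene_mode(total_snps, n_candidate_genes, snp_to_genes, rng):
    """One simulation in gene mode: draw random SNPs until they cover as
    many distinct genes as the candidate SNPs did. Every gene counts once."""
    genes = set()
    if n_candidate_genes == 0:
        return genes
    # visit all SNPs in random order and stop once enough genes are found
    for snp in rng.sample(total_snps, len(total_snps)):
        for gene in snp_to_genes.get(snp, ()):
            genes.add(gene)
            if len(genes) == n_candidate_genes:
                return genes
    raise ValueError("fewer genes available than candidate genes")


def count_per_gene_set(genes, gene_to_sets):
    """Count the sampled genes in every gene set."""
    counts = Counter()
    for gene in genes:
        counts.update(gene_to_sets.get(gene, ()))
    return counts


# toy data (hypothetical identifiers)
snp_to_genes = {"s1": ["geneA"], "s2": ["geneA"], "s3": ["geneB"], "s4": ["geneC"]}
gene_to_sets = {"geneA": ["set1"], "geneB": ["set1", "set2"], "geneC": []}

rng = random.Random(7)  # fixed seed so the sketch is reproducible
genes = simulate_gene_mode(sorted(snp_to_genes), 2, snp_to_genes, rng)
print(len(genes))  # 2
print(count_per_gene_set(genes, gene_to_sets))
```

&lt;p&gt;The number of sampled genes is fixed at the number of candidate genes, while the number of SNPs needed to reach it differs between simulations. &lt;/p&gt;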
&lt;h3 id="calculate-the-p-value-of-gene-set-enrichment"&gt;Calculate the p-value of gene set enrichment&lt;/h3&gt;
&lt;p&gt;After the simulations have been completed, Gowinda derives an empirical null distribution of gene abundance for every gene set. An example of such an empirical null distribution for a single gene set and 1 million simulations is shown below: &lt;/p&gt;
&lt;p&gt;&lt;img alt="" src="http://gowinda.googlecode.com/files/gowinda_edist.png" rel="nofollow" /&gt;&lt;/p&gt;
&lt;p&gt;Let the number of simulations be &lt;code&gt;S&lt;/code&gt;, the gene set for which the p-value should be calculated be &lt;code&gt;g&lt;/code&gt;, the number of genes found for the given gene set within a single simulation be &lt;code&gt;cgs&lt;/code&gt; and the number of candidate genes found for the gene set be &lt;code&gt;cgcand&lt;/code&gt;. Then the p-value of enrichment (&lt;code&gt;pg&lt;/code&gt;) for the given gene set can be calculated according to the following equation: &lt;/p&gt;
&lt;p&gt;&lt;img alt="" src="http://gowinda.googlecode.com/files/equation_pvalue.png" rel="nofollow" /&gt;&lt;/p&gt;
&lt;p&gt;In the example shown above, &lt;code&gt;cgcand&lt;/code&gt; would be 18 genes, &lt;code&gt;S&lt;/code&gt; would be 1 million simulations, and the number of simulations with &lt;code&gt;cgs&lt;/code&gt; greater than or equal to 18 (&lt;code&gt;cgcand&lt;/code&gt;) would be about 47,000. Thus the p-value &lt;code&gt;pg&lt;/code&gt; for &lt;code&gt;intracellular protein transport&lt;/code&gt; would be 47,000/1,000,000 = 0.047 &lt;/p&gt;
&lt;p&gt;This p-value is calculated for every gene set. &lt;/p&gt;
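&lt;p&gt;The calculation described above can be written as a short Python sketch. The function name &lt;code&gt;empirical_p_value&lt;/code&gt; and the toy counts are hypothetical; the sketch simply returns the fraction of simulations that reach the candidate count and does not model the minimum achievable p-value of 1/S mentioned in the section on simulations. &lt;/p&gt;

```python
from bisect import bisect_left


def empirical_p_value(simulated_counts, candidate_count):
    """Fraction of simulations in which the gene set contained at least
    as many genes as were found with the candidate SNPs."""
    ordered = sorted(simulated_counts)
    # bisect_left returns how many simulated counts are strictly smaller
    # than candidate_count; the remaining ones are equal or larger
    at_least = len(ordered) - bisect_left(ordered, candidate_count)
    return at_least / len(ordered)


# toy null distribution from S = 10 simulations (made-up numbers)
cgs = [12, 15, 18, 9, 20, 14, 17, 11, 13, 16]
print(empirical_p_value(cgs, 18))  # 0.2
```

&lt;p&gt;Here two of the ten simulated counts (18 and 20) reach the candidate count of 18, giving 2/10 = 0.2. &lt;/p&gt;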
&lt;h3 id="fdr-correction"&gt;FDR correction&lt;/h3&gt;
&lt;p&gt;We use the empirical FDR correction described in 'Elements of Statistical Learning' (2009), Trevor Hastie, Robert Tibshirani and Jerome Friedman, 2nd edition, pp. 687-690 (&lt;a href="http://www-stat.stanford.edu/~tibs/ElemStatLearn" rel="nofollow"&gt;http://www-stat.stanford.edu/~tibs/ElemStatLearn/&lt;/a&gt;) &lt;/p&gt;
&lt;p&gt;In detail, Gowinda implements the following algorithm for calculating the FDR: Let &lt;code&gt;G&lt;/code&gt; be the number of gene sets and &lt;code&gt;S&lt;/code&gt; the number of simulations. Then an observed (Robs) and an expected (Rexp) count of gene sets having p-values smaller than or equal to a threshold (&lt;code&gt;P&lt;/code&gt; = the p-value which should be FDR corrected) can be computed. The FDR is subsequently calculated by dividing Rexp by Robs. &lt;/p&gt;
&lt;p&gt;&lt;img alt="" src="http://gowinda.googlecode.com/files/fdr_equations.png" rel="nofollow" /&gt;&lt;/p&gt;
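&lt;p&gt;A possible reading of this procedure is sketched below in Python. It follows the verbal description (Rexp divided by Robs) and assumes that Rexp is the count of gene sets passing the threshold averaged over the &lt;code&gt;S&lt;/code&gt; simulations; the function name &lt;code&gt;empirical_fdr&lt;/code&gt;, the input layout and the toy p-values are hypothetical, and the exact equations used by Gowinda may differ in detail. &lt;/p&gt;

```python
from bisect import bisect_right


def empirical_fdr(threshold, observed_p, simulated_p):
    """Empirical FDR for one p-value threshold P.

    observed_p: p-values of the G gene sets for the candidate SNPs.
    simulated_p: one list of G p-values per simulation (S lists).
    The threshold is expected to be one of the observed p-values,
    so that Robs is at least 1.
    """
    # Robs: gene sets with an observed p-value not larger than P
    r_obs = bisect_right(sorted(observed_p), threshold)
    # Rexp: the same count, averaged over the S simulations (assumption)
    r_exp = sum(
        bisect_right(sorted(p_values), threshold) for p_values in simulated_p
    ) / len(simulated_p)
    return r_exp / r_obs


# toy example with G = 4 gene sets and S = 4 simulations (made-up numbers)
observed_p = [0.001, 0.02, 0.3, 0.8]
simulated_p = [
    [0.5, 0.01, 0.9, 0.7],
    [0.2, 0.6, 0.4, 0.95],
    [0.015, 0.3, 0.02, 0.6],
    [0.8, 0.1, 0.5, 0.3],
]
print(empirical_fdr(0.02, observed_p, simulated_p))  # 0.375
```

&lt;p&gt;In the toy data two observed p-values pass the threshold of 0.02 (Robs = 2) and three simulated p-values pass it across four simulations (Rexp = 0.75), so the FDR is 0.75/2 = 0.375. &lt;/p&gt;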
&lt;h2 id="output"&gt;Output&lt;/h2&gt;
&lt;p&gt;An example output follows: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0045155&lt;/span&gt;      &lt;span class="mf"&gt;0.050&lt;/span&gt;   &lt;span class="mi"&gt;2&lt;/span&gt;       &lt;span class="mf"&gt;0.0004800000&lt;/span&gt;    &lt;span class="mf"&gt;0.1648850000&lt;/span&gt;    &lt;span class="mi"&gt;2&lt;/span&gt;       &lt;span class="mi"&gt;2&lt;/span&gt;       &lt;span class="mi"&gt;2&lt;/span&gt;       &lt;span class="n"&gt;electron&lt;/span&gt; &lt;span class="n"&gt;transporter&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transferring&lt;/span&gt; &lt;span class="n"&gt;electrons&lt;/span&gt; &lt;span class="n"&gt;from&lt;/span&gt; &lt;span class="n"&gt;CoQH2cytochrome&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="n"&gt;reductase&lt;/span&gt; &lt;span class="n"&gt;complex&lt;/span&gt; &lt;span class="n"&gt;and&lt;/span&gt; &lt;span class="n"&gt;cytochrome&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="n"&gt;oxidase&lt;/span&gt; &lt;span class="n"&gt;complex&lt;/span&gt; &lt;span class="n"&gt;activity&lt;/span&gt;        &lt;span class="n"&gt;cg13263&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg17903&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0006119&lt;/span&gt;      &lt;span class="mf"&gt;0.050&lt;/span&gt;   &lt;span class="mi"&gt;2&lt;/span&gt;       &lt;span class="mf"&gt;0.0004800000&lt;/span&gt;    &lt;span class="mf"&gt;0.1648850000&lt;/span&gt;    &lt;span class="mi"&gt;2&lt;/span&gt;       &lt;span class="mi"&gt;2&lt;/span&gt;       &lt;span class="mi"&gt;8&lt;/span&gt;       &lt;span class="n"&gt;oxidative&lt;/span&gt; &lt;span class="n"&gt;phosphorylation&lt;/span&gt;       &lt;span class="n"&gt;cg13263&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg17903&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0009112&lt;/span&gt;      &lt;span class="mf"&gt;0.066&lt;/span&gt;   &lt;span class="mi"&gt;2&lt;/span&gt;       &lt;span class="mf"&gt;0.0009500000&lt;/span&gt;    &lt;span class="mf"&gt;0.2321100000&lt;/span&gt;    &lt;span class="mi"&gt;2&lt;/span&gt;       &lt;span class="mi"&gt;2&lt;/span&gt;       &lt;span class="mi"&gt;16&lt;/span&gt;      &lt;span class="n"&gt;nucleobase&lt;/span&gt; &lt;span class="n"&gt;metabolic&lt;/span&gt; &lt;span class="n"&gt;process&lt;/span&gt;    &lt;span class="n"&gt;cg7811&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg7171&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0006725&lt;/span&gt;      &lt;span class="mf"&gt;0.308&lt;/span&gt;   &lt;span class="mi"&gt;3&lt;/span&gt;       &lt;span class="mf"&gt;0.0027700000&lt;/span&gt;    &lt;span class="mf"&gt;0.3813620000&lt;/span&gt;    &lt;span class="mi"&gt;3&lt;/span&gt;       &lt;span class="mi"&gt;11&lt;/span&gt;      &lt;span class="mi"&gt;73&lt;/span&gt;      &lt;span class="n"&gt;cellular&lt;/span&gt; &lt;span class="n"&gt;aromatic&lt;/span&gt; &lt;span class="n"&gt;compound&lt;/span&gt; &lt;span class="n"&gt;metabolic&lt;/span&gt; &lt;span class="n"&gt;process&lt;/span&gt;    &lt;span class="n"&gt;cg7811&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg7171&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10501&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0005759&lt;/span&gt;      &lt;span class="mf"&gt;0.348&lt;/span&gt;   &lt;span class="mi"&gt;3&lt;/span&gt;       &lt;span class="mf"&gt;0.0031100000&lt;/span&gt;    &lt;span class="mf"&gt;0.3813620000&lt;/span&gt;    &lt;span class="mi"&gt;3&lt;/span&gt;       &lt;span class="mi"&gt;9&lt;/span&gt;       &lt;span class="mi"&gt;55&lt;/span&gt;      &lt;span class="n"&gt;mitochondrial&lt;/span&gt; &lt;span class="n"&gt;matrix&lt;/span&gt;    &lt;span class="n"&gt;cg13263&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg7235&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg17903&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;column 1: the GO term &lt;/li&gt;
&lt;li&gt;column 2: the average number of genes found per simulation for the given GO category. In &lt;code&gt;--mode gene&lt;/code&gt; every gene is only counted once, whereas in &lt;code&gt;--mode snp&lt;/code&gt; a single gene may be counted several times, depending on the number of SNPs &lt;/li&gt;
&lt;li&gt;column 3: the number of genes found for the given GO category using the candidate SNPs. In &lt;code&gt;--mode gene&lt;/code&gt; every gene is only counted once, whereas in &lt;code&gt;--mode snp&lt;/code&gt; a single gene may be counted several times, depending on the number of SNPs &lt;/li&gt;
&lt;li&gt;column 4: p-value (uncorrected for multiple testing) &lt;/li&gt;
&lt;li&gt;column 5: FDR (p-value after adjustment for multiple testing) &lt;/li&gt;
&lt;li&gt;column 6: the number of unique genes found for the given GO category &lt;/li&gt;
&lt;li&gt;column 7: the number of genes that could at most be found for the given GO category, i.e. genes of the given GO category that have a corresponding entry in the annotation file and contain at least one SNP &lt;/li&gt;
&lt;li&gt;column 8: total number of genes for the given GO category in the GO association file &lt;/li&gt;
&lt;li&gt;column 9: description of the given GO term &lt;/li&gt;
&lt;li&gt;column 10: comma separated list of the gene_ids found for the given GO category &lt;/li&gt;
&lt;/ul&gt;
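&lt;p&gt;For downstream processing, an output line can be split into the ten columns listed above. The Python sketch below assumes that the columns are separated by tabs; the field names of &lt;code&gt;GowindaRow&lt;/code&gt; and the function &lt;code&gt;parse_output_line&lt;/code&gt; are hypothetical and not part of Gowinda. &lt;/p&gt;

```python
from collections import namedtuple

# field names are chosen for this sketch and follow the column list above
GowindaRow = namedtuple(
    "GowindaRow",
    [
        "term", "avg_simulated", "candidate_count", "p_value", "fdr",
        "unique_genes", "max_genes", "total_genes", "description", "gene_ids",
    ],
)


def parse_output_line(line):
    """Split one output line (assumed to be tab-separated) into typed fields."""
    f = line.rstrip("\n").split("\t")
    return GowindaRow(
        f[0], float(f[1]), int(f[2]), float(f[3]), float(f[4]),
        int(f[5]), int(f[6]), int(f[7]), f[8], f[9].split(","),
    )


line = (
    "GO:0006119\t0.050\t2\t0.0004800000\t0.1648850000\t2\t2\t8\t"
    "oxidative phosphorylation\tcg13263,cg17903\n"
)
row = parse_output_line(line)
print(row.term, row.fdr, row.gene_ids)
```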
&lt;h1 id="scripts"&gt;Scripts&lt;/h1&gt;
&lt;h2 id="gff2gtfpy"&gt;Gff2Gtf.py&lt;/h2&gt;
&lt;p&gt;The script &lt;code&gt;Gff2Gtf.py&lt;/code&gt; may be used to convert a &lt;code&gt;.gff&lt;/code&gt; file into a &lt;code&gt;.gtf&lt;/code&gt; file. The script scans the &lt;code&gt;.gff&lt;/code&gt; file for the features exon, cds, and mRNA; all other features are ignored. These features require the fields 'ID=' and 'Parent='. The ID of the corresponding gene will be inferred for every exon/cds from the hierarchy exon-&amp;gt;mRNA-&amp;gt;gene. Exons not having a parent mRNA will be ignored. Furthermore, if exons have several parent mRNAs only the first one will be used. &lt;/p&gt;
&lt;p&gt;Following are the parameters: &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;--input: the path to the &lt;code&gt;.gff&lt;/code&gt; file &lt;/li&gt;
&lt;li&gt;--help: Display help for the script &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Redirect the output (&lt;code&gt;.gtf&lt;/code&gt;) to any file of your choice (see tutorial) &lt;/p&gt;
&lt;h2 id="gominer2funcassociatepy"&gt;Gominer2FuncAssociate.py&lt;/h2&gt;
&lt;p&gt;This script converts a &lt;code&gt;.gce&lt;/code&gt; file from GoMiner to a GO association file compatible with Gowinda and FuncAssociate. See the tutorial for how to obtain a &lt;code&gt;.gce&lt;/code&gt; file and how to use this script. &lt;/p&gt;
&lt;p&gt;A small example from a &lt;code&gt;.gce&lt;/code&gt; file follows: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000003&lt;/span&gt;&lt;span class="n"&gt;_reproduction&lt;/span&gt; &lt;span class="n"&gt;FBGN0033033&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mf"&gt;1.000000&lt;/span&gt;    &lt;span class="mf"&gt;0.000000&lt;/span&gt;    &lt;span class="mi"&gt;4633&lt;/span&gt;    &lt;span class="mf"&gt;4633.0&lt;/span&gt;  &lt;span class="mf"&gt;1.000000&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000003&lt;/span&gt;&lt;span class="n"&gt;_reproduction&lt;/span&gt; &lt;span class="n"&gt;FBGN0250842&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mf"&gt;1.000000&lt;/span&gt;    &lt;span class="mf"&gt;0.000000&lt;/span&gt;    &lt;span class="mi"&gt;4633&lt;/span&gt;    &lt;span class="mf"&gt;4633.0&lt;/span&gt;  &lt;span class="mf"&gt;1.000000&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000003&lt;/span&gt;&lt;span class="n"&gt;_reproduction&lt;/span&gt; &lt;span class="n"&gt;FBGN0260934&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mf"&gt;1.000000&lt;/span&gt;    &lt;span class="mf"&gt;0.000000&lt;/span&gt;    &lt;span class="mi"&gt;4633&lt;/span&gt;    &lt;span class="mf"&gt;4633.0&lt;/span&gt;  &lt;span class="mf"&gt;1.000000&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000003&lt;/span&gt;&lt;span class="n"&gt;_reproduction&lt;/span&gt; &lt;span class="n"&gt;FBGN0027538&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mf"&gt;1.000000&lt;/span&gt;    &lt;span class="mf"&gt;0.000000&lt;/span&gt;    &lt;span class="mi"&gt;4633&lt;/span&gt;    &lt;span class="mf"&gt;4633.0&lt;/span&gt;  &lt;span class="mf"&gt;1.000000&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000003&lt;/span&gt;&lt;span class="n"&gt;_reproduction&lt;/span&gt; &lt;span class="n"&gt;FBGN0015584&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mf"&gt;1.000000&lt;/span&gt;    &lt;span class="mf"&gt;0.000000&lt;/span&gt;    &lt;span class="mi"&gt;4633&lt;/span&gt;    &lt;span class="mf"&gt;4633.0&lt;/span&gt;  &lt;span class="mf"&gt;1.000000&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000003&lt;/span&gt;&lt;span class="n"&gt;_reproduction&lt;/span&gt; &lt;span class="n"&gt;FBGN0023172&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mf"&gt;1.000000&lt;/span&gt;    &lt;span class="mf"&gt;0.000000&lt;/span&gt;    &lt;span class="mi"&gt;4633&lt;/span&gt;    &lt;span class="mf"&gt;4633.0&lt;/span&gt;  &lt;span class="mf"&gt;1.000000&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000003&lt;/span&gt;&lt;span class="n"&gt;_reproduction&lt;/span&gt; &lt;span class="n"&gt;FBGN0016053&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mf"&gt;1.000000&lt;/span&gt;    &lt;span class="mf"&gt;0.000000&lt;/span&gt;    &lt;span class="mi"&gt;4633&lt;/span&gt;    &lt;span class="mf"&gt;4633.0&lt;/span&gt;  &lt;span class="mf"&gt;1.000000&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000003&lt;/span&gt;&lt;span class="n"&gt;_reproduction&lt;/span&gt; &lt;span class="n"&gt;FBGN0011236&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mf"&gt;1.000000&lt;/span&gt;    &lt;span class="mf"&gt;0.000000&lt;/span&gt;    &lt;span class="mi"&gt;4633&lt;/span&gt;    &lt;span class="mf"&gt;4633.0&lt;/span&gt;  &lt;span class="mf"&gt;1.000000&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000003&lt;/span&gt;&lt;span class="n"&gt;_reproduction&lt;/span&gt; &lt;span class="n"&gt;FBGN0005655&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mf"&gt;1.000000&lt;/span&gt;    &lt;span class="mf"&gt;0.000000&lt;/span&gt;    &lt;span class="mi"&gt;4633&lt;/span&gt;    &lt;span class="mf"&gt;4633.0&lt;/span&gt;  &lt;span class="mf"&gt;1.000000&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0000003&lt;/span&gt;&lt;span class="n"&gt;_reproduction&lt;/span&gt; &lt;span class="n"&gt;FBGN0034430&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mi"&gt;137&lt;/span&gt; &lt;span class="mf"&gt;1.000000&lt;/span&gt;    &lt;span class="mf"&gt;0.000000&lt;/span&gt;    &lt;span class="mi"&gt;4633&lt;/span&gt;    &lt;span class="mf"&gt;4633.0&lt;/span&gt;  &lt;span class="mf"&gt;1.000000&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;An example of the resulting GO association file follows: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0030381&lt;/span&gt;      &lt;span class="n"&gt;chorion&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;containing_eggshell_pattern_formation&lt;/span&gt;   &lt;span class="n"&gt;FBGN0003731&lt;/span&gt; &lt;span class="n"&gt;FBGN0015754&lt;/span&gt; &lt;span class="n"&gt;FBGN0004400&lt;/span&gt; &lt;span class="n"&gt;FBGN0003733&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0045887&lt;/span&gt;      &lt;span class="n"&gt;positive_regulation_of_synaptic_growth_at_neuromuscular_junction&lt;/span&gt;        &lt;span class="n"&gt;FBGN0003317&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0005795&lt;/span&gt;      &lt;span class="n"&gt;Golgi_stack&lt;/span&gt;     &lt;span class="n"&gt;FBGN0033500&lt;/span&gt; &lt;span class="n"&gt;FBGN0027558&lt;/span&gt; &lt;span class="n"&gt;FBGN0034697&lt;/span&gt; &lt;span class="n"&gt;FBGN0034025&lt;/span&gt; &lt;span class="n"&gt;FBGN0033339&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0034660&lt;/span&gt;      &lt;span class="n"&gt;ncRNA_metabolic_process&lt;/span&gt; &lt;span class="n"&gt;FBGN0034401&lt;/span&gt; &lt;span class="n"&gt;FBGN0028426&lt;/span&gt; &lt;span class="n"&gt;FBGN0034177&lt;/span&gt; &lt;span class="n"&gt;FBGN0003062&lt;/span&gt; &lt;span class="n"&gt;FBGN0034735&lt;/span&gt; &lt;span class="n"&gt;FBGN0026722&lt;/span&gt; &lt;span class="n"&gt;FBGN0033741&lt;/span&gt; &lt;span class="n"&gt;FBGN0002069&lt;/span&gt; &lt;span class="n"&gt;FBGN0035064&lt;/span&gt; &lt;span class="n"&gt;FBGN0262560&lt;/span&gt; &lt;span class="n"&gt;FBGN0033454&lt;/span&gt; 
&lt;span class="n"&gt;FBGN0053505&lt;/span&gt; &lt;span class="n"&gt;FBGN0033375&lt;/span&gt; &lt;span class="n"&gt;FBGN0020766&lt;/span&gt; &lt;span class="n"&gt;FBGN0033900&lt;/span&gt; &lt;span class="n"&gt;FBGN0027091&lt;/span&gt; &lt;span class="n"&gt;FBGN0027079&lt;/span&gt; &lt;span class="n"&gt;FBGN0028744&lt;/span&gt; &lt;span class="n"&gt;FBGN0259937&lt;/span&gt; &lt;span class="n"&gt;FBGN0011824&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0042826&lt;/span&gt;      &lt;span class="n"&gt;histone_deacetylase_binding&lt;/span&gt;     &lt;span class="n"&gt;FBGN0033748&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0043139&lt;/span&gt;      &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="s1"&gt;'-3'&lt;/span&gt;&lt;span class="n"&gt;_DNA_helicase_activity&lt;/span&gt;     &lt;span class="n"&gt;FBGN0261850&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The parameters of this script are: &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;--input: The &lt;code&gt;.gce&lt;/code&gt; file which should be converted &lt;/li&gt;
&lt;li&gt;--help: Display help for the script &lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id="use-a-custom-pathway-database"&gt;Use a custom pathway database&lt;/h1&gt;
&lt;p&gt;Although Gowinda will probably mostly be used to test for overrepresentation of genes in GO categories, it is also possible to use different databases. In this case it is only necessary to provide a custom 'gene set' file like in the following example: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;category_A&lt;/span&gt;    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;    &lt;span class="n"&gt;gene1&lt;/span&gt; &lt;span class="n"&gt;gene2&lt;/span&gt; &lt;span class="n"&gt;gene3&lt;/span&gt; &lt;span class="n"&gt;gene4&lt;/span&gt; &lt;span class="n"&gt;gene5&lt;/span&gt;
&lt;span class="n"&gt;category_B&lt;/span&gt;    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;    &lt;span class="n"&gt;gene1&lt;/span&gt; &lt;span class="n"&gt;gene6&lt;/span&gt; &lt;span class="n"&gt;gene7&lt;/span&gt; &lt;span class="n"&gt;gene8&lt;/span&gt;
&lt;span class="n"&gt;category_C&lt;/span&gt;    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="n"&gt;C&lt;/span&gt;    &lt;span class="n"&gt;gene10&lt;/span&gt; &lt;span class="n"&gt;gene11&lt;/span&gt; &lt;span class="n"&gt;gene12&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;column1: name of the custom category &lt;/li&gt;
&lt;li&gt;column2: description of the custom category &lt;/li&gt;
&lt;li&gt;column3: list of genes associated with the custom category &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt; that columns are separated by a tab whereas entries within columns need to be separated by a space! &lt;/p&gt;
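&lt;p&gt;The layout of such a file can be checked with a few lines of Python before running Gowinda. The sketch below relies only on the format stated above (tab-separated columns, space-separated genes); the function name &lt;code&gt;parse_gene_sets&lt;/code&gt; is hypothetical. &lt;/p&gt;

```python
def parse_gene_sets(lines):
    """Parse gene set lines: three tab-separated columns, with the genes
    in the third column separated by spaces."""
    gene_sets = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        # raises ValueError if a line does not have exactly three columns
        name, description, genes = line.split("\t")
        gene_sets[name] = (description, genes.split())
    return gene_sets


example = [
    "category_A\tdescription of category A\tgene1 gene2 gene3 gene4 gene5\n",
    "category_B\tdescription of category B\tgene1 gene6 gene7 gene8\n",
]
sets = parse_gene_sets(example)
print(sets["category_B"][1])  # ['gene1', 'gene6', 'gene7', 'gene8']
```

&lt;p&gt;A line in which spaces were used instead of tabs between the columns fails with a &lt;code&gt;ValueError&lt;/code&gt;, which makes this kind of formatting mistake easy to spot. &lt;/p&gt;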
&lt;h1 id="links"&gt;Links&lt;/h1&gt;
&lt;p&gt;Several tools have been developed for the analysis of pathway enrichment in GWAS. For an excellent review see Wang et al. (2010) "Analysing biological pathways in genome-wide association studies" (&lt;a href="http://www.nature.com/nrg/journal/v11/n12/abs/nrg2884.html" rel="nofollow"&gt;http://www.nature.com/nrg/journal/v11/n12/abs/nrg2884.html&lt;/a&gt;) &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ALIGATOR &lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/19539887" rel="nofollow"&gt;http://www.ncbi.nlm.nih.gov/pubmed/19539887&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;i-GSEA4GWAS &lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/20435672" rel="nofollow"&gt;http://www.ncbi.nlm.nih.gov/pubmed/20435672&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;GenGen &lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/17966091" rel="nofollow"&gt;http://www.ncbi.nlm.nih.gov/pubmed/17966091&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;GeSBAP &lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/19502494" rel="nofollow"&gt;http://www.ncbi.nlm.nih.gov/pubmed/19502494/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;GRASS &lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/20560206" rel="nofollow"&gt;http://www.ncbi.nlm.nih.gov/pubmed/20560206&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;GSA-SNP &lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/20501604" rel="nofollow"&gt;http://www.ncbi.nlm.nih.gov/pubmed/20501604&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;GSEA-SNP &lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/18854360" rel="nofollow"&gt;http://www.ncbi.nlm.nih.gov/pubmed/18854360&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;PLINK set-test &lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/17701901" rel="nofollow"&gt;http://www.ncbi.nlm.nih.gov/pubmed/17701901&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;SNP ratio test &lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/19620097" rel="nofollow"&gt;http://www.ncbi.nlm.nih.gov/pubmed/19620097&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Anonymous</dc:creator><pubDate>Wed, 18 Mar 2015 14:56:32 -0000</pubDate><guid>https://sourceforge.net5d397b63e7d7933e28911893383c5c51215ae917</guid></item></channel></rss>