<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Recent changes to Validation</title><link>https://sourceforge.net/p/gowinda/wiki/Validation/</link><description>Recent changes to Validation</description><atom:link href="https://sourceforge.net/p/gowinda/wiki/Validation/feed" rel="self"/><language>en</language><lastBuildDate>Sun, 27 Aug 2017 17:15:43 -0000</lastBuildDate><atom:link href="https://sourceforge.net/p/gowinda/wiki/Validation/feed" rel="self" type="application/rss+xml"/><item><title>Validation modified by Robert Kofler</title><link>https://sourceforge.net/p/gowinda/wiki/Validation/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v2
+++ v3
@@ -96,7 +96,8 @@

 With an FDR of &amp;lt;0.05 neither GoMiner nor Gowinda found a significant overrepresentation for any GO categories. The p-values (not FDR corrected) for the different GO categories obtained by Gowinda and GoMiner are highly correlated (Spearman's rank correlation; rho=0.9999005; p-value &amp;lt; 2.2e-16), see graph below: 

-![](http://gowinda.googlecode.com/files/cor_gowinda_gominer.png)
+[[img src=cor_gowinda_gominer.png]]
+

 Our results (including our random SNPs) for the comparision of Gowinda with GoMiner can be found here: &amp;lt;http://gowinda.googlecode.com/files/step1_valGoMiner.zip&amp;gt;

@@ -159,7 +160,7 @@

 Although we randomly drew SNPs and would thus not expect an overrepresentation of any GO term, GoMiner reports a significant overrepresentation for 341 GO terms (FDR &amp;lt;0.05). This demonstrates that the biases of GWA datasets substantially affect GO analysis. By contrast, Gowinda correctly reports that none of the GO terms show any significant overrepresentation (FDR &amp;lt;0.05). The correlation between the p-values obtained with Gowinda and GoMiner (Spearman's rank correlation; rho=0.5610065; p-value &amp;lt; 2.2e-16) is much worse as compared to the unbiased data set. The following graph shows the correlation of the p-values obtained with Gowinda and GoMiner using the length biased dataset. 

-[![](http://gowinda.googlecode.com/files/corBiased_Gominer_Gowinda.png)] 
+[[img src=corBiased_Gominer_Gowinda.png]]

 Our results of this analysis can be found here: &amp;lt;http://gowinda.googlecode.com/files/step2_lengthBias.zip&amp;gt;

&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Robert Kofler</dc:creator><pubDate>Sun, 27 Aug 2017 17:15:43 -0000</pubDate><guid>https://sourceforge.net0dc87a238c9358aac7022f274ea977444be5fcce</guid></item><item><title>Validation modified by Robert Kofler</title><link>https://sourceforge.net/p/gowinda/wiki/Validation/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v1
+++ v2
@@ -1,29 +1,4 @@
-  * Introduction
-    * Files
-      * Results for validating Gowinda
-  * Validating Gowinda by comparing the results to GoMiner
-    * Obtain a set of genes that are not overlapping and have an associated GO term
-    * Introduce five SNPs into each of the non-overlapping genes
-    * Choose 1000 random SNPs
-    * Perform GO term analysis with Gowinda
-    * Perform GO term analysis with GoMiner
-    * Merge the results of Gowinda and GoMiner
-    * Results for unbiased analysis
-  * Validating whether Gowinda eliminates the gene length bias
-    * Extract the genes having an associated GO term
-    * Introduce one SNP every 100 bp
-    * Randomly draw 1000 SNPs
-    * Perform GO term analysis with Gowinda
-    * Perform GO term analysis with GoMiner
-    * Merge the results of Gowinda and GoMiner
-    * Results for length biased analysis
-  * Validating whether Gowinda identifies preselected GO categories
-    * Identify genes having an associated GO term
-    * Introduce one SNP every 100bp for genes with an associated GO term
-    * Randomly pick five small GO categories
-    * Create 1000 candidate SNPs
-    * Perform GO term analysis with Gowinda
-    * Results for recovery of preselected GO categories
+[TOC]

 # Introduction

&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Robert Kofler</dc:creator><pubDate>Sun, 27 Aug 2017 17:13:24 -0000</pubDate><guid>https://sourceforge.net76ae8b39c649c3c8b3bb4b67f8e41b199aca39a8</guid></item><item><title>Validation modified by Anonymous</title><link>https://sourceforge.net/p/gowinda/wiki/Validation/</link><description>&lt;div class="markdown_content"&gt;&lt;ul&gt;
&lt;li&gt;Introduction&lt;ul&gt;
&lt;li&gt;Files&lt;/li&gt;
&lt;li&gt;Results for validating Gowinda&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Validating Gowinda by comparing the results to GoMiner&lt;ul&gt;
&lt;li&gt;Obtain a set of genes that are not overlapping and have an associated GO term&lt;/li&gt;
&lt;li&gt;Introduce five SNPs into each of the non-overlapping genes&lt;/li&gt;
&lt;li&gt;Choose 1000 random SNPs&lt;/li&gt;
&lt;li&gt;Perform GO term analysis with Gowinda&lt;/li&gt;
&lt;li&gt;Perform GO term analysis with GoMiner&lt;/li&gt;
&lt;li&gt;Merge the results of Gowinda and GoMiner&lt;/li&gt;
&lt;li&gt;Results for unbiased analysis&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Validating whether Gowinda eliminates the gene length bias&lt;ul&gt;
&lt;li&gt;Extract the genes having an associated GO term&lt;/li&gt;
&lt;li&gt;Introduce one SNP every 100 bp&lt;/li&gt;
&lt;li&gt;Randomly draw 1000 SNPs&lt;/li&gt;
&lt;li&gt;Perform GO term analysis with Gowinda&lt;/li&gt;
&lt;li&gt;Perform GO term analysis with GoMiner&lt;/li&gt;
&lt;li&gt;Merge the results of Gowinda and GoMiner&lt;/li&gt;
&lt;li&gt;Results for length biased analysis&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Validating whether Gowinda identifies preselected GO categories&lt;ul&gt;
&lt;li&gt;Identify genes having an associated GO term&lt;/li&gt;
&lt;li&gt;Introduce one SNP every 100bp for genes with an associated GO term&lt;/li&gt;
&lt;li&gt;Randomly pick five small GO categories&lt;/li&gt;
&lt;li&gt;Create 1000 candidate SNPs&lt;/li&gt;
&lt;li&gt;Perform GO term analysis with Gowinda&lt;/li&gt;
&lt;li&gt;Results for recovery of preselected GO categories&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id="introduction"&gt;Introduction&lt;/h1&gt;
&lt;p&gt;We validated Gowinda with the following approaches: &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;first we compared the results of Gowinda with GoMiner using an unbiased dataset (no gene length bias and no overlapping genes). We found that the p-values reported by Gowinda and GoMiner are highly correlated (Spearman's rank correlation; rho=0.99). &lt;/li&gt;
&lt;li&gt;second we assessed the impact of the gene length bias and tested whether Gowinda efficiently deals with it. We found that with a biased dataset GoMiner reports a significant enrichment for ~300 GO categories, whereas Gowinda correctly reports none. &lt;/li&gt;
&lt;li&gt;and third we tested whether Gowinda correctly identifies a significant enrichment for 5 preselected GO terms and found that Gowinda indeed correctly identified these preselected GO categories (FDR&amp;lt;0.01). &lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="files"&gt;Files&lt;/h2&gt;
&lt;p&gt;The data and scripts in the following archive allow the whole validation of Gowinda to be repeated: &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href="http://gowinda.googlecode.com/files/validation_files.zip" rel="nofollow"&gt;http://gowinda.googlecode.com/files/validation_files.zip&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This archive contains: &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the Python scripts used in the validation &lt;/li&gt;
&lt;li&gt;the annotation of &lt;em&gt;D. melanogaster&lt;/em&gt; (v5.43) &lt;/li&gt;
&lt;li&gt;the GO associations obtained from GoMiner for 'CG' gene_ids &lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="results-for-validating-gowinda"&gt;Results for validating Gowinda&lt;/h3&gt;
&lt;p&gt;As our approach for validating Gowinda contains several random drawing steps we also provide the results we obtained for these random procedures, in order to allow an exact reproduction of our results. &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;unbiased dataset: &lt;a href="http://gowinda.googlecode.com/files/step1_valGoMiner.zip" rel="nofollow"&gt;http://gowinda.googlecode.com/files/step1_valGoMiner.zip&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;biased dataset: &lt;a href="http://gowinda.googlecode.com/files/step2_lengthBias.zip" rel="nofollow"&gt;http://gowinda.googlecode.com/files/step2_lengthBias.zip&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;known recovery: &lt;a href="http://gowinda.googlecode.com/files/step3_knownRecovery.zip" rel="nofollow"&gt;http://gowinda.googlecode.com/files/step3_knownRecovery.zip&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id="validating-gowinda-by-comparing-the-results-to-gominer"&gt;Validating Gowinda by comparing the results to GoMiner&lt;/h1&gt;
&lt;p&gt;As GoMiner neither corrects for the gene length bias nor considers overlapping genes, it was necessary to use a dataset free of these two biases. We therefore (i) filtered for non-overlapping genes, (ii) filtered for genes that have an associated GO term, and (iii) introduced exactly 5 SNPs into each of the remaining genes. Subsequently we randomly picked 1000 SNPs and compared the p-values obtained with Gowinda to the p-values obtained with GoMiner (without FDR correction). &lt;/p&gt;
&lt;h2 id="obtain-a-set-of-genes-that-are-not-overlapping-and-have-an-associated-go-term"&gt;Obtain a set of genes that are not overlapping and have an associated GO term&lt;/h2&gt;
&lt;p&gt;First obtain a set of non-overlapping genes (gene IDs) &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;python&lt;/span&gt; &lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;scripts&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;get_nonoverlapping_geneids&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="n"&gt;Flybase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;nonoverlapping_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then filter this set for genes that have an associated GO term &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;python&lt;/span&gt; &lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;scripts&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;get_geneids_havinggocategory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;go&lt;/span&gt; &lt;span class="n"&gt;association_gominer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;genelist&lt;/span&gt; &lt;span class="n"&gt;nonoverlapping_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;geneids_withgoterm_nooverlap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="introduce-five-snps-into-each-of-the-non-overlapping-genes"&gt;Introduce five SNPs into each of the non-overlapping genes&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;python&lt;/span&gt; &lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;scripts&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;create_snps_for_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;genelist&lt;/span&gt; &lt;span class="n"&gt;geneids_withgoterm_nooverlap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="n"&gt;Flybase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;snps_5pgene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="choose-1000-random-snps"&gt;Choose 1000 random SNPs&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="n"&gt;snps_5pgene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;perl&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ne&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mf"&gt;0.03&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;rand_snps_1k&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
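&lt;p&gt;The one-liner above keeps each line with probability 0.03 and then truncates to the first 1000 hits, so the draw is not repeatable and can return fewer than 1000 SNPs when the input is short. As a hedged alternative (not the procedure used for the results reported here), a seeded draw of exactly 1000 lines could look like the following Python sketch. The function names &lt;code&gt;draw_random_lines&lt;/code&gt; and &lt;code&gt;write_sample&lt;/code&gt; and the seed value are assumptions made for illustration.&lt;/p&gt;

```python
import random


def draw_random_lines(lines, n=1000, seed=42):
    """Return n lines drawn uniformly without replacement, in input order."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    # Sample positions rather than lines so the original file order is kept.
    picked = sorted(rng.sample(range(len(lines)), n))
    return [lines[i] for i in picked]


def write_sample(in_path, out_path, n=1000, seed=42):
    """Hypothetical helper: read a SNP file and write the sampled lines."""
    with open(in_path) as handle:
        all_lines = handle.readlines()
    with open(out_path, "w") as out:
        out.writelines(draw_random_lines(all_lines, n, seed))
```

&lt;p&gt;Under these assumptions, &lt;code&gt;write_sample("snps_5pgene.txt", "rand_snps_1k.txt")&lt;/code&gt; would produce a candidate file with the same name as the one used in the following steps.&lt;/p&gt;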
&lt;h2 id="perform-go-term-analysis-with-gowinda"&gt;Perform GO term analysis with Gowinda&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Library&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Frameworks&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;JavaVM&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;framework&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Versions&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;1.6.0&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Commands&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;Xmx4g&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;jar&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Users&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;robertkofler&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;dev&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;PopGenTools&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;gowinda&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Gowinda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;jar&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;snp&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;snps_5pgene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;  &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;candidate&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;snp&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span 
class="n"&gt;rand_snps_1k&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;gene&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;association_gominer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;annotation&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;Flybase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;simulations&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;significance&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;gene&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;definition&lt;/span&gt; &lt;span class="n"&gt;exon&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;gowinda_res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;mode&lt;/span&gt; &lt;span class="n"&gt;gene&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;genes&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="perform-go-term-analysis-with-gominer"&gt;Perform GO term analysis with GoMiner&lt;/h2&gt;
&lt;p&gt;For the analysis with GoMiner we need two lists of genes (gene IDs). The first list must contain the total set of genes and the second list must contain the subset of genes that should be tested for enrichment. &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="n"&gt;snps_5pgene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;awk&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;uniq&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;gominer_total&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="n"&gt;rand_snps_1k&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;awk&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;uniq&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;gominer_totest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Subsequently use High-Throughput GoMiner: &lt;a href="http://discover.nci.nih.gov/gominer/GoCommandWebInterface.jsp" rel="nofollow"&gt;http://discover.nci.nih.gov/gominer/GoCommandWebInterface.jsp&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;We used the following settings with GoMiner: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;totalfile&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gominer_total&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;span class="n"&gt;changedfile&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gominer_totest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;span class="n"&gt;datasource&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;FB&lt;/span&gt;
&lt;span class="n"&gt;organism&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;7227&lt;/span&gt;
&lt;span class="n"&gt;evidencecode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;all&lt;/span&gt;
&lt;span class="n"&gt;crossref&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="n"&gt;synonym&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="n"&gt;thresholdtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BOTH&lt;/span&gt;
&lt;span class="n"&gt;timeseriesthreshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.05&lt;/span&gt;
&lt;span class="n"&gt;randomization&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;categorymin&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="n"&gt;cim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;categorymax&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10000&lt;/span&gt;
&lt;span class="n"&gt;rootcategory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;all&lt;/span&gt;
&lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;xxx&lt;/span&gt;&lt;span class="err"&gt;@&lt;/span&gt;&lt;span class="n"&gt;xxx&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="merge-the-results-of-gowinda-and-gominer"&gt;Merge the results of Gowinda and GoMiner&lt;/h2&gt;
&lt;p&gt;From the GoMiner results you need to extract the file: gominer_totest.txt.change &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="n"&gt;gowinda_res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;awk&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;k1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;merged&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;tm_gowinda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="n"&gt;GoMiner&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;gominer_totest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;change&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;awk&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;NF&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;perl&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;pe&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;//' |sort -k1,1&amp;gt; merged/tm_gominer.txt&lt;/span&gt;
&lt;span class="n"&gt;join&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;tm_gominer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="n"&gt;tm_gowinda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;merged_min_win&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The resulting file contains for every GO category (column 1) the p-value obtained with GoMiner (column 2) and Gowinda (column 3) and may easily be analyzed with R. &lt;/p&gt;
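&lt;p&gt;For readers without R at hand, the rank correlation between the two p-value columns can also be computed with the Python standard library. The following is a minimal sketch that assumes the three whitespace-separated columns described above; the function names are hypothetical, and tied p-values receive their average rank.&lt;/p&gt;

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def read_merged(path):
    """Read GoMiner (column 2) and Gowinda (column 3) p-values."""
    gominer, gowinda = [], []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) == 3:
                gominer.append(float(fields[1]))
                gowinda.append(float(fields[2]))
    return gominer, gowinda
```

&lt;p&gt;With these helpers, &lt;code&gt;spearman_rho(*read_merged("merged_min_win.txt"))&lt;/code&gt; would return rho for the merged file; the sketch does not compute a p-value for the correlation.&lt;/p&gt;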
&lt;h2 id="results-for-unbiased-analysis"&gt;Results for unbiased analysis&lt;/h2&gt;
&lt;p&gt;With an FDR of &amp;lt;0.05 neither GoMiner nor Gowinda found a significant overrepresentation for any GO categories. The p-values (not FDR corrected) for the different GO categories obtained by Gowinda and GoMiner are highly correlated (Spearman's rank correlation; rho=0.9999005; p-value &amp;lt; 2.2e-16), see graph below: &lt;/p&gt;
&lt;p&gt;&lt;img alt="" src="http://gowinda.googlecode.com/files/cor_gowinda_gominer.png" rel="nofollow" /&gt;&lt;/p&gt;
&lt;p&gt;Our results (including our random SNPs) for the comparison of Gowinda with GoMiner can be found here: &lt;a href="http://gowinda.googlecode.com/files/step1_valGoMiner.zip" rel="nofollow"&gt;http://gowinda.googlecode.com/files/step1_valGoMiner.zip&lt;/a&gt;&lt;/p&gt;
&lt;h1 id="validating-whether-gowinda-eliminates-the-gene-length-bias"&gt;Validating whether Gowinda eliminates the gene length bias&lt;/h1&gt;
&lt;p&gt;In GWA studies longer genes typically have more SNPs and thus also have a higher probability of containing a candidate SNP, which could lead to an overrepresentation of GO categories having longer genes. To test the extent of this bias in GWA datasets and whether Gowinda efficiently corrects for it, we introduced one SNP every 100bp into all genes (having a GO category). As overlapping genes may be an additional source of bias in GWA datasets, overlapping genes were &lt;strong&gt;not&lt;/strong&gt; discarded in this analysis. Finally we randomly picked 1000 SNPs and compared the results obtained with GoMiner to the results obtained with Gowinda. &lt;/p&gt;
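&lt;p&gt;The length bias described above can be made concrete with a small calculation. Assuming candidate SNPs are drawn uniformly without replacement, a gene carrying n of N SNPs contains at least one of k candidates with probability 1 - C(N-n, k)/C(N, k). The Python sketch below evaluates this expression; the function name and the total of 250,000 SNPs are hypothetical values chosen for illustration, not figures taken from the validation data.&lt;/p&gt;

```python
from math import comb


def prob_gene_hit(gene_snps, total_snps, candidates):
    """Probability that at least one of `candidates` uniformly drawn SNPs
    lies in a gene carrying `gene_snps` of the `total_snps` SNPs."""
    miss = comb(total_snps - gene_snps, candidates) / comb(total_snps, candidates)
    return 1.0 - miss


# Hypothetical setting: 250,000 SNPs in total, 1000 candidates, and
# genes of 1 kb, 10 kb and 100 kb carrying one SNP per 100 bp.
for length_bp in (1000, 10000, 100000):
    n_snps = length_bp // 100
    print(length_bp, round(prob_gene_hit(n_snps, 250000, 1000), 4))
```

&lt;p&gt;The probability rises with the number of SNPs per gene, which is why a test that treats all genes as equally likely to be hit can overstate the enrichment of GO categories containing long genes.&lt;/p&gt;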
&lt;h2 id="extract-the-genes-having-an-associated-go-term"&gt;Extract the genes having an associated GO term&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;python&lt;/span&gt; &lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;scripts&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;extract_geneids_fromgoaassociation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;go&lt;/span&gt; &lt;span class="n"&gt;association_gominer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;total_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="introduce-one-snp-every-100-bp"&gt;Introduce one SNP every 100 bp&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;python&lt;/span&gt; &lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;scripts&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;create_length_biased_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="n"&gt;Flybase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;genelist&lt;/span&gt; &lt;span class="n"&gt;total_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;snps_lengthbiased&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="randomly-draw-1000-snps"&gt;Randomly draw 1000 SNPs&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="n"&gt;snps_lengthbiased&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;perl&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ne&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mf"&gt;0.004&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;rand_1k_snps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="perform-go-term-analysis-with-gowinda_1"&gt;Perform GO term analysis with Gowinda&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Library&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Frameworks&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;JavaVM&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;framework&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Versions&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;1.6.0&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Commands&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;Xmx4g&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;jar&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Users&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;robertkofler&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;dev&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;PopGenTools&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;gowinda&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Gowinda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;jar&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;snp&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;snps_lengthbiased&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;  &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;candidate&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;snp&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span 
class="n"&gt;rand_1k_snps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;gene&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;association_gominer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;annotation&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;Flybase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;simulations&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;significance&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;gene&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;definition&lt;/span&gt; &lt;span class="n"&gt;exon&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;gowinda_res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;mode&lt;/span&gt; &lt;span class="n"&gt;gene&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;genes&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="perform-go-term-analysis-with-gominer_1"&gt;Perform GO term analysis with GoMiner&lt;/h2&gt;
&lt;p&gt;First, obtain the gene IDs of the randomly drawn SNPs: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="n"&gt;rand_1k_snps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;awk&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;uniq&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;subset_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then perform the GO term enrichment analysis with GoMiner: &lt;a href="http://discover.nci.nih.gov/gominer/htgm.jsp" rel="nofollow"&gt;http://discover.nci.nih.gov/gominer/htgm.jsp&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;We used the following parameters: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;totalfile&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;total_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;span class="n"&gt;changedfile&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;subset_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;span class="n"&gt;datasource&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;FB&lt;/span&gt;
&lt;span class="n"&gt;organism&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;7227&lt;/span&gt;
&lt;span class="n"&gt;evidencecode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;all&lt;/span&gt;
&lt;span class="n"&gt;crossref&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="n"&gt;synonym&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="n"&gt;thresholdtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BOTH&lt;/span&gt;
&lt;span class="n"&gt;timeseriesthreshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.05&lt;/span&gt;
&lt;span class="n"&gt;randomization&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="n"&gt;categorymin&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="n"&gt;cim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;categorymax&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10000&lt;/span&gt;
&lt;span class="n"&gt;rootcategory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;all&lt;/span&gt;
&lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rokofler&lt;/span&gt;&lt;span class="err"&gt;@&lt;/span&gt;&lt;span class="n"&gt;gmail&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="merge-the-results-of-gowinda-and-gominer_1"&gt;Merge the results of Gowinda and GoMiner&lt;/h2&gt;
&lt;p&gt;From the GoMiner results you need to extract the file subset_genes.txt.change &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="n"&gt;gowinda_res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;awk&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;k1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;merged&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;tm_gowinda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; 
&lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="n"&gt;GoMiner&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;subset_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;change&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;awk&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;NF&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;perl&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;pe&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;^&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="c1"&gt;//' |sort -k1,1&amp;gt; merged/tm_gominer.txt&lt;/span&gt;
&lt;span class="n"&gt;join&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;tm_gominer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="n"&gt;tm_gowinda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;merged_min_win&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="results-for-length-biased-analysis"&gt;Results for length biased analysis&lt;/h2&gt;
&lt;p&gt;Although we drew the SNPs at random and would thus not expect any GO term to be overrepresented, GoMiner reports a significant overrepresentation for 341 GO terms (FDR &amp;lt;0.05). This demonstrates that the biases of GWA datasets substantially affect GO analysis. By contrast, Gowinda correctly reports that none of the GO terms shows a significant overrepresentation (FDR &amp;lt;0.05). The correlation between the p-values obtained with Gowinda and GoMiner (Spearman's rank correlation; rho=0.5610065; p-value &amp;lt; 2.2e-16) is much weaker than for the unbiased dataset. The following graph shows the correlation of the p-values obtained with Gowinda and GoMiner using the length-biased dataset. &lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;img alt="" src="http://gowinda.googlecode.com/files/corBiased_Gominer_Gowinda.png" rel="nofollow" /&gt;&lt;/span&gt; &lt;/p&gt;
&lt;p&gt;Our results of this analysis can be found here: &lt;a href="http://gowinda.googlecode.com/files/step2_lengthBias.zip" rel="nofollow"&gt;http://gowinda.googlecode.com/files/step2_lengthBias.zip&lt;/a&gt;&lt;/p&gt;
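&lt;p&gt;The rank correlation reported above could be recomputed from the merged file along the following lines. This is a minimal sketch, not the script used for the original analysis: it assumes that merged_min_win.txt has three whitespace-separated columns (GO ID, GoMiner p-value, Gowinda p-value), as the join command suggests, and all function names are hypothetical. &lt;/p&gt;

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def read_merged(path):
    """Read the two p-value columns (assumed layout: GO ID, GoMiner, Gowinda)."""
    gominer, gowinda = [], []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) == 3:
                gominer.append(float(fields[1]))
                gowinda.append(float(fields[2]))
    return gominer, gowinda
```

&lt;p&gt;Calling spearman_rho(*read_merged("merged_min_win.txt")) would then give rho for the merged p-values; the sketch does not compute the p-value of the correlation itself. &lt;/p&gt;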
&lt;h1 id="validating-whether-gowinda-identifies-preselected-go-categories"&gt;Validating whether Gowinda identifies preselected GO categories&lt;/h1&gt;
&lt;p&gt;To validate that Gowinda correctly identifies overrepresentation of GO terms, we proceeded as follows. First, we randomly picked five GO categories with a small number of genes (between 5 and 10). We chose small GO categories in order to get conservative estimates of the reliability of Gowinda. We then introduced one candidate SNP for every gene associated with these preselected GO categories and added randomly drawn candidate SNPs from a length-biased SNP dataset (see above) until a total of 1000 candidate SNPs was obtained. Finally, we performed a gene set enrichment analysis using Gowinda and checked whether the five randomly preselected GO categories correctly showed a significant overrepresentation. &lt;/p&gt;
&lt;h2 id="identify-genes-having-an-associated-go-term"&gt;Identify genes having an associated GO term&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;python&lt;/span&gt; &lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;scripts&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;extract_geneids_fromgoaassociation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;go&lt;/span&gt; &lt;span class="n"&gt;association_gominer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;total_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="introduce-one-snp-every-100bp-for-genes-with-an-associated-go-term"&gt;Introduce one SNP every 100bp for genes with an associated GO term&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;python&lt;/span&gt; &lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;scripts&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;create_length_biased_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="n"&gt;Flybase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;genelist&lt;/span&gt; &lt;span class="n"&gt;total_genes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;snps_lengthbiased&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="randomly-pick-five-small-go-categories"&gt;Randomly pick five small GO categories&lt;/h2&gt;
&lt;p&gt;We randomly picked five small GO categories having 5 - 10 genes. &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;python&lt;/span&gt; &lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;scripts&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;pick_random_gocats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="n"&gt;association_gominer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;random_gocategories&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In our example we picked the following categories, where the first column is the number of genes in this category: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="n"&gt;random_gocategories&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;awk&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;print&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;uniq&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;
   &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mo"&gt;00072&lt;/span&gt;&lt;span class="mi"&gt;89&lt;/span&gt;
   &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mo"&gt;000&lt;/span&gt;&lt;span class="mi"&gt;9074&lt;/span&gt;
   &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mo"&gt;0030173&lt;/span&gt;
   &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mo"&gt;0040020&lt;/span&gt;
   &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mo"&gt;0046364&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
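&lt;p&gt;The selection step can be sketched roughly as follows. This is not the pick_random_gocats.py script used above; it assumes that the association file is tab-delimited with the GO ID in the first column and a space-separated gene list in the third, and that the output consists of gene and GO ID pairs (which is what the awk command above implies). All function and parameter names are hypothetical. &lt;/p&gt;

```python
import random


def parse_gene_sets(lines):
    """Map each GO ID to its set of gene IDs (assumed three tab-separated columns)."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 3:
            gene_sets[fields[0]] = set(fields[2].split())
    return gene_sets


def pick_small_categories(gene_sets, count=5, min_genes=5, max_genes=10, seed=None):
    """Randomly draw `count` categories whose size lies in [min_genes, max_genes]."""
    eligible = sorted(
        go_id for go_id, genes in gene_sets.items()
        if len(genes) in range(min_genes, max_genes + 1)
    )
    return random.Random(seed).sample(eligible, count)


def gene_category_pairs(gene_sets, picked):
    """Yield one (gene, GO ID) pair per gene of each picked category."""
    for go_id in picked:
        for gene in sorted(gene_sets[go_id]):
            yield gene, go_id
```

&lt;p&gt;Passing a fixed seed makes the draw reproducible, which helps when the same five categories have to be reused in later steps. &lt;/p&gt;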
&lt;h2 id="create-1000-candidate-snps"&gt;Create 1000 candidate SNPs&lt;/h2&gt;
&lt;p&gt;We then introduced candidate SNPs into the genes associated with the preselected GO categories and added randomly drawn candidate SNPs from the length-biased dataset until a total of 1000 candidate SNPs was obtained: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;python&lt;/span&gt; &lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;scripts&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;pick_random_snps_excepttargetlist&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;snp&lt;/span&gt; &lt;span class="n"&gt;snps_lengthbiased&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;genelist&lt;/span&gt; &lt;span class="n"&gt;random_gocategories&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;random_targetsnps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
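&lt;p&gt;The construction of the candidate set can be sketched as below. This is an illustration rather than the pick_random_snps_excepttargetlist.py script itself. It assumes that each SNP is a (chromosome, position, gene) tuple with the gene ID in the third field, and that the random SNPs are drawn only from genes outside the target list, as the script name suggests. All names are hypothetical. &lt;/p&gt;

```python
import random


def build_candidate_snps(background, target_genes, total=1000, seed=None):
    """Pick one SNP per target gene, then pad with random background SNPs.

    background: list of (chromosome, position, gene) tuples.
    target_genes: gene IDs belonging to the preselected GO categories.
    """
    rng = random.Random(seed)
    by_gene = {}
    for snp in background:
        by_gene.setdefault(snp[2], []).append(snp)

    # One candidate SNP for every target gene that has any SNP at all.
    candidates = [rng.choice(by_gene[g]) for g in sorted(target_genes) if g in by_gene]

    # Fill up with SNPs drawn at random from genes outside the target list.
    targets = set(target_genes)
    others = [snp for snp in background if snp[2] not in targets]
    needed = max(0, total - len(candidates))
    candidates.extend(rng.sample(others, min(needed, len(others))))
    return candidates
```

&lt;p&gt;Because the padding SNPs are sampled from the length-biased background, long genes remain more likely to be hit, so the bias discussed above is preserved in the candidate set. &lt;/p&gt;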
&lt;h2 id="perform-go-term-analysis-with-gowinda_2"&gt;Perform GO term analysis with Gowinda&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Library&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Frameworks&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;JavaVM&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;framework&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Versions&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;1.6.0&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Commands&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;Xmx4g&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;jar&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Users&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;robertkofler&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;dev&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;PopGenTools&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;gowinda&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Gowinda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;jar&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;snp&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;snps_lengthbiased&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;candidate&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;snp&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span 
class="n"&gt;random_targetsnps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;gene&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;association_gominer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;annotation&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;Flybase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;simulations&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;significance&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;gene&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;definition&lt;/span&gt; &lt;span class="n"&gt;exon&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="n"&gt;gowinda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;mode&lt;/span&gt; &lt;span class="n"&gt;gene&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;genes&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id="results-for-recovery-of-preselected-go-categories"&gt;Results for recovery of preselected GO categories&lt;/h2&gt;
&lt;p&gt;With an FDR &amp;lt;0.01 we identified 19 significantly enriched GO categories (see below). All five preselected GO categories were successfully identified (marked with an arrow). However, we also found 14 additional GO categories that were not targeted, which can be explained by the nesting of GO categories. For example, the GO category GO:0046165, which was not preselected, contains the genes cg32220,cg17725,cg8890,cg10688,cg3495,cg10924,cg8251,cg1516, all of which are associated with the preselected GO category GO:0046364. &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0046165&lt;/span&gt;      &lt;span class="mf"&gt;1.278&lt;/span&gt;   &lt;span class="mi"&gt;9&lt;/span&gt;       &lt;span class="mf"&gt;0.0000010000&lt;/span&gt;    &lt;span class="mf"&gt;0.0002654000&lt;/span&gt;    &lt;span class="mi"&gt;9&lt;/span&gt;       &lt;span class="mi"&gt;14&lt;/span&gt;      &lt;span class="mi"&gt;14&lt;/span&gt;      &lt;span class="n"&gt;alcohol_biosynthetic_process&lt;/span&gt;    &lt;span class="n"&gt;cg32220&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg17725&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10118&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg8890&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10688&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg3495&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10924&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg8251&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg1516&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0019319&lt;/span&gt;      &lt;span class="mf"&gt;0.714&lt;/span&gt;   &lt;span class="mi"&gt;7&lt;/span&gt;       &lt;span class="mf"&gt;0.0000010000&lt;/span&gt;    &lt;span class="mf"&gt;0.0002654000&lt;/span&gt;    &lt;span class="mi"&gt;7&lt;/span&gt;       &lt;span class="mi"&gt;7&lt;/span&gt;       &lt;span class="mi"&gt;7&lt;/span&gt;       &lt;span class="n"&gt;hexose_biosynthetic_process&lt;/span&gt;     &lt;span class="n"&gt;cg17725&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg8890&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10688&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg3495&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10924&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg8251&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg1516&lt;/span&gt;
&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0009074&lt;/span&gt;      &lt;span class="mf"&gt;0.339&lt;/span&gt;   &lt;span class="mi"&gt;6&lt;/span&gt;       &lt;span class="mf"&gt;0.0000010000&lt;/span&gt;    &lt;span class="mf"&gt;0.0002654000&lt;/span&gt;    &lt;span class="mi"&gt;6&lt;/span&gt;       &lt;span class="mi"&gt;6&lt;/span&gt;       &lt;span class="mi"&gt;6&lt;/span&gt;       &lt;span class="n"&gt;aromatic_amino_acid_family_catabolic_process&lt;/span&gt;    &lt;span class="n"&gt;cg7399&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg1555&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4779&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9363&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9362&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg2155&lt;/span&gt;
&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0046364&lt;/span&gt;      &lt;span class="mf"&gt;0.740&lt;/span&gt;   &lt;span class="mi"&gt;8&lt;/span&gt;       &lt;span class="mf"&gt;0.0000010000&lt;/span&gt;    &lt;span class="mf"&gt;0.0002654000&lt;/span&gt;    &lt;span class="mi"&gt;8&lt;/span&gt;       &lt;span class="mi"&gt;8&lt;/span&gt;       &lt;span class="mi"&gt;8&lt;/span&gt;       &lt;span class="n"&gt;monosaccharide_biosynthetic_process&lt;/span&gt;     &lt;span class="n"&gt;cg32220&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg17725&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg8890&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10688&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg3495&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10924&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg8251&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg1516&lt;/span&gt;
&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0007289&lt;/span&gt;      &lt;span class="mf"&gt;0.483&lt;/span&gt;   &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mf"&gt;0.0000010000&lt;/span&gt;    &lt;span class="mf"&gt;0.0002654000&lt;/span&gt;    &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="n"&gt;spermatid_nucleus_differentiation&lt;/span&gt;       &lt;span class="n"&gt;cg3354&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg6998&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg5648&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg12284&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg8827&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0009225&lt;/span&gt;      &lt;span class="mf"&gt;0.152&lt;/span&gt;   &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mf"&gt;0.0000020000&lt;/span&gt;    &lt;span class="mf"&gt;0.0003176667&lt;/span&gt;    &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="n"&gt;nucleotide&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sugar_metabolic_process&lt;/span&gt;      &lt;span class="n"&gt;cg32220&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg8890&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10688&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg3495&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0009226&lt;/span&gt;      &lt;span class="mf"&gt;0.152&lt;/span&gt;   &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mf"&gt;0.0000020000&lt;/span&gt;    &lt;span class="mf"&gt;0.0003176667&lt;/span&gt;    &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="n"&gt;nucleotide&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sugar_biosynthetic_process&lt;/span&gt;   &lt;span class="n"&gt;cg32220&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg8890&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10688&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg3495&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0019439&lt;/span&gt;      &lt;span class="mf"&gt;0.464&lt;/span&gt;   &lt;span class="mi"&gt;6&lt;/span&gt;       &lt;span class="mf"&gt;0.0000020000&lt;/span&gt;    &lt;span class="mf"&gt;0.0003176667&lt;/span&gt;    &lt;span class="mi"&gt;6&lt;/span&gt;       &lt;span class="mi"&gt;7&lt;/span&gt;       &lt;span class="mi"&gt;7&lt;/span&gt;       &lt;span class="n"&gt;aromatic_compound_catabolic_process&lt;/span&gt;     &lt;span class="n"&gt;cg7399&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg1555&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4779&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9363&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9362&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg2155&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0051445&lt;/span&gt;      &lt;span class="mf"&gt;0.966&lt;/span&gt;   &lt;span class="mi"&gt;7&lt;/span&gt;       &lt;span class="mf"&gt;0.0000020000&lt;/span&gt;    &lt;span class="mf"&gt;0.0003176667&lt;/span&gt;    &lt;span class="mi"&gt;7&lt;/span&gt;       &lt;span class="mi"&gt;9&lt;/span&gt;       &lt;span class="mi"&gt;9&lt;/span&gt;       &lt;span class="n"&gt;regulation_of_meiotic_cell_cycle&lt;/span&gt;        &lt;span class="n"&gt;cg18543&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg6513&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9900&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg13584&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4727&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4336&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg7719&lt;/span&gt;
&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0030173&lt;/span&gt;      &lt;span class="mf"&gt;0.517&lt;/span&gt;   &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mf"&gt;0.0000040000&lt;/span&gt;    &lt;span class="mf"&gt;0.0005274545&lt;/span&gt;    &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="n"&gt;integral_to_Golgi_membrane&lt;/span&gt;      &lt;span class="n"&gt;cg10580&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4871&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10772&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg2448&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg12366&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0031228&lt;/span&gt;      &lt;span class="mf"&gt;0.517&lt;/span&gt;   &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mf"&gt;0.0000040000&lt;/span&gt;    &lt;span class="mf"&gt;0.0005274545&lt;/span&gt;    &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="n"&gt;intrinsic_to_Golgi_membrane&lt;/span&gt;     &lt;span class="n"&gt;cg10580&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4871&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10772&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg2448&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg12366&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0006558&lt;/span&gt;      &lt;span class="mf"&gt;0.227&lt;/span&gt;   &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mf"&gt;0.0000070000&lt;/span&gt;    &lt;span class="mf"&gt;0.0008063077&lt;/span&gt;    &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="n"&gt;L&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;phenylalanine_metabolic_process&lt;/span&gt;       &lt;span class="n"&gt;cg7399&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4779&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9363&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9362&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0006559&lt;/span&gt;      &lt;span class="mf"&gt;0.227&lt;/span&gt;   &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mf"&gt;0.0000070000&lt;/span&gt;    &lt;span class="mf"&gt;0.0008063077&lt;/span&gt;    &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="mi"&gt;4&lt;/span&gt;       &lt;span class="n"&gt;L&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;phenylalanine_catabolic_process&lt;/span&gt;       &lt;span class="n"&gt;cg7399&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4779&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9363&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9362&lt;/span&gt;
&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0040020&lt;/span&gt;      &lt;span class="mf"&gt;0.622&lt;/span&gt;   &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mf"&gt;0.0000110000&lt;/span&gt;    &lt;span class="mf"&gt;0.0011997857&lt;/span&gt;    &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="mi"&gt;5&lt;/span&gt;       &lt;span class="n"&gt;regulation_of_meiosis&lt;/span&gt;   &lt;span class="n"&gt;cg9900&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg13584&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4727&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4336&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg7719&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0009072&lt;/span&gt;      &lt;span class="mf"&gt;1.411&lt;/span&gt;   &lt;span class="mi"&gt;8&lt;/span&gt;       &lt;span class="mf"&gt;0.0000230000&lt;/span&gt;    &lt;span class="mf"&gt;0.0024266667&lt;/span&gt;    &lt;span class="mi"&gt;8&lt;/span&gt;       &lt;span class="mi"&gt;18&lt;/span&gt;      &lt;span class="mi"&gt;18&lt;/span&gt;      &lt;span class="n"&gt;aromatic_amino_acid_family_metabolic_process&lt;/span&gt;    &lt;span class="n"&gt;cg7399&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10118&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg1555&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4779&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg17870&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9363&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9362&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg2155&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0034637&lt;/span&gt;      &lt;span class="mf"&gt;2.181&lt;/span&gt;   &lt;span class="mi"&gt;9&lt;/span&gt;       &lt;span class="mf"&gt;0.0000870000&lt;/span&gt;    &lt;span class="mf"&gt;0.0094014375&lt;/span&gt;    &lt;span class="mi"&gt;9&lt;/span&gt;       &lt;span class="mi"&gt;21&lt;/span&gt;      &lt;span class="mi"&gt;21&lt;/span&gt;      &lt;span class="n"&gt;cellular_carbohydrate_biosynthetic_process&lt;/span&gt;      &lt;span class="n"&gt;cg32220&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg17725&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg8890&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9485&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10688&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg3495&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg10924&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg8251&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg1516&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0006572&lt;/span&gt;      &lt;span class="mf"&gt;0.142&lt;/span&gt;   &lt;span class="mi"&gt;3&lt;/span&gt;       &lt;span class="mf"&gt;0.0000970000&lt;/span&gt;    &lt;span class="mf"&gt;0.0096698421&lt;/span&gt;    &lt;span class="mi"&gt;3&lt;/span&gt;       &lt;span class="mi"&gt;3&lt;/span&gt;       &lt;span class="mi"&gt;3&lt;/span&gt;       &lt;span class="n"&gt;tyrosine_catabolic_process&lt;/span&gt;      &lt;span class="n"&gt;cg4779&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9363&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9362&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0046395&lt;/span&gt;      &lt;span class="mf"&gt;3.068&lt;/span&gt;   &lt;span class="mi"&gt;11&lt;/span&gt;      &lt;span class="mf"&gt;0.0001050000&lt;/span&gt;    &lt;span class="mf"&gt;0.0096698421&lt;/span&gt;    &lt;span class="mi"&gt;11&lt;/span&gt;      &lt;span class="mi"&gt;38&lt;/span&gt;      &lt;span class="mi"&gt;38&lt;/span&gt;      &lt;span class="n"&gt;carboxylic_acid_catabolic_process&lt;/span&gt;       &lt;span class="n"&gt;cg7399&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4586&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg1555&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4779&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg3626&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg6638&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9709&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9527&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9363&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9362&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg2155&lt;/span&gt;
&lt;span class="n"&gt;GO&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0016054&lt;/span&gt;      &lt;span class="mf"&gt;3.068&lt;/span&gt;   &lt;span class="mi"&gt;11&lt;/span&gt;      &lt;span class="mf"&gt;0.0001050000&lt;/span&gt;    &lt;span class="mf"&gt;0.0096698421&lt;/span&gt;    &lt;span class="mi"&gt;11&lt;/span&gt;      &lt;span class="mi"&gt;38&lt;/span&gt;      &lt;span class="mi"&gt;38&lt;/span&gt;      &lt;span class="n"&gt;organic_acid_catabolic_process&lt;/span&gt;  &lt;span class="n"&gt;cg7399&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4586&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg1555&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg4779&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg3626&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg6638&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9709&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9527&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9363&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg9362&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="n"&gt;cg2155&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Our results for this analysis can be found here: &lt;a href="http://gowinda.googlecode.com/files/step3_knownRecovery.zip" rel="nofollow"&gt;http://gowinda.googlecode.com/files/step3_knownRecovery.zip&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Anonymous</dc:creator><pubDate>Wed, 18 Mar 2015 14:56:33 -0000</pubDate><guid>https://sourceforge.net896e640bffdb6e8190e5a34b6fb57c243d4be16d</guid></item></channel></rss>