[XMLPipeDB-developer] yeast GO issues
From: Kam D. <kda...@lm...> - 2009-12-03 19:23:19
Hi,

You'll have seen in my previous message that MAPPFinder is stalling because of the sheer size of the Gene Ontology; specifically, I think the culprit is the GeneOntologyTree table. After I sent the e-mail, I went and looked, and it has 200,000 records instead of the ~38,000 that were in the 2006 database.

Obviously, this is not something we are going to be able to fix this semester, but I think there is a viable fix we can pursue with our research team next semester. GO has created something called "GO Slim", in which they have removed a lot of the very specific child terms so that you only get the broad categories of GO. We can try using GO Slim instead of the entire GO for our gdb.

The only hitch, of course, is that it is only provided in the OBO format, not the OBO-XML format. There is a Perl script called map2slim (http://search.cpan.org/~cmungall/go-perl/scripts/map2slim) that purports to take a gene associations file (like our GOA file, although it remains to be seen whether it accepts the exact GOA format) and re-map it to the GO Slim terms, so I think that is probably what we will need to do. I'm guessing that it will take some rounds of testing before we are sure it is working properly for us.

So in the meantime, Bernie is going forward with analyzing the Arava data with the old gdb, and hopefully Kenny and Don are making some headway on the gdb.

Cheers,
Kam
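To make the re-mapping idea concrete, here is a minimal Python sketch of what mapping annotations onto slim terms could look like. It is not map2slim itself and does not read OBO or GOA files. The term IDs, the `parents` dictionary, and the function names are all hypothetical placeholders, and the rule used (keep only the most specific slim ancestors of each annotated term) is an assumption about the general approach rather than a description of the script's exact behavior.

```python
from collections import deque


def ancestors(term, parents):
    """Return every ancestor of `term` (not including `term` itself)."""
    seen = set()
    queue = deque(parents.get(term, ()))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents.get(current, ()))
    return seen


def map_to_slim(term, parents, slim_terms):
    """Map one term to its most specific slim ancestors (or itself)."""
    candidates = ({term} | ancestors(term, parents)) & slim_terms
    # Drop any slim term that is an ancestor of another candidate,
    # so only the most specific slim terms remain.
    redundant = set()
    for candidate in candidates:
        redundant |= ancestors(candidate, parents) & candidates
    return candidates - redundant


def remap_annotations(annotations, parents, slim_terms):
    """Rewrite (gene, term) pairs as de-duplicated (gene, slim_term) pairs."""
    remapped = set()
    for gene, term in annotations:
        for slim in map_to_slim(term, parents, slim_terms):
            remapped.add((gene, slim))
    return sorted(remapped)


# Toy ontology with made-up IDs: child term -> list of parent terms.
parents = {
    "T:leaf": ["T:mid"],
    "T:mid": ["T:broad"],
    "T:broad": ["T:root"],
    "T:other": ["T:root"],
}
slim_terms = {"T:broad", "T:root"}
annotations = [("geneA", "T:leaf"), ("geneA", "T:mid"), ("geneB", "T:other")]

result = remap_annotations(annotations, parents, slim_terms)
print(result)
```

In this toy case the two specific annotations for geneA collapse onto the single slim term `T:broad`, which is the kind of reduction in table size we would be hoping for.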