From: David E. <em...@mo...> - 2002-10-18 15:32:56
Hi all,

First of all, Scott, I had a look at the Modular Schema info you put on http://www.gmod.org, and it looks great - many thanks. I wonder if you have any ideas as to how we can go about improving what's there and making sure it is current. Should we be thinking about putting the Modular Schema into the SourceForge CVS now, and if so, how should it be organized? Thanks also for setting up the schema mailing list.

I wanted to let you all know that I've successfully loaded all of the D. melanogaster "release 3" genome annotation GenBank records into the schema, and the sequence module seems to have worked beautifully. I finished the loader just last night so I haven't completely evaluated the results, but the annotations I've looked at look good.

There's at least one gene model annotation which *didn't* load properly, mod(mdg4), which is a nasty case of trans splicing whose "join" locations my location parser definitely did not appreciate. Here's what one of the mod(mdg4) GB mRNA features looks like:

     mRNA            join(138523..138735,138795..139263,
                     complement(154413..154524),complement(153944..154201),
                     complement(153727..153866),complement(152185..153037))
                     /product="CG32491-PZ"
                     /note="trans splicing"

Parser go bung! I'm sure this case is workable in the schema, and I'll work on parsing locations of this ilk as soon as I get a chance.

Lincoln, I focused on this instead of the WormBase data because in the context of our local (FlyBase) development, and learning how to layer on the genetic/phenotypic data, we really needed to get a test-bed to work with, and it looks like a proper port of Berkeley's gadfly data is going to take some time coming. I'll take a look at the WormBase GFF and .ace data now.

In the meantime, if any of you would like a postgres dump of this data to play with, please let me know.
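
For illustration only: a minimal sketch of how a mixed-strand location like the mod(mdg4) one above could be broken into spans. This is not the loader described in this message. The function names (`split_top_level`, `parse_location`) and the +1/-1 strand encoding are assumptions made for the example, and it covers only plain ranges, single positions, `join`, `order` and `complement`.

```python
import re

# A plain range such as 138523..138735, or a single position such as 42.
# Optional "<" and ">" partial-end markers are accepted and ignored.
_RANGE = re.compile(r"^<?(\d+)(?:\.\.>?(\d+))?$")


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts


def parse_location(loc):
    """Return a list of (start, end, strand) tuples in the order written.

    strand is +1 for the forward strand and -1 for complement().
    """
    loc = "".join(loc.split())  # drop spaces and line breaks
    for op in ("join", "order"):
        if loc.startswith(op + "(") and loc.endswith(")"):
            spans = []
            for part in split_top_level(loc[len(op) + 1:-1]):
                spans.extend(parse_location(part))
            return spans
    if loc.startswith("complement(") and loc.endswith(")"):
        inner = parse_location(loc[len("complement("):-1])
        # Complementing a multi-part location reverses the part order
        # and flips the strand of every part.
        return [(s, e, -strand) for s, e, strand in reversed(inner)]
    match = _RANGE.match(loc)
    if match is None:
        raise ValueError(f"unsupported location: {loc!r}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    return [(start, end, 1)]


MDG4 = """join(138523..138735,138795..139263,
    complement(154413..154524),complement(153944..154201),
    complement(153727..153866),complement(152185..153037))"""

for span in parse_location(MDG4):
    print(span)
```

Keeping a strand on every span, rather than one strand for the whole feature, is what lets a trans-spliced transcript with exons on both strands come through intact.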

Please, everyone, be aware that the current D. melanogaster "release 3" genome annotation data in GenBank is imperfect, and these imperfections (and, I hope, only these) are obviously going to be in this test data.

Once I've convinced myself I've implemented this properly, I want to start writing some practical documents on implementing data in the sequence module. Scott, others, if you have any opinions on the format or content these should have, please let me know.

If I get a chance, I'm going to try to get Gbrowse up and running on this data, as I'm very anxious to know how the Modular Schema and Gbrowse play together. I have no idea how easy or difficult this will be, being totally unfamiliar with Gbrowse; if anybody wants to give advice or lend a hand, please do!

Finally, Lincoln mentioned that we should set up further conference calls, and I'd like to suggest we shoot for next Wednesday, 22 Oct, 3pm EST - same time as last time. Would that work for everybody?

I'll be out of town on Monday or Tuesday, but checking mail off and on, so apologies in advance if my replies are slow in coming.

Best,

-Dave

>> From ls...@pe... Thu Oct 10 12:56 EDT 2002
>> From: Lincoln Stein <ls...@cs...>
>> To: wa...@cs..., kc...@cs...
>> Subject: Action points from yesterday's conversation on the modular schema
>> Date: Thu, 10 Oct 2002 12:57:32 -0400
>> User-Agent: KMail/1.4.3
>> Cc: Chris Mungall <cj...@fr...>, David Emmert <em...@mo...>,
>>     Scott Cain <ca...@cs...>
>> MIME-Version: 1.0
>> Content-Transfer-Encoding: 8bit
>> X-MIME-Autoconverted: from quoted-printable to 8bit by morgan.harvard.edu id MAA10948
>>
>> Hi All,
>>
>> I thought our conversation yesterday about the modular schema was very
>> productive, and I look forward to David setting up a schedule for further
>> talks.
>> Just a summary of the action points that we ended on:
>>
>> Because ideally the modular schema should support the application modules that
>> we've already contributed to gmod, we're going to put together test sets for
>> David and Chris to work with.
>>
>> 1) Lincoln to provide sequence feature data from WormBase in GFF and .ace
>>    format
>> 2) Ken & Doreen to provide genetic map and correspondence data in the form of
>>    relational database table dumps
>> 3) Doreen to provide curated mutants/phenotypes/alleles in some form (to be
>>    determined)
>> 4) Scott to set up mailing list on gmod site to help coordinate this.
>>
>> The data sets will be submitted via e-mail to David. I will do this by
>> putting a data set on an FTP site and sending the URL to David.
>>
>> Lincoln