From: Chris M. <cj...@fr...> - 2002-10-21 19:19:46
Have we decided on the directory organisation yet?

As I see it we have a collection of files categorised along the following axes:

module (sequence, genetics, etc)

schema instance ("chado" is I think the name we have finalised on). I am still thinking in terms of gmod-schema being a collection of complementary but possibly redundant schemas; perhaps others would prefer to think in terms of one single gmod-schema?

file-type; eg DDL as SQL statements, documentation (html, text, images, other), xml schema translation of the relational schema, adapter code, etc

These could be arranged in a directory structure in the following ways:

(1) (SCHEMA-TYPE-MODULE)

gmod-schema/
  chado/
    sql/
      sequence/
      genetics/
      expression/

this way doesn't force chado's modular divisions on the rest of gmod-schema

(2) (MODULE-SCHEMA-TYPE)

gmod-schema/
  sequence/
    chado/
      sql/
        sequence.sql
    bio-db-gff/
      bio-db-gff.sql
  genetics/
    chado/
      genetics.sql

this enforces the divisions Dave and I decided upon on the rest of gmod-schema; maybe not ideal, for instance, there is an argument for further subdividing what Dave and I call 'genetics' into 'phenotype'. on the other hand, it is properly modularised

(3) (MODULE-TYPE)

gmod-schema/
  sequence/
    sql/
      sequence.sql
    docs/
      sequence.txt
      coordinate-system.txt

this is if we decide there is only one core gmod-schema (obviously still allowing databases such as bio-db-gff in project specific repositories).

I'm happy with any, just making sure we're all on the same page. If pressed I'd vote for SCHEMA/MODULE/TYPE or SCHEMA/TYPE/MODULE but then if nothing else other than 'chado' is going to live here then we have an extra annoying pointless directory.
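
To make the trade-off between the three layouts concrete, here is a minimal Python sketch that builds the path of one file under each ordering of the axes. It is only an illustration: the `LAYOUTS` table and the `layout_path` helper are hypothetical names, not part of any GMOD tooling, and the example file is taken from the trees above.

```python
from pathlib import PurePosixPath

# Axis order for each candidate layout described above.
# (Hypothetical table, for illustration only.)
LAYOUTS = {
    "SCHEMA-TYPE-MODULE": ("schema", "file_type", "module"),
    "MODULE-SCHEMA-TYPE": ("module", "schema", "file_type"),
    "MODULE-TYPE": ("module", "file_type"),
}


def layout_path(layout, schema, module, file_type, filename, root="gmod-schema"):
    """Return the repository path of a file under the named layout."""
    axes = {"schema": schema, "module": module, "file_type": file_type}
    parts = [axes[axis] for axis in LAYOUTS[layout]]
    return str(PurePosixPath(root, *parts, filename))


if __name__ == "__main__":
    for name in LAYOUTS:
        print(name, "->", layout_path(name, "chado", "sequence", "sql", "sequence.sql"))
```

The MODULE-TYPE layout simply drops the schema axis, which is why it only works if there is a single core gmod-schema.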

On 21 Oct 2002, Scott Cain wrote:

> Chris,
>
> While I was preparing to create a sourceforge GMOD cvs repository for
> the modular schema, I came across this page:
>
> http://sourceforge.net/docman/display_doc.php?docid=768&group_id=1
>
> About half way down the page is a section "Import of Existing CVS
> Repositories," which details how to get files that are under current CVS
> control into the sourceforge CVS control. Since you already have these
> files under CVS control, do you want to move them directly to
> sourceforge (and thus maintain their revision history), or just import
> what you have and lose the history? If you make the tarball as directed, you
> can send it to me and I will deal with sourceforge support.
>
> Thanks,
> Scott
>
> On Fri, 2002-10-18 at 15:25, Chris Mungall wrote:
> >
> >
> > On 18 Oct 2002, Scott Cain wrote:
> >
> > > Dave,
> > >
> > > We should definitely get this stuff under cvs control. I was thinking
> > > of a module named schema with doc, src and image subdirectories to hold
> > > the information that is in the three tar balls that are currently on the
> > > website. If nobody has any objections, that's what I'll do.
> >
> > That would be great, thanks.
> >
> > (it is under cvs at the moment, but just in a flat scratch space, the
> > sooner we get it a real home the better)
> >
> > just to be pernickety -
> >
> > i expect code to live under src/ - i'd make it sql/
> >
> > shall we just stick the images under doc/
> >
> > should we have the doc directory mirror the modular structure of the sql
> > directory, or should we just have module-specific docs
> >
> > the way we have it now is
> >
> > /sql
> >   sequence/
> >     sequence.sql
> >     use-cases/
> >
> > and so on
> >
> > sorry to be a bore about these things but it's important to get a cvs dir
> > set up right as it's a pain to change the structure once it's underway!
> >
> > > About the phone meeting, I'll answer for Lincoln, so that you can get an
> > > answer before you go home today. Lincoln is teaching a course this week
> > > and next, so his time during the day is rather limited. I don't know
> > > about his plans for the week after, but perhaps that is when we should
> > > shoot for.
> > >
> > > Scott
> > >
> > >
> > > On Fri, 2002-10-18 at 11:34, David Emmert wrote:
> > > > Hi all,
> > > >
> > > > First of all, Scott, I had a look at the Modular Schema info you put on
> > > > http://www.gmod.org, and it looks great - many thanks. I wonder if you
> > > > have any ideas as to how we can go about improving what's there and making
> > > > sure it is current. Should we be thinking about putting the Modular
> > > > Schema into the sourceforge CVS now, and if so, how organized?
> > > >
> > > > Thanks also for setting up the schema mailing list.
> > > >
> > > > I wanted to let you all know that I've successfully loaded all of the
> > > > D.melanogaster "release 3" genome annotation GenBank records into the
> > > > schema, and the sequence module seems to have worked beautifully. I
> > > > finished the loader just last night so I haven't completely evaluated
> > > > the results, but the annotations I've looked at look good.
> > > >
> > > > There's at least one gene model annotation which *didn't* load properly,
> > > > mod(mdg4), which is a nasty case of trans splicing whose "join" locations
> > > > my location parser definitely did not appreciate. Here's what one of the
> > > > mod(mdg4) GB mRNA features looks like:
> > > >
> > > >      mRNA            join(138523..138735,138795..139263,
> > > >                      complement(154413..154524),complement(153944..154201),
> > > >                      complement(153727..153866),complement(152185..153037))
> > > >                      /product="CG32491-PZ"
> > > >                      /note="trans splicing"
> > > >
> > > > Parser go bung!
> > > >
> > > > I'm sure this case is workable in the schema, and I'll work on parsing
> > > > locations of this ilk as soon as I get a chance.
> > > >
> > > > Lincoln, I focused on this instead of the WormBase data because in
> > > > the context of our local (FlyBase) development, and learning how to
> > > > layer-on the genetic/phenotypic data, we really needed to get a test-bed
> > > > to work with, and it looks like a proper port of Berkeley's gadfly data
> > > > is going to take some time coming.
> > > >
> > > > I'll take a look at the WormBase GFF and .ace data now.
> > > >
> > > > In the meantime, if any of you would like a postgres dump of this
> > > > data to play with, please let me know. Please, everyone, be aware
> > > > that the current D.melanogaster "release 3" genome annotation data
> > > > in GenBank is imperfect, and these imperfections (only, I hope) are
> > > > obviously going to be in this test data.
> > > >
> > > > Once I've convinced myself I've implemented this properly, I want to
> > > > start writing some practical documents on implementing data in the
> > > > sequence module. Scott, others, if you have any opinions on format
> > > > or content this should have, please let me know.
> > > >
> > > > If I get a chance, I'm going to try to get Gbrowse up and running on
> > > > this data, as I'm very anxious to know how the Modular Schema and
> > > > Gbrowse play together. I have no idea how easy or difficult this
> > > > will be, being totally unfamiliar with Gbrowse; if anybody wants to
> > > > give advice or lend a hand, please do!
> > > >
> > > > Finally, Lincoln mentioned we set up further conference calls, and
> > > > I'd like to suggest we shoot for next Wednesday, 22 Oct, 3pm EST -
> > > > same time as last time. Would that work for everybody?
> > > >
> > > > I'll be out of town on Monday or Tuesday, but checking mail off and
> > > > on, so apologies in advance if my replies are slow in coming.
> > > >
> > > > Best,
> > > >
> > > > -Dave
> > > >
> > > >
> > > > >> From ls...@pe... Thu Oct 10 12:56 EDT 2002
> > > > >> From: Lincoln Stein <ls...@cs...>
> > > > >> To: wa...@cs..., kc...@cs...
> > > > >> Subject: Action points from yesterday's conversation on the modular schema
> > > > >> Date: Thu, 10 Oct 2002 12:57:32 -0400
> > > > >> User-Agent: KMail/1.4.3
> > > > >> Cc: Chris Mungall <cj...@fr...>, David Emmert <em...@mo...>,
> > > > >> Scott Cain <ca...@cs...>
> > > > >> MIME-Version: 1.0
> > > > >> Content-Transfer-Encoding: 8bit
> > > > >> X-MIME-Autoconverted: from quoted-printable to 8bit by morgan.harvard.edu id MAA10948
> > > > >>
> > > > >> Hi All,
> > > > >>
> > > > >> I thought our conversation yesterday about the modular schema was very
> > > > >> productive, and I look forward to David setting up a schedule for further
> > > > >> talks. Just a summary of the action points that we ended on:
> > > > >>
> > > > >> Because ideally the modular schema should support the application modules that
> > > > >> we've already contributed to gmod, we're going to put together test sets for
> > > > >> David and Chris to work with.
> > > > >>
> > > > >> 1) Lincoln to provide sequence feature data from WormBase in GFF and .ace
> > > > >> format
> > > > >> 2) Ken & Doreen to provide genetic map and correspondence data in the form of
> > > > >> relational database table dumps
> > > > >> 3) Doreen to provide curated mutants/phenotypes/alleles in some form (to be
> > > > >> determined)
> > > > >> 4) Scott to set up mailing list on gmod site to help coordinate this.
> > > > >>
> > > > >> The data sets will be submitted via e-mail to David. I will do this by
> > > > >> putting a data set on an FTP site and sending the URL to David.
> > > > >>
> > > > >> Lincoln
> > > >
> > > > _______________________________________________
> > > > Gmod-schema mailing list
> > > > Gmo...@li...
> > > > https://lists.sourceforge.net/lists/listinfo/gmod-schema
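
The mod(mdg4) mRNA feature quoted in Dave's message mixes forward-strand ranges with `complement(...)` ranges inside a single `join(...)`. As a rough illustration of how such a location string can be parsed, here is a minimal Python sketch. It is not Dave's loader: the function names are made up for this example, and it assumes only plain `start..end` ranges combined with `join`, `order` and `complement` (no fuzzy ends such as `<` or `>`, no single-base positions, no remote accessions).

```python
import re

_RANGE = re.compile(r"^(\d+)\.\.(\d+)$")


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts


def parse_location(loc):
    """Return a list of (start, end, strand) tuples in the order given.

    strand is 1 for the forward strand and -1 for complement().
    """
    loc = "".join(loc.split())  # remove whitespace and line wraps
    parts = split_top_level(loc)
    if len(parts) > 1:
        return [seg for part in parts for seg in parse_location(part)]
    match = _RANGE.match(loc)
    if match:
        return [(int(match.group(1)), int(match.group(2)), 1)]
    for op in ("join", "order", "complement"):
        prefix = op + "("
        if loc.startswith(prefix) and loc.endswith(")"):
            inner = parse_location(loc[len(prefix):-1])
            if op == "complement":
                # Complementing a compound location reverses the segment
                # order and flips the strand of every segment.
                return [(s, e, -strand) for s, e, strand in reversed(inner)]
            return inner
    raise ValueError("unsupported location: %r" % loc)


if __name__ == "__main__":
    mdg4 = """join(138523..138735,138795..139263,
        complement(154413..154524),complement(153944..154201),
        complement(153727..153866),complement(152185..153037))"""
    for segment in parse_location(mdg4):
        print(segment)
```

For the mod(mdg4) string this yields two forward-strand segments followed by four reverse-strand segments, in the order they appear in the feature. How a mixed-strand list like that should be stored in the sequence module is a separate question that the sketch does not address.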