Re: [Gmod-schema] Re: Action points from yesterday's conversation on the modular schema


Have we decided on the directory organisation yet?

As I see it we have a collection of files categorised along the following
axes:

module (sequence, genetics, etc)

schema instance ("chado" is, I think, the name we have settled on). I am
still thinking in terms of gmod-schema being a collection of complementary
but possibly redundant schemas; perhaps others would prefer to think in
terms of one single gmod-schema?

file type, e.g. DDL as SQL statements, documentation (HTML, text,
images, other), XML Schema translation of the relational schema, adapter
code, etc

These could be arranged in a directory structure in the following ways:

(1) (SCHEMA-TYPE-MODULE)

gmod-schema/
	chado/
		sql/
			sequence/
			genetics/
			expression/

this way doesn't force chado's modular divisions on the rest of
gmod-schema

(2) (MODULE-SCHEMA-TYPE)

gmod-schema/
	sequence/
		chado/
			sql/
				sequence.sql
		bio-db-gff/
			bio-db-gff.sql
	genetics/
		chado/
			sql/
				genetics.sql

this imposes the divisions Dave and I decided upon on the rest of
gmod-schema, which may not be ideal; for instance, there is an argument
for splitting 'phenotype' out of what Dave and I call 'genetics'. On
the other hand, it is properly modularised

(3) (MODULE-TYPE)

gmod-schema/
	sequence/
		sql/
			sequence.sql
		docs/
			sequence.txt
			coordinate-system.txt

this is if we decide there is only one core gmod-schema (obviously still
allowing databases such as bio-db-gff in project specific repositories).
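
To make the three options concrete, here is a rough Python sketch that shows
where a single file would land under each layout. Nothing like this exists in
the repository; the function name layout_path and its arguments are made up
purely for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical helper, for illustration only: maps a (schema, file type,
# module) triple to a path under each of the three proposed layouts.
def layout_path(option, schema, filetype, module, filename):
    root = PurePosixPath("gmod-schema")
    if option == 1:   # (1) SCHEMA-TYPE-MODULE
        return root / schema / filetype / module / filename
    if option == 2:   # (2) MODULE-SCHEMA-TYPE
        return root / module / schema / filetype / filename
    if option == 3:   # (3) MODULE-TYPE: one core schema, so no schema level
        return root / module / filetype / filename
    raise ValueError(f"unknown layout option: {option}")

for option in (1, 2, 3):
    print(option, layout_path(option, "chado", "sql", "sequence", "sequence.sql"))
```

Option (3) is simply option (2) with the schema level dropped, which is the
trade-off in question: one directory level fewer, but no room for a second
schema later.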

I'm happy with any, just making sure we're all on the same page. If
pressed I'd vote for
SCHEMA/MODULE/TYPE or SCHEMA/TYPE/MODULE

but if nothing other than 'chado' is going to live here, we end up with
an extra, pointless directory level.

On 21 Oct 2002, Scott Cain wrote:

> Chris,
>
> While I was preparing to create a sourceforge GMOD cvs repository for
> the modular schema, I came across this page:
>
> http://sourceforge.net/docman/display_doc.php?docid=768&group_id=1
>
> About half way down the page is a section "Import of Existing CVS
> Repositories," which details how to get files that are under current CVS
> control into the sourceforge CVS control.  Since you already have these
> files under CVS control, do you want to move them directly to
> sourceforge (and thus maintain their revision history), or just import
> what you have and lose the history? If you make the tarball as directed, you
> can send it to me and I will deal with sourceforge support.
>
> Thanks,
> Scott
>
> On Fri, 2002-10-18 at 15:25, Chris Mungall wrote:
> >
> >
> > On 18 Oct 2002, Scott Cain wrote:
> >
> > > Dave,
> > >
> > > We should definitely get this stuff under cvs control.  I was thinking
> > > of a module named schema with doc, src and image subdirectories to hold
> > > the information that is in the three tar balls that are currently on the
> > > website.  If nobody has any objections, that's what I'll do.
> >
> > That would be great, thanks.
> >
> > (it is under cvs at the moment, but just in a flat scratch space, the
> > sooner we get it a real home the better)
> >
> > just to be pernickety -
> >
> > i expect code to live under src/ - i'd make it sql/
> >
> > shall we just stick the images under doc/
> >
> > should we have the doc directory mirror the modular structure of the sql
> > directory, or should we just have module-specific docs
> >
> > the way we have it now is
> >
> > /sql
> >     sequence/
> >              sequence.sql
> >              use-cases/
> >
> > and so on
> >
> > sorry to be a bore about these things but it's important to get a cvs dir
> > set up right as it's a pain to change the structure once it's underway!
> >
> > > About the phone meeting, I'll answer for Lincoln, so that you can get an
> > > answer before you go home today.  Lincoln is teaching a course this week
> > > and next, so his time during the day is rather limited.  I don't know
> > > about his plans for the week after, but perhaps that is when we should
> > > shoot for.
> > >
> > > Scott
> > >
> > >
> > > On Fri, 2002-10-18 at 11:34, David Emmert wrote:
> > > > Hi all,
> > > >
> > > > First of all, Scott, I had a look at the Modular Schema info you put on
> > > > http://www.gmod.org, and it looks great - many thanks.  I wonder if you
> > > > have any ideas as to how we can go about improving what's there and making
> > > > sure it is current.  Should we be thinking about putting the Modular
> > > > Schema into the sourceforge CVS now, and if so, how organized?
> > > >
> > > > Thanks also for setting up the schema mailing list.
> > > >
> > > > I wanted to let you all know that I've successfully loaded all of the
> > > > D.melanogaster "release 3" genome annotation GenBank records into the
> > > > schema, and the sequence module seems to have worked beautifully.  I
> > > > finished the loader just last night so I haven't completely evaluated
> > > > the results, but the annotations I've looked at look good.
> > > >
> > > > There's at least one gene model annotation which *didn't* load properly,
> > > > mod(mdg4), which is a nasty case of trans splicing whose "join" locations
> > > > my location parser definitely did not appreciate.  Here's what one of the
> > > > mod(mdg4) GB mRNA features looks like:
> > > >
> > > >      mRNA            join(138523..138735,138795..139263,
> > > >                      complement(154413..154524),complement(153944..154201),
> > > >                      complement(153727..153866),complement(152185..153037))
> > > >                      /product="CG32491-PZ"
> > > >                      /note="trans splicing"
> > > >
> > > > Parser go bung!
> > > >
> > > > I'm sure this case is workable in the schema, and I'll work on parsing
> > > > locations of this ilk as soon as I get a chance.
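
For mixed-strand locations like the one above, a minimal recursive sketch in
Python might look like the following. This is not Dave's loader: the function
names are made up here, and it only handles plain start..end ranges combined
with join() and complement(), not fuzzy ends, single bases or remote
accessions.

```python
import re

RANGE = re.compile(r"^(\d+)\.\.(\d+)$")

def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts

def parse_location(text, strand=1):
    """Return a list of (start, end, strand) tuples in transcript order."""
    text = "".join(text.split())  # drop spaces and line breaks
    parts = split_top_level(text)
    if len(parts) > 1:
        # A comma-separated list: parse each element on the current strand.
        return [seg for part in parts for seg in parse_location(part, strand)]
    if text.startswith("join(") and text.endswith(")"):
        return parse_location(text[len("join("):-1], strand)
    if text.startswith("complement(") and text.endswith(")"):
        # Flip the strand and reverse the order of whatever is inside.
        return parse_location(text[len("complement("):-1], -strand)[::-1]
    match = RANGE.match(text)
    if not match:
        raise ValueError(f"unsupported location: {text!r}")
    return [(int(match.group(1)), int(match.group(2)), strand)]

mdg4 = """join(138523..138735,138795..139263,
               complement(154413..154524),complement(153944..154201),
               complement(153727..153866),complement(152185..153037))"""
for segment in parse_location(mdg4):
    print(segment)
```

Because each complement() sits inside the join rather than around it, the
strand is tracked per segment, so the first two exons come out on the forward
strand and the remaining four on the reverse strand.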
> > > >
> > > > Lincoln, I focused on this instead of the WormBase data because in
> > > > the context of our local (FlyBase) development, and learning how to
> > > > layer-on the genetic/phenotypic data, we really needed to get a test-bed
> > > > to work with, and it looks like a proper port of Berkeley's gadfly data
> > > > is going to take some time coming.
> > > >
> > > > I'll take a look at the WormBase GFF and .ace data now.
> > > >
> > > > In the meantime, if any of you would like a postgres dump of this
> > > > data to play with, please let me know.   Please, everyone, be aware
> > > > that the current D.melanogaster "release 3" genome annotation data
> > > > in GenBank is imperfect, and these imperfections (only, I hope) are
> > > > obviously going to be in this test data.
> > > >
> > > > Once I've convinced myself I've implemented this properly, I want to
> > > > start writing some practical documents on implementing data in the
> > > > sequence module.   Scott, others, if you have any opinions on format
> > > > or content this should have, please let me know.
> > > >
> > > > If I get a chance, I'm going to try to get Gbrowse up and running on
> > > > this data, as I'm very anxious to know how the Modular Schema and
> > > > Gbrowse play together.   I have no idea how easy or difficult this
> > > > will be, being totally unfamiliar with Gbrowse;  if anybody wants to
> > > > give advice or lend a hand, please do!
> > > >
> > > > Finally, Lincoln mentioned we set up further conference calls, and
> > > > I'd like to suggest we shoot for next Wednesday, 22 Oct, 3pm EST -
> > > > same time as last time.  Would that work for everybody?
> > > >
> > > > I'll be out of town on Monday or Tuesday, but checking mail off and
> > > > on, so apologies in advance if my replies are slow in coming.
> > > >
> > > > Best,
> > > >
> > > > -Dave
> > > >
> > > >
> > > > >> From ls...@pe... Thu Oct 10 12:56 EDT 2002
> > > > >> From: Lincoln Stein <ls...@cs...>
> > > > >> To: wa...@cs..., kc...@cs...
> > > > >> Subject: Action points from yesterday's conversation on the modular schema
> > > > >> Date: Thu, 10 Oct 2002 12:57:32 -0400
> > > > >> User-Agent: KMail/1.4.3
> > > > >> Cc: Chris Mungall <cj...@fr...>, David Emmert <em...@mo...>,
> > > > >>         Scott Cain <ca...@cs...>
> > > > >> MIME-Version: 1.0
> > > > >> Content-Transfer-Encoding: 8bit
> > > > >> X-MIME-Autoconverted: from quoted-printable to 8bit by morgan.harvard.edu id MAA10948
> > > > >>
> > > > >> Hi All,
> > > > >>
> > > > >> I thought our conversation yesterday about the modular schema was very
> > > > >> productive, and I look forward to David setting up a schedule for further
> > > > >> talks.  Just a summary of the action points that we ended on:
> > > > >>
> > > > >> Because ideally the modular schema should support the application modules that
> > > > >> we've already contributed to gmod, we're going to put together test sets for
> > > > >> David and Chris to work with.
> > > > >>
> > > > >> 1) Lincoln to provide sequence feature data from WormBase in GFF and .ace
> > > > >> format
> > > > >> 2) Ken & Doreen to provide genetic map and correspondence data in the form of
> > > > >> relational database table dumps
> > > > >> 3) Doreen to provide curated mutants/phenotypes/alleles in some form (to be
> > > > >> determined)
> > > > >> 4) Scott to set up mailing list on gmod site to help coordinate this.
> > > > >>
> > > > >> The data sets will be submitted via e-mail to David.  I will do this by
> > > > >> putting a data set on an FTP site and sending the URL to David.
> > > > >>
> > > > >> Lincoln
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Gmod-schema mailing list
> > > > Gmo...@li...
> > > > https://lists.sourceforge.net/lists/listinfo/gmod-schema
> > > >
> > >
> >
> >
>