Re: [Fwd: [Fwd: Re: [Gusdev-gusdev] GUSdev schema]]


Jonathan

First, I agree on a 2-week timescale.
We're going to write a generic parser using bioperl and use it to 
populate an empty GUSdev instance.
The first stage will be to generate bioperl objects from any format 
that bioperl recognises (e.g. EMBL, GenBank), then GUS objects from those.

For the pombe data we have already written a GUS script that populates 
the database from XML files, so making it generic should just be a 
matter of changing the parsing stage.
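
A minimal sketch of that two-stage split, written in Python purely for illustration (the real loader will be Perl on top of bioperl and the GUS object layer). Everything here is an assumption made for the example: the toy "SEQ"/"FEAT" input format, the helper names, the attribute names and the mapping from feature kinds to classes are all hypothetical. Only the GUS object names are taken from the list below.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ParsedFeature:
    """Format-independent feature (stand-in for a bioperl feature object)."""
    kind: str
    start: int
    end: int


@dataclass
class ParsedSequence:
    """Format-independent sequence (stand-in for a bioperl sequence object)."""
    accession: str
    residues: str
    features: List[ParsedFeature] = field(default_factory=list)


def parse_toy_format(text: str) -> List[ParsedSequence]:
    """Stage 1: the only format-specific code. Reads a made-up line format:
    'SEQ <id> <residues>' followed by 'FEAT <kind> <start> <end>' lines."""
    sequences: List[ParsedSequence] = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "SEQ":
            sequences.append(ParsedSequence(accession=parts[1], residues=parts[2]))
        elif parts[0] == "FEAT" and sequences:
            sequences[-1].features.append(
                ParsedFeature(parts[1], int(parts[2]), int(parts[3]))
            )
    return sequences


# Hypothetical mapping from parsed feature kinds to GUS object names.
GUS_FEATURE_CLASS: Dict[str, str] = {
    "gene": "GeneFeature",
    "exon": "ExonFeature",
    "rna": "RNAFeature",
}


def to_gus_objects(seq: ParsedSequence) -> List[Tuple[str, dict]]:
    """Stage 2: format-independent. Turns one parsed sequence into
    (GUS object name, attributes) pairs; unmapped feature kinds are skipped."""
    objects = [("ExternalNASequence",
                {"source_id": seq.accession, "length": len(seq.residues)})]
    for feat in seq.features:
        gus_class = GUS_FEATURE_CLASS.get(feat.kind)
        if gus_class is not None:
            objects.append((gus_class, {"start": feat.start, "end": feat.end}))
    return objects


# Supporting another input format means registering another stage-1 parser here.
PARSERS: Dict[str, Callable[[str], List[ParsedSequence]]] = {
    "toy": parse_toy_format,
}


def load(text: str, fmt: str) -> List[Tuple[str, dict]]:
    """Run stage 1 for the given format, then stage 2 on every sequence."""
    records: List[Tuple[str, dict]] = []
    for seq in PARSERS[fmt](text):
        records.extend(to_gus_objects(seq))
    return records
```

The point of the split is that stage 2 never sees the input format, so switching from the pombe XML to EMBL or GenBank would only touch the stage-1 entry in the parser table.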

The objects we're planning to generate are:

    * Sequence:
        => ExternalNASequence objects

    * Features:
        => GeneFeature objects,
        => ExonFeature,
        => RNAFeature,
        => TranslatedAAFeature,
        => TranslatedAASequence,
        => SignalPeptideFeature,

        => Re. transmembrane and Pfam domains: should we generate
           PredictedAAFeature objects ?

    * Central Dogma:
        => Gene objects,
        => RNA,
        => Protein,

    * GO associations:
        => ProteinGOProcess, ProteinGOFunction, ProteinGOComponent objects.

Do you have any comments on these objects ?
What does the website expect to be populated re. Pfam and transmembrane 
domains ?

cheers
Arnaud

Jonathan Crabtree wrote:

>On Sun, 4 Aug 2002, Paul Mooney wrote:
>
>>hi all,
>>
>>I have finally got back on to a computer - no more cold turkey...
>>
>>Arnaud has told me the java layer makes some calls to perl which try to query
>>tables that do not exist in our schema because it is a couple of months older.
>>Rather than try to patch the schema and to avoid any other problems like this
>>it seems a good idea to get a point-in-time snapshot of GUSdev.
>>
>
>
>>1st we need the schema - we are developing the EMBL parser with what we have
>>running now. The java layer will be needed once we are happy we have loaded a
>>good range of data so we can run the query servlet stuff. We will want to
>>modify this (hack :) so we can point to GeneDB gene pages in the prototype.
>>
>>Arnaud might be able to give time scales on when the data loading is in a good
>>enough state for a web interface but I imagine we will need the web stuff
>>within 2 weeks. Is this do-able?
>>
>>Does this help at all?
>>
>
>Paul-
>
>Yes, thanks.  I think 2 weeks is doable.  So basically the plan is to
>start with an empty snapshot of the latest GUSdev schema and use the EMBL
>parser to populate it?  It might also help if I could have a quick look at
>the EMBL parser, or at least just find out which tables it's loading.
>Anyway, I should have an updated set of schema creation scripts ready in
>the next day or two.
>
>Jonathan
>
>