Re: [GUSDEV] GUS schema to support Genbank features

Ed,
Thanks for addressing this. In GUS, an operon is treated as equivalent to
a gene and is therefore captured as an entry in the Gene table. Multiple
RNA entries can be associated with a Gene entry, and these can reflect
the common transcript or processed transcripts, depending upon what
occurs. Multiple Protein entries can be associated with an RNA entry if
there is one common transcript that is translated into different
proteins.
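
A rough sketch of that model, in Python for illustration only. The class
and field names are made up for the example and are not the GUS object
layer or table definitions; the lac operon is used just as a familiar case
of one common transcript translated into several proteins.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Protein:
    name: str


@dataclass
class RNA:
    # A common transcript or a processed transcript of the gene/operon.
    name: str
    proteins: List[Protein] = field(default_factory=list)


@dataclass
class Gene:
    # An operon is treated as equivalent to a gene: one Gene entry.
    name: str
    rnas: List[RNA] = field(default_factory=list)

    def protein_names(self) -> List[str]:
        # Flatten the proteins of every RNA attached to this entry.
        return [p.name for rna in self.rnas for p in rna.proteins]


# One common transcript, translated into three different proteins.
lac = Gene(
    name="lac operon",
    rnas=[
        RNA(
            "lacZYA transcript",
            [Protein("LacZ"), Protein("LacY"), Protein("LacA")],
        )
    ],
)
print(lac.protein_names())
```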

When loading a GenBank file, the operon would be loaded as a feature,
as you describe, as is true for other GenBank features.
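
Whether an operon qualifier is actually loaded depends on the map: in the
gene fragment Jian quoted, it is marked ignore="true". A quick way to list
the ignored qualifiers in a map, sketched in Python. The <mapping> wrapper
element and the shortened qualifier list are assumptions for the example;
only the attributes that appear in the quoted fragment come from this
thread.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for a genbank2gus.xml-style map. The <mapping> root
# element is an assumption; the real file's wrapper may differ.
MAP_FRAGMENT = """
<mapping>
  <feature name="gene" table="DoTS::GeneFeature" so="gene">
    <qualifier name="locus_tag" column="source_id" />
    <qualifier name="old_locus_tag" ignore="true" />
    <qualifier name="operon" ignore="true" />
    <qualifier name="product" />
  </feature>
</mapping>
"""


def ignored_qualifiers(xml_text):
    """Return {feature name: [qualifier names marked ignore="true"]}."""
    root = ET.fromstring(xml_text)
    result = {}
    for feature in root.iter("feature"):
        names = [
            q.get("name")
            for q in feature.findall("qualifier")
            if q.get("ignore") == "true"
        ]
        if names:
            result[feature.get("name")] = names
    return result


print(ignored_qualifiers(MAP_FRAGMENT))
```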

Hope this helps,
Chris

On Mar 15, 2006, at 2:35 PM, Ed Robinson wrote:

> First, let me beg an analyst or biologist to correct me if any
> of this is wrong.
>
> Yes, the quick and dirty way would be to incorporate an operon
> attribute in your Gene feature and point it towards a
> convenient column using "column=".
>
> I also noticed that the genbank2gus.xml map does not have a
> table for an operon feature.  If you have operon features, in
> addition to operon attributes, you will need to find a
> suitable table to store these.  Modify the map as you need to.
>
> Unfortunately, GUS was designed primarily with Eukaryotic
> organisms in mind.  There is nothing in GUS that I know of (if
> I am wrong, some analyst please correct me) to support
> super-gene organizations. If you want to group your genes into
> their respective operons, the best way I can see doing that is
> to store each operon with its source_id in some table, and
> then store the source_id of each gene's operon in some
> attribute in your GeneFeature entry.  Then, you can simply
> select all of the GeneFeatures with the same source_id to
> recover all of the genes in one operon.  The operon itself, if
> you store it, will just be a big, long feature on your
> NASequence within which you will find many gene records.
> There will be no explicit ordering of genes in this operon via
> parent_ids.  You will need to use the gene locations to
> determine ordering within the span of an operon.
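
The grouping and ordering Ed describes could look roughly like this, as a
minimal Python sketch over already-retrieved rows. The dictionary keys
(operon_source_id, start) and the sample values are hypothetical and are
not GUS column names.

```python
from collections import defaultdict


def genes_by_operon(gene_features):
    """Group gene features by operon source_id, sorted by start position."""
    operons = defaultdict(list)
    for gene in gene_features:
        operons[gene["operon_source_id"]].append(gene)
    for genes in operons.values():
        # No parent_id ordering exists, so order by location on the sequence.
        genes.sort(key=lambda g: g["start"])
    return dict(operons)


# Hypothetical rows: each gene carries the source_id of its operon.
genes = [
    {"source_id": "geneC", "operon_source_id": "op1", "start": 3200},
    {"source_id": "geneA", "operon_source_id": "op1", "start": 100},
    {"source_id": "geneB", "operon_source_id": "op1", "start": 1500},
    {"source_id": "geneX", "operon_source_id": "op2", "start": 9000},
]
ordered = genes_by_operon(genes)
print([g["source_id"] for g in ordered["op1"]])
```

Sorting by ascending start gives the order along the sequence; for an
operon on the reverse strand, the transcription order would be the reverse
of that.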
>
> Unfortunately, there isn't any way to record the order that
> genes appear in an operon other than to retrieve their
> individual locations.  There is also no hierarchical
> organization of super-gene structures in GUS.
> InsertSequenceFeatures does support the hierarchy
>
>    gene->mRNA->Exon
>              ->CDS->TranslatedAASequence
>
> It does this via the BioPerl unflattener.  However, neither
> GUS nor the unflattener supports a hierarchy such as
>
>       ->promoter
> Operon->gene.....
>       ->gene.....
>
> Of course, GUS is expandable if someone were inclined to work
> on this.....
>
> Hope this is helpful.
>
> -ed
>
>
> ---- Original message ----
>> Date: Wed, 15 Mar 2006 14:10:27 -0500
>> From: Jian Lu <jl...@vb...>
>> Subject: Re: [GUSDEV] GUS schema to support Genbank features
>> To: Ed Robinson <ero...@ug...>
>> Cc: gus...@li...
>>
>> So if we want to incorporate a new qualifier "operon" into either
>> GeneFeature or Transcript, we have to add it to them. Will GUS consider
>> it for a future release?
>>
>> <feature name="gene" table="DoTS::GeneFeature" so="gene">
>>   <qualifier name="allele" />
>>   <qualifier name="citation" />
>>   <qualifier name="evidence" />
>>   <qualifier name="function" />
>>   <qualifier name="gene" handler="standard" method="gene" />
>>   <qualifier name="label" />
>>   <qualifier name="locus_tag" column="source_id" />
>>   <qualifier name="map" />
>>   <qualifier name="note" handler="standard" method="note" />
>>   <qualifier name="old_locus_tag" ignore="true" />
>>   <qualifier name="operon" ignore="true" />
>>   <qualifier name="product" />
>>   <qualifier name="pseudo" />
>>   <qualifier name="phenotype" />
>>   <qualifier name="standard_name" />
>>   <qualifier name="usedin" />
>>   <qualifier name="db_xref" handler="standard" method="dbXRef" />
>> </feature>
>>
>> Ed Robinson wrote:
>>> The genbank2gus.xml was designed to match all GB features to
>>> tables in GUS.  You shouldn't find any GB feature which is not
>>> in the file.  You may find some feature attributes which are
>>> not in the file.  Feel free to add any attributes you may need.
>>>
>>> The mapping of GenBank features to GUS tables is not one to
>>> one.  GUS was not written to mirror GenBank.
>>>
>>> The genbank2gus.xml file was written as an idealized mapping
>>> for some of the datasets we have seen while working on
>>> apicomplexan data sets.  As such, the file is a good
>>> suggestion on how to map GenBank features into GUS.  However,
>>> it is not the final word on how to handle such a mapping.
>>> Features such as Gene, Exon, and CDS are central to the
>>> working of InsertSequenceFeatures.  Thus, you should probably
>>> leave those features pointing at their respective tables.  If
>>> you want to change one of the other features to point at a
>>> different table, you should feel free to do so.  You should
>>> design a map that allows you to make the best use of the GUS
>>> tables for your own data.
>>>
>>> I hope this helps.  If you have any more specific questions
>>> about using the map for your data, please post them.  Data
>>> modeling discussions are always lively.
>>>
>>> -ed
>>>
>>>
>>>
>>> ---- Original message ----
>>>
>>>> Date: Wed, 15 Mar 2006 13:24:11 -0500
>>>> From: Jian Lu <jl...@vb...>
>>>> Subject: [GUSDEV] GUS schema to support Genbank features
>>>> To: gus...@li...
>>>>
>>>> Hi GUS,
>>>>
>>>> GUS 3.5 doesn't support all GenBank features, such as "operon". From
>>>> the mapping file "genbank2gus.xml", we could see "operon" has been
>>>> included in several feature tables.
>>>> Questions:
>>>> 1. Does "genbank2gus.xml" contain all GenBank features, or will it?
>>>> 2. How soon will the GUS schema support all GenBank features, or at
>>>>    least the features within genbank2gus.xml?
>>>> 3. How will GUS plug-ins support the added GenBank features?
>>>>
>>>> Thanks,
>>>>
>>>> Jian Lu
>>>> VBI
>>>>
>>>>
>>>> -------------------------------------------------------
>>>> This SF.Net email is sponsored by xPML, a groundbreaking scripting language
>>>> that extends applications into web and mobile media. Attend the live webcast
>>>> and join the prime developer group breaking into this new coding territory!
>>>> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
>>>> _______________________________________________
>>>> Gusdev-gusdev mailing list
>>>> Gus...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
>>>>
>>> -----------------
>>> Ed Robinson
>>> Center for Tropical and Emerging Global Diseases
>>> University of Georgia, Athens, GA 30602
>>> ero...@ug.../(706)542.1447/254.8883
>>>