From: Chris S. <sto...@pc...> - 2006-03-16 22:50:20
Ed,

Thanks for addressing this. The GUS semantics for capturing an operon
are to treat it as equivalent to a gene, and thus as an entry in the
Gene table. Multiple RNA entries can be associated with a Gene entry,
and these can reflect the common transcript or processed transcripts,
depending upon what occurs. Multiple Protein entries can be
associated with an RNA entry if there is one common transcript that is
translated into different proteins.

In terms of loading a GenBank file, the operon would be loaded as a
feature, as you describe, as is true for other GenBank features.

Hope this helps,
Chris

On Mar 15, 2006, at 2:35 PM, Ed Robinson wrote:

> First, let me beg an analyst or biologist to correct me if any
> of this is wrong.
>
> Yes, the quick and dirty way would be to incorporate an operon
> attribute in your Gene feature and point it towards a
> convenient column using "column=".
>
> I also noticed that the genbank2gus.xml map does not have a
> table for an operon feature. If you have operon features, in
> addition to operon attributes, you will need to find a
> suitable table to store these. Modify the map as you need to.
>
> Unfortunately, GUS was designed primarily with eukaryotic
> organisms in mind. There is nothing in GUS that I know of (if
> I am wrong, some analyst please correct me) to support
> super-gene organizations. If you want to group your genes into
> their respective operons, the best way I can see to do that is
> to store each operon with its source_id in some table, and
> then store the source_id for each gene's operon in some
> attribute in your GeneFeature entry. Then, you can simply
> select all of the GeneFeatures with the same source_id to
> recover all of the genes in one operon. The operon itself, if
> you store it, will just be a big, long feature on your
> NASequence within which you will find many gene records.
> There will be no explicit ordering of genes in this operon via
> parent_ids.
> You will need to use the gene locations to
> determine ordering within the span of an operon.
>
> Unfortunately, there isn't any way to record the order that
> genes appear in an operon other than to retrieve their
> individual locations. There is also no hierarchical
> organization of super-gene structures in GUS.
> InsertSequenceFeatures does support the hierarchy
>
> gene->mRNA->Exon
>           ->CDS->TranslatedAASequence
>
> It does this via the BioPerl unflattener. However, neither
> GUS nor the unflattener supports a hierarchy such as
>
>       ->promoter
> Operon->gene.....
>       ->gene.....
>
> Of course, GUS is expandable if someone were inclined to work
> on this.....
>
> Hope this is helpful.
>
> -ed
>
>
> ---- Original message ----
>> Date: Wed, 15 Mar 2006 14:10:27 -0500
>> From: Jian Lu <jl...@vb...>
>> Subject: Re: [GUSDEV] GUS schema to support Genbank features
>> To: Ed Robinson <ero...@ug...>
>> Cc: gus...@li...
>>
>> So if we want to incorporate a new qualifier "operon" into either
>> GeneFeature or Transcript, we have to add it to them. Will GUS
>> consider it for a future release?
>>
>> <feature name="gene" table="DoTS::GeneFeature" so="gene">
>>   <qualifier name="allele" />
>>   <qualifier name="citation" />
>>   <qualifier name="evidence" />
>>   <qualifier name="function" />
>>   <qualifier name="gene" handler="standard" method="gene" />
>>   <qualifier name="label" />
>>   <qualifier name="locus_tag" column="source_id" />
>>   <qualifier name="map" />
>>   <qualifier name="note" handler="standard" method="note" />
>>   <qualifier name="old_locus_tag" ignore="true" />
>>   <qualifier name="operon" ignore="true" />
>>   <qualifier name="product" />
>>   <qualifier name="pseudo" />
>>   <qualifier name="phenotype" />
>>   <qualifier name="standard_name" />
>>   <qualifier name="usedin" />
>>   <qualifier name="db_xref" handler="standard" method="dbXRef" />
>> </feature>
>>
>> Ed Robinson wrote:
>>> The genbank2gus.xml was designed to match all GB features to
>>> tables in GUS. You shouldn't find any GB feature which is not
>>> in the file. You may find some feature attributes which are
>>> not in the file. Feel free to add any attributes you may need.
>>>
>>> The mapping of GenBank features to GUS tables is not one to
>>> one. GUS was not written to mirror GenBank.
>>>
>>> The genbank2gus.xml file was written as an idealized mapping
>>> for some of the datasets we have seen while working on
>>> Apicomplexan data sets. As such, the file is a good
>>> suggestion on how to map GenBank features into GUS. However,
>>> it is not the final word on how to handle such a mapping.
>>> Features such as Gene, Exon, and CDS are central to the
>>> working of InsertSequenceFeatures. Thus, you should probably
>>> leave those features pointing at their respective tables. If
>>> you want to change one of the other features to point at a
>>> different table, you should feel free to do so. You should
>>> design a map that allows you to make the best use of the GUS
>>> tables for your own data.
>>>
>>> I hope this helps.
>>> If you have any more specific questions
>>> about using the map for your data, please post them. Data
>>> modeling discussions are always lively.
>>>
>>> -ed
>>>
>>>
>>> ---- Original message ----
>>>> Date: Wed, 15 Mar 2006 13:24:11 -0500
>>>> From: Jian Lu <jl...@vb...>
>>>> Subject: [GUSDEV] GUS schema to support Genbank features
>>>> To: gus...@li...
>>>>
>>>> Hi GUS,
>>>>
>>>> GUS 3.5 doesn't support all GenBank features, such as "operon". From
>>>> the mapping file "genbank2gus.xml", we can see that "operon" has been
>>>> included in several feature tables.
>>>> Questions:
>>>> 1. Does "genbank2gus.xml" contain all GenBank features, or will it?
>>>> 2. How soon will the GUS schema support all GenBank features, or at
>>>>    least the features within genbank2gus.xml?
>>>> 3. How will GUS plug-ins support the added GenBank features?
>>>>
>>>> Thanks,
>>>>
>>>> Jian Lu
>>>> VBI
>>>>
>>>>
>>>> -------------------------------------------------------
>>>> This SF.Net email is sponsored by xPML, a groundbreaking scripting language
>>>> that extends applications into web and mobile media. Attend the live webcast
>>>> and join the prime developer group breaking into this new coding territory!
>>>> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
>>>> _______________________________________________
>>>> Gusdev-gusdev mailing list
>>>> Gus...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
>>>>
>>> -----------------
>>> Ed Robinson
>>> Center for Tropical and Emerging Global Diseases
>>> University of Georgia, Athens, GA 30602
>>> ero...@ug.../(706)542.1447/254.8883
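
Ed's suggestion in the quoted message (tag each GeneFeature with the
source_id of its operon, then use the gene locations to recover the order)
can be sketched in a few lines of Python. This is only an illustration
under stated assumptions: the GeneRecord class and its fields (operon_id,
start, end, is_reversed) are hypothetical stand-ins and not GUS table or
column names, the sample identifiers are made up, and all genes of one
operon are assumed to lie on the same strand.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class GeneRecord:
    """Hypothetical stand-in for one stored gene feature (not a GUS class)."""
    source_id: str              # gene identifier, e.g. the locus_tag
    operon_id: Optional[str]    # source_id of the operon, None if not in one
    start: int                  # leftmost coordinate on the sequence
    end: int                    # rightmost coordinate on the sequence
    is_reversed: bool = False   # True if the gene is on the reverse strand


def genes_by_operon(genes: Iterable[GeneRecord]) -> Dict[str, List[str]]:
    """Group genes by operon id and order each group by location."""
    grouped: Dict[str, List[GeneRecord]] = defaultdict(list)
    for gene in genes:
        if gene.operon_id is not None:
            grouped[gene.operon_id].append(gene)

    ordered: Dict[str, List[str]] = {}
    for operon_id, members in grouped.items():
        # Assumption: every gene in an operon shares the first member's strand.
        reverse = members[0].is_reversed
        if reverse:
            # Reverse strand: transcription runs from high to low coordinates.
            members = sorted(members, key=lambda g: g.end, reverse=True)
        else:
            members = sorted(members, key=lambda g: g.start)
        ordered[operon_id] = [g.source_id for g in members]
    return ordered


# Made-up sample data, for illustration only.
SAMPLE_GENES = [
    GeneRecord("geneC", "operon1", 2100, 2900),
    GeneRecord("geneA", "operon1", 100, 900),
    GeneRecord("geneB", "operon1", 1000, 2000),
    GeneRecord("geneY", "operon2", 5000, 5600, True),
    GeneRecord("geneZ", "operon2", 4000, 4900, True),
    GeneRecord("geneQ", None, 7000, 7500),
]

if __name__ == "__main__":
    print(genes_by_operon(SAMPLE_GENES))
```

Because the order is derived from locations each time, nothing about gene
order has to be stored, which matches the point above that there are no
parent_ids linking genes to an operon.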