From: <ju...@cs...> - 2005-01-25 20:15:31
|
I recently began working with the DoTS.ExternalAASequence table for
the first time, and I noticed several discrepancies between this
table and the DoTS.ExternalNASequence and DoTS.NASequence tables,
one of which I feel is serious.

                     name           taxon_id    description     external_database_release_id
                     varchar2(255)  number(12)  varchar2(2000)  number(10)

ExternalNASequence   yes            yes         yes             yes
NASequence           no             yes         yes             no
ExternalAASequence   yes            no          yes             yes
AASequence           no             no          yes             yes

Looking at the above chart, we see that both AASequence
and ExternalAASequence lack the taxon_id. This surely must
be an oversight; I cannot imagine why such a thing
could be intentional.

A minor annoyance is that NASequence and AASequence lack a
"name varchar2(255)" column and therefore must have all text
regarding a sequence in the description column. For query
and display purposes, it is very useful to be
able to refer to a sequence by a shorter handle.

And a final odd observation: why does AASequence have
an external_database_release_id column? If an AA sequence belongs
to an external database, then it should be in the ExternalAASequence
table.

First, I wonder how the lack of a taxon_id column in the
ExternalAASequence/AASequence tables has existed for so long.
Have these tables not been used much by the GUS community?

Second, I wonder if the maintainers of GUS are interested
in addressing this issue and possibly resolving these
discrepancies in a future GUS release.

I am using Version 3.0 of the schema.

Thanks, Josef

Josef Jurek, Ph.D.
Daphne Preuss Laboratory
Molecular Genetics and Cell Biology
The University of Chicago
ju...@cs...

voice: (773) 834-3985
fax: (773) 702-6648

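The chart in the message above can also be checked mechanically. The following is only a minimal sketch: the column sets are transcribed by hand from the chart (GUS 3.0, as reported by the poster), not read from a live database, and the helper names `missing_columns` and `views_lacking` are made up for illustration.

```python
# Column presence per view, transcribed from the chart in the message above.
# This is hand-entered data, not the result of querying a GUS instance.
COLUMNS = ("name", "taxon_id", "description", "external_database_release_id")

PRESENT = {
    "ExternalNASequence": {"name", "taxon_id", "description",
                           "external_database_release_id"},
    "NASequence": {"taxon_id", "description"},
    "ExternalAASequence": {"name", "description",
                           "external_database_release_id"},
    "AASequence": {"description", "external_database_release_id"},
}


def missing_columns(present, columns=COLUMNS):
    """Map each view to the columns it lacks, listed in chart order."""
    return {view: [c for c in columns if c not in cols]
            for view, cols in present.items()}


def views_lacking(present, column):
    """Return the sorted names of the views that do not expose `column`."""
    return sorted(view for view, cols in present.items() if column not in cols)


if __name__ == "__main__":
    for view, gaps in missing_columns(PRESENT).items():
        print(f"{view:20s} missing: {', '.join(gaps) or '(none)'}")
```

Against a real installation, the same comparison would start from the database's own column catalogue rather than from a hand-typed dictionary.
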
From: Chris S. <sto...@pc...> - 2005-01-25 22:41:31
|
Hi Josef,

Your comments are very timely in light of our plans to release GUS 3.5
shortly which is meant to address issues such as you raise. Mike sent
mail earlier today on this but please see
http://gusdb.org/wiki/index.php/Gus3.5RoadMap

I think that you are right about the need for taxon_id. Note that it
should probably go in AASequenceImp (the table) and be inherited by all
views. Anyone have a problem with that for GUS 3.5?

Note that both AASequence and ExternalAASequence are views with
AASequence as the "superclass" view. external_database_release_id is a
named attribute of the parent table and so is inherited by all views.

Cheers,
Chris

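The table-plus-views arrangement described in the reply above can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: SQLite stands in for Oracle, the table is reduced to a handful of columns, the `subclass_view` discriminator column and the view definitions are guesses at the general pattern rather than the actual GUS DDL, and `create_views`, `view_columns` and `demo` are hypothetical helpers.

```python
import sqlite3

# Columns shared by both views in this reduced, hypothetical schema.
BASE_COLS = "aa_sequence_id, description, external_database_release_id"


def create_views(conn, extra_cols=""):
    """(Re)create the two views; `extra_cols` is appended to each select list."""
    conn.execute("DROP VIEW IF EXISTS AASequence")
    conn.execute("DROP VIEW IF EXISTS ExternalAASequence")
    # "Superclass" view: assumed to expose every row of the table.
    conn.execute(
        f"CREATE VIEW AASequence AS "
        f"SELECT {BASE_COLS}{extra_cols} FROM AASequenceImp"
    )
    # Subclass view: only its own rows, plus the subclass-specific name column.
    conn.execute(
        f"CREATE VIEW ExternalAASequence AS "
        f"SELECT {BASE_COLS}, name{extra_cols} FROM AASequenceImp "
        f"WHERE subclass_view = 'ExternalAASequence'"
    )


def view_columns(conn, view):
    """Return the column names that a view exposes."""
    cur = conn.execute(f"SELECT * FROM {view} LIMIT 0")
    return [col[0] for col in cur.description]


def demo():
    """Add taxon_id to the table, rebuild the views, report columns before/after."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE AASequenceImp ("
        " aa_sequence_id INTEGER PRIMARY KEY,"
        " subclass_view TEXT NOT NULL,"  # assumed discriminator column
        " description TEXT,"
        " external_database_release_id INTEGER,"
        " name TEXT)"
    )
    create_views(conn)
    views = ("AASequence", "ExternalAASequence")
    before = {v: view_columns(conn, v) for v in views}

    # The change under discussion: add the column once, on the table ...
    conn.execute("ALTER TABLE AASequenceImp ADD COLUMN taxon_id INTEGER")
    # ... then rebuild the views so that each of them exposes it.
    create_views(conn, extra_cols=", taxon_id")
    after = {v: view_columns(conn, v) for v in views}
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = demo()
    for view in before:
        print(view, before[view], "->", after[view])
```

Because the views in this sketch list their columns explicitly, the new table column becomes visible only once the views are recreated; whether GUS itself needs that extra step depends on how its views are actually defined.
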
From: <ju...@cs...> - 2005-01-25 23:15:17
|
Chris Stoeckert <sto...@pc...> writes:
>
> Hi Josef,
> Your comments are very timely in light of our plans to release GUS 3.5
> shortly which is meant to address issues such as you raise. Mike sent
> mail earlier today on this but please see
> http://gusdb.org/wiki/index.php/Gus3.5RoadMap

Great;

Here is another inconsistency, which I have brought up
on the list before. Both GeneFeature and NAFeature
have a "name varchar2(30)" field though it
is useful to store the entire text that a GenBank file
may have for a feature. GeneFeature has "gene varchar2(2000)"
but NAFeature lacks a varchar2(2000) field. Perhaps
something such as "description varchar2(2000)" could be
added to NAFeature? The NAFeature view is really hurting
for a field to store a long string of text.

> I think that you are right about the need for taxon_id. Note that it
> should probably go in AASequenceImp (the table) and be inherited by all
> views. Anyone have a problem with that for GUS 3.5?

I agree that that is the right way to do it. Sometimes
I mistakenly use the word table, when I mean view.

Thanks, Josef

From: Steve F. <sfi...@pc...> - 2005-01-26 17:16:13
|
your request for a description field seems reasonable. i'll look into
it for 3.5.

steve

From: Steve F. <sfi...@pc...> - 2005-01-26 17:14:21
|
Josef-

Thanks for bringing up all these points.

An answer to how AASequence has gone so long without taxon id is that
historically, we have used proteins mostly in relationship to or
translated from NASequences. But, clearly, they should have a taxon_id.

Another place to look for schema issues is:
http://gusdb.org/wiki/index.php/GusFourPointOhIdeas

That page describes proposals being generated by a work-in-progress
internal committee that is reviewing the schema for gus 4.0. You'll
notice there that for 4.0 the proposal is to lose the distinction
between internal and external sequences, making all sequences
potentially external.

steve

From: <ju...@cs...> - 2005-01-26 21:07:58
|
As I continue to work with the ExternalAASequence view,
I find another discrepancy with other Sequence views/tables.

Compare the size of the description field between NASequenceImp and
AASequenceImp.

DoTS.NASequenceImp   DESCRIPTION   VARCHAR2(2000)
DoTS.AASequenceImp   DESCRIPTION   VARCHAR2(255)

255 characters is often too small to include the entire name
of a sequence from a FASTA file. Enlarging the
DoTS.AASequenceImp description field to 2000 and
passing this field to the several views on AASequenceImp
would be greatly appreciated as well.

Thanks, Josef

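To see how often the 255 limit mentioned above would actually bite for a given input file, a check along the following lines could be run before loading. This is a rough sketch, not part of GUS: `long_deflines` is a hypothetical helper, and it assumes the limit is counted in bytes of the UTF-8 encoded text (for plain ASCII deflines this equals the character count).

```python
import io


def long_deflines(handle, limit=255):
    """Yield (line_number, size) for FASTA deflines longer than `limit`.

    The size is the number of bytes in the UTF-8 encoding of the defline,
    excluding the leading '>' and the line terminator. Treating the column
    limit as a byte count is an assumption of this sketch.
    """
    for lineno, line in enumerate(handle, start=1):
        if line.startswith(">"):
            text = line[1:].rstrip("\r\n")
            size = len(text.encode("utf-8"))
            if size > limit:
                yield lineno, size


if __name__ == "__main__":
    # Tiny made-up input: one short defline and one 300-character defline.
    sample = ">short\nMKV\n>" + "x" * 300 + "\nMKV\n"
    for lineno, size in long_deflines(io.StringIO(sample)):
        print(f"line {lineno}: defline is {size} bytes")
```

Calling the same function with `limit=2000` shows what would still be truncated after the proposed enlargement.
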
From: Chris S. <sto...@pc...> - 2005-01-26 21:11:56
|
Yes, this is something we can easily fix and certainly should.

Chris

ps. I think this may reflect the history and bias of CBIL's view of
GUS. The history is that when we started varchars were limited to 255
but being NA-centric we fixed NASequence but not AASequence.