From: Terry C. <tw...@gu...> - 2003-05-16 09:02:04

Jonathan, thanks.

> Have you seen this message when trying to run plugins? There may also still
> be some plugins that haven't been updated to GUS 3.0; this is an ongoing effort,

I am not sure - but probably not. I'll keep this in mind.
I am working to get the GUS setup normalized on my end. Things
should be more interesting in a bit.

Terry

On 0, Jonathan Crabtree <cra...@pc...> wrote:
>
> Hi Terry-
>
> Terry Clark wrote:
> >
> >>ga GUS::Common::Similarity --help
> >
>
> This particular module (GUS::Common::Similarity) is not a GA plugin, so you
> shouldn't be able to run it using 'ga'. Looking more closely, it seems that
> it's actually a set of utility routines that have not yet been updated to be
> GUS 3.0 compliant (and, paradoxically, this is why you're getting those error
> messages about a nonexistent package called GUS30::GUSdev).
>
> Have you seen this message when trying to run plugins? There may also still
> be some plugins that haven't been updated to GUS 3.0; this is an ongoing effort,
> and I think there have been some posts to e-mail list already covering what has
> and hasn't been done so far.
>
> Jonathan
From: Steve F. <st...@pc...> - 2003-05-15 17:42:18

I have had a long talk with Joan about this. I think it is a slightly complicated issue.

Joan's approach and mine agree in that they make gene names, synonyms and aliases a coherent whole. They differ in that mine treats gene names as a controlled vocabulary which can exist without reference to our genes (and are associated with them through an association table), while Joan's approach understands them to be values directly associated with genes, and having no life in our db otherwise.

For example, on my approach a gene name can be loaded into the db even if the db does not contain the relevant gene, or the relevant gene has not been figured out yet. Ie, it is truly a controlled vocab.

On Joan's approach, a gene name belongs to exactly one gene. (However, the gene name table might hold multiple rows which contain the same name value and/or symbol value. That is, the name and symbol are specifically not alternate keys.)

Joan's approach has the advantage of being simpler (at least fewer tables), and mine is only necessary if we do indeed want to treat the gene names as a controlled vocab.

steve

Joan Mazzarelli wrote:

> Hi all,
>
> Steve Fischer wrote:
>
>> The model I was going for is a controlled vocab, ie, that gene names,
>> symbols and synonyms are knowable without reference to a Gene object.
>> The act of associating a Name with a Gene is "annotating" the Gene,
>> and may be tentative. And, there may be more than one Gene that
>> tentatively lays claim to that name (eg across species?). IF that is
>> the model we are going for, then i don't think i agree that synonyms
>> should reference a gene directly.
>
> The effort to assign approved gene symbols and gene names at least by
> MGI and HUGO is to assign unique gene symbol and gene names to a gene.
> They research a gene name or symbol prior to its approved assignment
> to a gene.
>
> A Non Approved gene name or symbol may possibly be assigned to more
> than one gene. I am inclined to say that even though this may happen
> we should still have the gene_id referenced.
> By calling it a gene synonym or alias, we are saying that it is an
> alternative designation for the gene.
>
>> We have seen a similar problem with "reference" sequence, ie, choosing
>> one of a set to be representative.
>
> This is true but we are phasing this out by creating a gene
> model/sequence instead of choosing a reference RNA.
>
>> Here is how i think it can work (my original w/ modifications as per
>> this discussion and pending Joan's explanation of aliases). The
>> GeneName has a boolean 'approved' attribute. If it is set, then that
>> is the approved name. Otherwise, the GeneName is equal to its
>> synonyms, but has been (arbitrarily) chosen as the representative.
>> (The other way to do this is to lose GeneName.is_approved and allow
>> GeneName.name and GeneName.symbol be nullable, indicating that there
>> is no approved name yet).
>
> I have made some changes in the text below and removed a table:
>
> 1.GeneSymbol table
>
>> gene_symbol_id
>> gene_id
>> symbol -- the symbol (a gene can have more than
>> one symbol but only one is approved)
>> is_approved -- boolean (point to evidence of why this is
>> the approved symbol, if MGI gene symbol for example)
>
> review_status_id (manually reviewed = 1, from external base
> (not reviewed) =2, updated = 3)
> external_db_id (where this symbol was obtained from or
> external_db_release_id)
>
>> 2.GeneFullName table :
>>
>> gene_fullname_id
>> gene_id
>> name -- the full name of the gene
>> is_approved (a gene can only have one approved full name, point
>> to evidence)
>
> review_status_id
> external_db_id (where this name was obtained from)
>
>> is_not
>> gene_name_type_id -- points to a controlled vocab
>> of gene name types such as mentioned by Arnaud.
>
> Arnaud, Is is_not necessary, if in your case, the is_approved is
> changed from one gene symbol to another, with the addition of evidence
> of why this was done?
> also what are the controlled vocabulary types?
>
> Anyway, I am inclined to think that a symbol and a full name of the
> gene should have the gene_id referenced in the table
>
> Joan
>
>> Jonathan Crabtree wrote:
>>
>>> Steve Fischer wrote:
>>>
>>>> right now in GUS, we have a bunch of tables and attributes that
>>>> relate to gene symbols, names and aliases:
>>>>
>>>> Dots::Gene.name
>>>> Dots::Gene.gene_symbol
>>>> Dots::GeneAlias
>>>> Sres::DbRef.gene_symbol (this is pretty clearly a hack. DbRef is
>>>> intended to store references to external database entries. it is
>>>> hackish to encode in the schema that we assume that such entries
>>>> are gene records. they could easily be proteins or journals,
>>>> whatever)
>>>
>>> Yes, this is definitely a hack; I added some columns to the DbRef table
>>> because I wanted to store 2-3 specific pieces of information for MGI and
>>> GeneCards entries, without creating another table. However, I disagree
>>> that I "encoded" in the schema the assumption that these DbRef entries
>>> are gene records; I think if you look more closely you will see that all
>>> of the newly-added columns (gene_symbol, chromosome, centimorgans) are
>>> NULLable. Therefore the only assumption I am making is that one or more
>>> of these columns *may* be applicable to certain DbRefs.
>>>
>>>> 1. introduce a GeneName table:
>>>> GeneName.gene_name_id
>>>> GeneName.name --- the full name
>>>> GeneName.symbol -- the symbol
>>>>
>>>> 2. introduce a GeneSynonym table:
>>>> GeneSynonym.gene_name_id -- the GeneName it is a synonym for
>>>> GeneSynonym.name -- the full name of the synonym
>>>> GeneSynonym.symbol -- the symbol
>>>
>>> Arnaud's point that a gene may have names, but no approved name is a good
>>> one. It suggests that GeneSynonym should reference Gene, not GeneName.
>>> We might also consider renaming "GeneName" to "ApprovedGeneName" and
>>> "GeneSynonym" to "GeneName". Arnaud's second point, that there are
>>> potentially several different categories of names, suggests that we
>>> follow the example of the TaxonName table, and add a 'name_class' column
>>> to GeneSynonym. (This could also be a controlled vocabulary.) Then I
>>> think the only remaining question is whether we are sure that the only
>>> kinds of approved names we will ever have are "gene name" and "gene
>>> symbol".
>>>
>>> Jonathan
>>>
>>> -------------------------------------------------------
>>> Enterprise Linux Forum Conference & Expo, June 4-6, 2003, Santa Clara
>>> The only event dedicated to issues related to Linux enterprise solutions
>>> www.enterpriselinuxforum.com
>>>
>>> _______________________________________________
>>> Gusdev-gusdev mailing list
>>> Gus...@li...
>>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
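The two models contrasted in Steve's message above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for the project's Oracle database, and the table names `vocab_gene_name`, `gene_name_association` and `owned_gene_name` are hypothetical, not part of the GUS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene (gene_id INTEGER PRIMARY KEY);

-- Model A (controlled vocabulary): names exist independently of genes
-- and are linked to them through an association table.
CREATE TABLE vocab_gene_name (
    gene_name_id INTEGER PRIMARY KEY,
    name   TEXT,
    symbol TEXT
);
CREATE TABLE gene_name_association (
    gene_id      INTEGER NOT NULL REFERENCES gene(gene_id),
    gene_name_id INTEGER NOT NULL REFERENCES vocab_gene_name(gene_name_id)
);

-- Model B (direct reference): every name row belongs to exactly one gene.
-- name and symbol are deliberately not unique (not alternate keys), so
-- two genes may each carry a row with the same symbol.
CREATE TABLE owned_gene_name (
    gene_name_id INTEGER PRIMARY KEY,
    gene_id      INTEGER NOT NULL REFERENCES gene(gene_id),
    name   TEXT,
    symbol TEXT
);
""")

# Model A: a name can be loaded before any gene is known for it.
conn.execute("INSERT INTO vocab_gene_name (name, symbol) VALUES (?, ?)",
             ("frizzled homolog 4 (Drosophila)", "Fzd4"))
unassociated = conn.execute("""
    SELECT COUNT(*) FROM vocab_gene_name v
    LEFT JOIN gene_name_association a USING (gene_name_id)
    WHERE a.gene_id IS NULL
""").fetchone()[0]

# Model B: the same symbol may appear on rows owned by different genes.
conn.executemany("INSERT INTO gene (gene_id) VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO owned_gene_name (gene_id, name, symbol) VALUES (?, ?, ?)",
    [(1, "frizzled homolog 4 (Drosophila)", "Fzd4"), (2, None, "Fzd4")])
genes_with_symbol = conn.execute(
    "SELECT COUNT(DISTINCT gene_id) FROM owned_gene_name WHERE symbol = 'Fzd4'"
).fetchone()[0]
```

Model A costs an extra table but lets a name exist with no gene attached; Model B is simpler, at the price of repeating a name once per gene that claims it.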
From: Steve F. <st...@pc...> - 2003-05-15 17:04:08

Folks-

Here are the three changes that Dave and I have come across in the creation of apparatus to maintain manual GO associations across builds of the DoTS transcript index, and across subsequent versions of the GO hierarchy:

1. Dots.GOAssociationInstance:
   a. rename 'defining' to 'isPrimary' (indicates that the instance was produced by a program such as a predictor or annotator, as opposed to being introduced by the caching of instances in ancestor associations to speed querying)
   b. add 'deprecated' attribute (means that this instance has been superseded by a newer run of the same predictor)

2. add new table: Dots.RereviewInstance
     rereview_instance_id
     handled (boolean)
     table_id
     row_id

   A row in this table points to a row in any other table that has a review_status_id. It indicates that a program has discovered a reason that the review status should be re-reviewed. In the example of GO associations, we may have multiple "smart" re-review discoverers that sniff around for suspicious associations (eg, underlying evidence has changed), and add a RereviewInstance. When a reviewer returns and takes care of the problem, the 'handled' bit is set.

steve
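A minimal sketch of how the proposed RereviewInstance table might be used, assuming SQLite in place of Oracle. The column names come from the message above; the helper functions and the table id `42` are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rereview_instance (
    rereview_instance_id INTEGER PRIMARY KEY,
    handled  INTEGER NOT NULL DEFAULT 0,  -- boolean: 0 = pending, 1 = done
    table_id INTEGER NOT NULL,            -- which table the flagged row lives in
    row_id   INTEGER NOT NULL             -- primary key of the flagged row
);
""")

GO_ASSOCIATION_TABLE_ID = 42  # hypothetical id standing for the GO association table


def flag_for_rereview(table_id, row_id):
    """Record that a 'discoverer' program found a reason to re-review a row."""
    cur = conn.execute(
        "INSERT INTO rereview_instance (table_id, row_id) VALUES (?, ?)",
        (table_id, row_id))
    return cur.lastrowid


def mark_handled(rereview_instance_id):
    """Set the 'handled' bit once a reviewer has dealt with the flagged row."""
    conn.execute(
        "UPDATE rereview_instance SET handled = 1 WHERE rereview_instance_id = ?",
        (rereview_instance_id,))


def pending():
    """Return (table_id, row_id) pairs still waiting for a reviewer."""
    return conn.execute(
        "SELECT table_id, row_id FROM rereview_instance "
        "WHERE handled = 0 ORDER BY rereview_instance_id").fetchall()


first = flag_for_rereview(GO_ASSOCIATION_TABLE_ID, 1001)
flag_for_rereview(GO_ASSOCIATION_TABLE_ID, 1002)
mark_handled(first)
```

Because the pointer is a generic (table_id, row_id) pair, the database cannot enforce it as a foreign key; the flagged row's existence has to be checked by the code that reads the queue.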
From: MICHAEL L. <lu...@cs...> - 2003-05-15 16:57:55

Hello Dave-

Here is the error:

[luchtan@mango gbparserFailures]$ ga GUS::GOPredict::Plugin::LoadGoOntology --create_release --file_path=/home/gusdev/gus3.0-checkouts/GO --id_file=/home/luchtan/GOLOG
Reading properties from /home/gus_home/config/GUS-PluginMgr.prop
Reading properties from /home/luchtan/gus.properties
DBD::Oracle::db do failed: ORA-30019: Illegal rollback Segment operation in Automatic Undo mode (DBD ERROR: OCIStmtExecute) at /home/gus_home/lib/perl/GUS/ObjRelP/DbiDatabase.pm line 149.
DBD::Oracle::db do failed: ORA-30019: Illegal rollback Segment operation in Automatic Undo mode (DBD ERROR: OCIStmtExecute) at /home/gus_home/lib/perl/GUS/ObjRelP/DbiDatabase.pm line 149.
DBD::Oracle::db do failed: ORA-30019: Illegal rollback Segment operation in Automatic Undo mode (DBD ERROR: OCIStmtExecute) at /home/gus_home/lib/perl/GUS/ObjRelP/DbiDatabase.pm line 149.
DBD::Oracle::db do failed: ORA-30019: Illegal rollback Segment operation in Automatic Undo mode (DBD ERROR: OCIStmtExecute) at /home/gus_home/lib/perl/GUS/ObjRelP/DbiDatabase.pm line 149.
Thu May 15 00:48:36 2003 loading all .ontology files in /home/gusdev/gus3.0-checkouts/GO in preparation for parsing
Thu May 15 00:48:36 2003 parsing all .ontology files in preparation for inserting into database
Thu May 15 00:48:44 2003 parsing finished; loading ontology into database
DBD::Oracle::db prepare failed: ORA-00921: unexpected end of SQL command (DBD ERROR: OCIStmtExecute/Describe) at /home/gus_home/lib/perl/GUS/ObjRelP/DbiDbHandle.pm line 77.
prepareAndExecute FAILED: GUS::ObjRelP::DbiDbHandle=HASH(0x81d2150)->errstr
Can't call method "execute" without a package or object reference at /home/gus_home/lib/perl/GUS/ObjRelP/DbiDbHandle.pm line 78.

Some kind of referencing error?

Thanks,
Michael Luchtan
http://www.cs.uga.edu/~luchtan
From: Jonathan C. <cra...@pc...> - 2003-05-15 16:17:30

Hi Terry-

Terry Clark wrote:
>
>>ga GUS::Common::Similarity --help
>

This particular module (GUS::Common::Similarity) is not a GA plugin, so you
shouldn't be able to run it using 'ga'. Looking more closely, it seems that
it's actually a set of utility routines that have not yet been updated to be
GUS 3.0 compliant (and, paradoxically, this is why you're getting those error
messages about a nonexistent package called GUS30::GUSdev).

Have you seen this message when trying to run plugins? There may also still
be some plugins that haven't been updated to GUS 3.0; this is an ongoing effort,
and I think there have been some posts to e-mail list already covering what has
and hasn't been done so far.

Jonathan
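One way to find modules that, like GUS::Common::Similarity, still refer to the old `GUS30::` namespace is to scan the installed Perl library tree for that string. The sketch below is a rough illustration only: the function name `find_stale_modules` is hypothetical, and the demonstration runs on a throwaway directory rather than a real GUS checkout.

```python
import os
import re
import tempfile

# Matches any reference to the pre-3.0 package namespace, e.g. GUS30::GUSdev.
OLD_NAMESPACE = re.compile(r"\bGUS30::")


def find_stale_modules(root):
    """Return sorted relative paths of .pm files under root that mention GUS30::."""
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".pm"):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                if OLD_NAMESPACE.search(handle.read()):
                    stale.append(os.path.relpath(path, root))
    return sorted(stale)


# Self-contained demonstration on a temporary tree with one stale module
# and one (hypothetical) module that no longer uses the old namespace.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "GUS", "Common"))
    with open(os.path.join(root, "GUS", "Common", "Similarity.pm"), "w") as f:
        f.write("use GUS30::GUSdev::RNASequence;\n")
    with open(os.path.join(root, "GUS", "Common", "Updated.pm"), "w") as f:
        f.write("package Updated;\n1;\n")
    stale_modules = find_stale_modules(root)
```

On a real installation the root would be the Perl library directory that appears in `@INC`, such as the `lib/perl` directory shown in the error output.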
From: Steve F. <st...@pc...> - 2003-05-15 16:15:47

I talked w/ Chetna about this. She and Michael tweaked the perl code to force it to format dates in the format their server expects. So, their data is fine.

steve

Jonathan Crabtree wrote:

> Chetna D. Warade wrote:
>
>> date issue. Can we use --updateALL option in GBParser as the only
>> change is the created_date?
>
> Chetna-
>
> Didn't the date error prevent those rows from being entered? I think
> you may have to reload them instead of just doing an update, although
> I'm not sure. Presumably you can query the database to determine
> whether this is the case.
>
> Jonathan
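The message above does not say which format the server expected or what the one-line change was. As a general illustration only, a loader can avoid depending on the server's default date format by formatting the timestamp itself and passing an explicit mask to Oracle's `TO_DATE`. The format chosen here and the helper name `oracle_date_literal` are assumptions.

```python
from datetime import datetime

# Assumed format pair: the Python strftime pattern and the matching Oracle mask.
PY_FORMAT = "%Y-%m-%d %H:%M:%S"
ORACLE_MASK = "YYYY-MM-DD HH24:MI:SS"


def oracle_date_literal(moment):
    """Build a TO_DATE(...) expression that does not rely on session defaults."""
    return "TO_DATE('{}', '{}')".format(moment.strftime(PY_FORMAT), ORACLE_MASK)


literal = oracle_date_literal(datetime(2003, 5, 15, 16, 15, 47))
```

In real loading code the formatted string would normally be passed as a bind variable, with only the mask written into the SQL text.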
From: Jonathan C. <cra...@pc...> - 2003-05-15 16:10:35

Chetna D. Warade wrote:

> date issue. Can we use --updateALL option in GBParser as the only change
> is the created_date?

Chetna-

Didn't the date error prevent those rows from being entered? I think you may
have to reload them instead of just doing an update, although I'm not sure.
Presumably you can query the database to determine whether this is the case.

Jonathan
From: Joan M. <ma...@pc...> - 2003-05-15 15:23:58

Hi all,

Steve Fischer wrote:

> The model I was going for is a controlled vocab, ie, that gene names,
> symbols and synonyms are knowable without reference to a Gene object.
> The act of associating a Name with a Gene is "annotating" the Gene,
> and may be tentative. And, there may be more than one Gene that
> tentatively lays claim to that name (eg across species?). IF that is
> the model we are going for, then i don't think i agree that synonyms
> should reference a gene directly.

The effort to assign approved gene symbols and gene names at least by MGI and HUGO is to assign unique gene symbol and gene names to a gene. They research a gene name or symbol prior to its approved assignment to a gene.

A Non Approved gene name or symbol may possibly be assigned to more than one gene. I am inclined to say that even though this may happen we should still have the gene_id referenced. By calling it a gene synonym or alias, we are saying that it is an alternative designation for the gene.

> We have seen a similar problem with "reference" sequence, ie, choosing
> one of a set to be representative.

This is true but we are phasing this out by creating a gene model/sequence instead of choosing a reference RNA.

> Here is how i think it can work (my original w/ modifications as per
> this discussion and pending Joan's explanation of aliases). The
> GeneName has a boolean 'approved' attribute. If it is set, then that
> is the approved name. Otherwise, the GeneName is equal to its
> synonyms, but has been (arbitrarily) chosen as the representative.
> (The other way to do this is to lose GeneName.is_approved and allow
> GeneName.name and GeneName.symbol be nullable, indicating that there
> is no approved name yet).

I have made some changes in the text below and removed a table:

1.GeneSymbol table

> gene_symbol_id
> gene_id
> symbol -- the symbol (a gene can have more than one
> symbol but only one is approved)
> is_approved -- boolean (point to evidence of why this is
> the approved symbol, if MGI gene symbol for example)

review_status_id (manually reviewed = 1, from external base (not reviewed) =2, updated = 3)
external_db_id (where this symbol was obtained from or external_db_release_id)

> 2.GeneFullName table :
>
> gene_fullname_id
> gene_id
> name -- the full name of the gene
> is_approved (a gene can only have one approved full name, point
> to evidence)

review_status_id
external_db_id (where this name was obtained from)

> is_not
> gene_name_type_id -- points to a controlled vocab of gene
> name types such as mentioned by Arnaud.

Arnaud, Is is_not necessary, if in your case, the is_approved is changed from one gene symbol to another, with the addition of evidence of why this was done? also what are the controlled vocabulary types?

Anyway, I am inclined to think that a symbol and a full name of the gene should have the gene_id referenced in the table

Joan

> Jonathan Crabtree wrote:
>
>> Steve Fischer wrote:
>>
>>> right now in GUS, we have a bunch of tables and attributes that
>>> relate to gene symbols, names and aliases:
>>>
>>> Dots::Gene.name
>>> Dots::Gene.gene_symbol
>>> Dots::GeneAlias
>>> Sres::DbRef.gene_symbol (this is pretty clearly a hack. DbRef is
>>> intended to store references to external database entries. it is
>>> hackish to encode in the schema that we assume that such entries are
>>> gene records. they could easily be proteins or journals, whatever)
>>
>> Yes, this is definitely a hack; I added some columns to the DbRef table
>> because I wanted to store 2-3 specific pieces of information for MGI and
>> GeneCards entries, without creating another table. However, I disagree
>> that I "encoded" in the schema the assumption that these DbRef entries
>> are gene records; I think if you look more closely you will see that all
>> of the newly-added columns (gene_symbol, chromosome, centimorgans) are
>> NULLable. Therefore the only assumption I am making is that one or more
>> of these columns *may* be applicable to certain DbRefs.
>>
>>> 1. introduce a GeneName table:
>>> GeneName.gene_name_id
>>> GeneName.name --- the full name
>>> GeneName.symbol -- the symbol
>>>
>>> 2. introduce a GeneSynonym table:
>>> GeneSynonym.gene_name_id -- the GeneName it is a synonym for
>>> GeneSynonym.name -- the full name of the synonym
>>> GeneSynonym.symbol -- the symbol
>>
>> Arnaud's point that a gene may have names, but no approved name is a good
>> one. It suggests that GeneSynonym should reference Gene, not GeneName.
>> We might also consider renaming "GeneName" to "ApprovedGeneName" and
>> "GeneSynonym" to "GeneName". Arnaud's second point, that there are
>> potentially several different categories of names, suggests that we
>> follow the example of the TaxonName table, and add a 'name_class' column
>> to GeneSynonym. (This could also be a controlled vocabulary.) Then I
>> think the only remaining question is whether we are sure that the only
>> kinds of approved names we will ever have are "gene name" and "gene
>> symbol".
>>
>> Jonathan
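The rule in Joan's GeneSymbol proposal, that a gene can have several symbols but only one approved one, could be enforced in the database rather than in application code. The sketch below assumes SQLite and its partial unique indexes as a stand-in; Oracle would need a different mechanism, and the other proposed columns (review_status_id, external_db_id) are omitted for brevity. The symbols other than Fzd4 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene_symbol (
    gene_symbol_id INTEGER PRIMARY KEY,
    gene_id     INTEGER NOT NULL,
    symbol      TEXT NOT NULL,
    is_approved INTEGER NOT NULL DEFAULT 0   -- boolean flag
);
-- At most one row per gene may have is_approved = 1;
-- any number of non-approved symbols is allowed.
CREATE UNIQUE INDEX one_approved_symbol_per_gene
    ON gene_symbol (gene_id) WHERE is_approved = 1;
""")

insert = "INSERT INTO gene_symbol (gene_id, symbol, is_approved) VALUES (?, ?, ?)"
conn.execute(insert, (1, "Fzd4", 1))        # the approved symbol
conn.execute(insert, (1, "alt-symbol", 0))  # a non-approved alternative

try:
    conn.execute(insert, (1, "other", 1))   # a second approved symbol
    second_approved_rejected = False
except sqlite3.IntegrityError:
    second_approved_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM gene_symbol").fetchone()[0]
```

Changing the approved symbol then means clearing the flag on the old row before setting it on the new one, which is also a natural point to record the evidence Joan mentions.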
From: Chetna D. W. <ch...@ug...> - 2003-05-15 14:45:32

Hi Debbie,

Yesterday we (Michael and I) made a one line change in the GBParser to avoid the Oracle date error, and we want to rerun the GBParser to take care of the date issue. Can we use the --updateALL option in GBParser, as the only change is the created_date?

thanks,
chetna
From: Steve F. <st...@pc...> - 2003-05-15 14:06:04

chetna-

take a look at the tail of the file. maybe there is some junk there.

steve

Chetna D. Warade wrote:

>Hey all,
>
>I observed one thing about GBParser and would love to know why this
>happens. We had 28221 records in the GenBank file. When run with
>GBParser it tries to look for 28222nd record and gives an error "Cant
>find getAccession() ...." and then gives status that 28221 updated and
>one failed and exits anyways.
>
>Hope to hear from you all,
>Chetna
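Steve's suggestion to look at the tail of the file can be checked mechanically. GenBank flat-file records end with a line containing only `//`, so anything non-blank after the last such line would look like the start of one more record to a parser. The sketch below is a generic check, not GBParser's own logic; the function name and the sample text are hypothetical.

```python
def inspect_genbank_text(text):
    """Count '//'-terminated records and return any non-blank text after the last one."""
    lines = text.splitlines()
    terminators = [i for i, line in enumerate(lines) if line.strip() == "//"]
    if not terminators:
        # No record terminator at all: everything counts as trailing content.
        return 0, text.strip()
    trailing = "\n".join(lines[terminators[-1] + 1:]).strip()
    return len(terminators), trailing


# Two complete records followed by stray content at the end of the file.
sample = (
    "LOCUS       REC1\nACCESSION   A00001\n//\n"
    "LOCUS       REC2\nACCESSION   A00002\n//\n"
    "garbage after last record\n"
)
record_count, trailing = inspect_genbank_text(sample)
```

For a multi-gigabyte file it would be enough to read only the last few kilobytes and run the same check on that slice.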
From: Terry C. <tw...@gu...> - 2003-05-14 23:35:17

With a new installation of GUS created from the tarball at
http://cvsweb.sanger.ac.uk/, I run into problems with references
to GUS30, which is not in the files I have. What am I missing?
Any insight would be greatly appreciated.

Terry

> ga GUS::Common::Similarity --help
Reading properties from /home2/gus/run/config/GUS-PluginMgr.prop

ERROR: Can't locate GUS30/GUSdev/RNASequence.pm in @INC (@INC contains: /home2/gus/run/lib/perl /home2/gus/projects /usr/local/lib/perl5/5.8.0/sun4-solaris /usr/local/lib/perl5/5.8.0 /usr/local/lib/perl5/site_perl/5.8.0/sun4-solaris /usr/local/lib/perl5/site_perl/5.8.0 /usr/local/lib/perl5/site_perl .) at /home2/gus/run/lib/perl/GUS/Common/Similarity.pm line 19.
BEGIN failed--compilation aborted at /home2/gus/run/lib/perl/GUS/Common/Similarity.pm line 19.
Compilation failed in require at (eval 1) line 1.

--------------------------- STACK TRACE -------------------------
GUS::PluginMgr::Plugin::error('GUS::PluginMgr::GusApplication=HASH(0x141da4)','Can\'t locate GUS30/GUSdev/RNASequence.pm in @INC (@INC conta...') called at /home2/gus/run/lib/perl/GUS/PluginMgr/GusApplication.pm line 248
GUS::PluginMgr::GusApplication::newFromPluginName('GUS::PluginMgr::GusApplication=HASH(0x141da4)','GUS::Common::Similarity') called at /home2/gus/run/lib/perl/GUS/PluginMgr/GusApplication.pm line 361
GUS::PluginMgr::GusApplication::doMajorMode_Run('GUS::PluginMgr::GusApplication=HASH(0x141da4)','GUS::Common::Similarity') called at /home2/gus/run/lib/perl/GUS/PluginMgr/GusApplication.pm line 283
GUS::PluginMgr::GusApplication::doMajorMode('GUS::PluginMgr::GusApplication=HASH(0x141da4)','GUS::Common::Similarity') called at /home2/gus/run/lib/perl/GUS/PluginMgr/GusApplication.pm line 192
GUS::PluginMgr::GusApplication::parseAndRun('GUS::PluginMgr::GusApplication=HASH(0x141da4)','ARRAY(0x14a22c)') called at /home2/gus/run/bin/ga line 11
From: Chetna D. W. <ch...@ug...> - 2003-05-14 20:46:43

Hey all,

I observed one thing about GBParser and would love to know why this
happens. We had 28221 records in the GenBank file. When run with
GBParser it tries to look for 28222nd record and gives an error "Cant
find getAccession() ...." and then gives status that 28221 updated and
one failed and exits anyways.

Hope to hear from you all,
Chetna
From: Jonathan C. <cra...@pc...> - 2003-05-14 18:58:03

Steve Fischer wrote:

> I think i am ok with interspersed responses... as long as they leave the
> whole original there. its the cutting that is the real problem. i just
> don't want to have to dig around among old mails to find the rest of the
> thread.

You shouldn't have to do any digging so long as your e-mail client lets you
read your e-mail in threaded mode. The following page covers some of the
reasons why always including the whole message can be a bad idea (section 2.1):

<http://learn.to/quote/>

I think they're mostly applicable to e-mail as well as Usenet postings,
particularly since the advent of threaded e-mail clients.

Jonathan
From: Steve F. <st...@pc...> - 2003-05-14 18:37:33

I think i am ok with interspersed responses... as long as they leave the whole original there. its the cutting that is the real problem. i just don't want to have to dig around among old mails to find the rest of the thread.

steve

Jonathan Crabtree wrote:

> Steve-
>
> Steve Fischer wrote:
>
>> I have another suggestion to add:
>>
>> that we always add our response to the top of the message, and never
>> cut out parts of it. ie, that going to any individual mail in the
>> thread contains the entire thread up till that point. otherwise, it
>> is a pain to go searching for the parts of the thread that have been
>> cut out.
>
> I'm not sure I agree with this; top-posting is generally considered a bad
> thing because it's far easier to understand someone's response (particularly
> when they are responding to several distinct points) when the different parts
> of the response are interjected at the correct points in the quoted text
> (with a blank line before and after to make it easy to find the new
> material.)
>
> Similarly, I think that the generally-accepted practice is to allow
> responders to excise parts of the quoted message as they see fit (e.g.,
> if they only want to comment on a small part of it.) I don't think the
> tracking problem is that bad, because as long as we follow the previously-
> described protocol for replies, you should be able to track the e-mail
> thread using a threaded e-mail reader (I don't use this feature too much,
> but I know that Mozilla supports it.)
>
> I'm pretty sure that the above reflects the accepted wisdom for Usenet
> posting etiquette, and I believe that it probably also applies to e-mails
> on the gusdev list, which is effectively a newsgroup of sorts. Search
> google for "top-posting" and you'll see a bunch of articles on the
> subject.
>
> Jonathan
From: Jonathan C. <cra...@pc...> - 2003-05-14 18:29:31

Steve-

Steve Fischer wrote:

> I have another suggestion to add:
>
> that we always add our response to the top of the message, and never cut
> out parts of it. ie, that going to any individual mail in the thread
> contains the entire thread up till that point. otherwise, it is a pain
> to go searching for the parts of the thread that have been cut out.

I'm not sure I agree with this; top-posting is generally considered a bad
thing because it's far easier to understand someone's response (particularly
when they are responding to several distinct points) when the different parts
of the response are interjected at the correct points in the quoted text
(with a blank line before and after to make it easy to find the new material.)

Similarly, I think that the generally-accepted practice is to allow
responders to excise parts of the quoted message as they see fit (e.g.,
if they only want to comment on a small part of it.) I don't think the
tracking problem is that bad, because as long as we follow the previously-
described protocol for replies, you should be able to track the e-mail
thread using a threaded e-mail reader (I don't use this feature too much,
but I know that Mozilla supports it.)

I'm pretty sure that the above reflects the accepted wisdom for Usenet
posting etiquette, and I believe that it probably also applies to e-mails
on the gusdev list, which is effectively a newsgroup of sorts. Search
google for "top-posting" and you'll see a bunch of articles on the subject.

Jonathan
From: Joan M. <ma...@pc...> - 2003-05-14 18:17:45

Steve,

Alias is another gene name for a gene. Synonym is another gene_symbol for a gene.

Perhaps to make it clearer it should be gene_symbol alternative (i.e. not approved) instead of synonym, and gene name alternative instead of alias.

Joan

Steve Fischer wrote:

> Joan-
>
> can you explain what a gene alias is as opposed to a gene synonym?
>
> thanks,
> steve
>
> Joan Mazzarelli wrote:
>
>> Hi all,
>>
>> I thought the point of this discussion was to figure out how to
>> integrate into the tables which contain (or were created to contain)
>> manual gene annotation assignments
>> the gene information which we get from MGI/Gene cards sequence
>> mappings. (although we may want to recreate these tables for
>> this and/or if PSU has certain needs).
>>
>> BTW, although a gene symbol is approved it can also change (MGI
>> versions for instance and also they have -pending), so this is
>> another case where changes can occur. As it stands now, in the gene
>> table we have the attribute gene_symbol where the approved human or
>> mouse gene symbol is written for each gene when added by the annotator.
>> Also, in the gene table there is name, where I envisioned using the
>> new annotation tool, the approved gene name would be written.
>> approved gene_symbol = Fzd4
>> approved gene name = frizzled homolog 4 (Drosophila)
>>
>> https://www.cbil.upenn.edu/cgi-bin/dotsgenes-curator/schemaBrowser.pl?db=GUSdev&table=DoTS::Gene&path=DoTS::Gene
>>
>> Now for the Current dots.GeneSynonym table, the annotator can add
>> gene symbol synonyms for the gene and this is where they are written.
>>
>> https://www.cbil.upenn.edu/cgi-bin/dotsgenes-curator/schemaBrowser.pl?db=GUSdev&table=DoTS::GeneSynonym&path=DoTS::GeneSynonym
>>
>> I created GeneAlias for other (not approved) gene names for a gene to
>> be used by the new annotation tool.
>>
>> https://www.cbil.upenn.edu/cgi-bin/dotsgenes-curator/schemaBrowser.pl?db=GUSdev&table=DoTS::GeneAlias&path=DoTS::GeneAlias
>>
>> For genes, they can have gene symbol synonyms and also gene name
>> aliases.
>> It is not necessarily the case where every gene symbol synonym has a
>> gene name alias which corresponds to it, as in the approved case above
>> or vice versa.
>>
>> Joan
>>
>> Jonathan Crabtree wrote:
>>
>>> Steve Fischer wrote:
>>>
>>>> right now in GUS, we have a bunch of tables and attributes that
>>>> relate to gene symbols, names and aliases:
>>>>
>>>> Dots::Gene.name
>>>> Dots::Gene.gene_symbol
>>>> Dots::GeneAlias
>>>> Sres::DbRef.gene_symbol (this is pretty clearly a hack. DbRef is
>>>> intended to store references to external database entries. it is
>>>> hackish to encode in the schema that we assume that such entries
>>>> are gene records. they could easily be proteins or journals,
>>>> whatever)
>>>
>>> Yes, this is definitely a hack; I added some columns to the DbRef table
>>> because I wanted to store 2-3 specific pieces of information for MGI and
>>> GeneCards entries, without creating another table. However, I disagree
>>> that I "encoded" in the schema the assumption that these DbRef entries
>>> are gene records; I think if you look more closely you will see that all
>>> of the newly-added columns (gene_symbol, chromosome, centimorgans) are
>>> NULLable. Therefore the only assumption I am making is that one or more
>>> of these columns *may* be applicable to certain DbRefs.
>>>
>>>> 1. introduce a GeneName table:
>>>> GeneName.gene_name_id
>>>> GeneName.name --- the full name
>>>> GeneName.symbol -- the symbol
>>>>
>>>> 2. introduce a GeneSynonym table:
>>>> GeneSynonym.gene_name_id -- the GeneName it is a synonym for
>>>> GeneSynonym.name -- the full name of the synonym
>>>> GeneSynonym.symbol -- the symbol
>>>
>>> Arnaud's point that a gene may have names, but no approved name is a good
>>> one. It suggests that GeneSynonym should reference Gene, not GeneName.
>>> We might also consider renaming "GeneName" to "ApprovedGeneName" and
>>> "GeneSynonym" to "GeneName". Arnaud's second point, that there are
>>> potentially several different categories of names, suggests that we
>>> follow the example of the TaxonName table, and add a 'name_class' column
>>> to GeneSynonym. (This could also be a controlled vocabulary.) Then I
>>> think the only remaining question is whether we are sure that the only
>>> kinds of approved names we will ever have are "gene name" and "gene
>>> symbol".
>>>
>>> Jonathan
From: Steve F. <st...@pc...> - 2003-05-14 18:12:23
|
I have another suggestion to add: that we always add our response to the top of the message, and never cut out parts of it, i.e., that any individual mail in the thread contains the entire thread up to that point. Otherwise, it is a pain to go searching for the parts of the thread that have been cut out.

steve

Arnaud Kerhornou wrote:

> Hi Steve
>
> Sounds fine to me, but re. 1, I would keep the main recipient to specify who it
> is addressed to, and cc the gusdev mailing list.
>
> Arnaud
>
> Steve Fischer <sfi...@pc...> wrote:
>
>> hey gusdevers-
>>
>> i wanted to throw out there a proposal for how we can all use this
>> mailing list, possibly to best effect. comments and tomatoes encouraged.
>>
>> my idea is:
>>   1. when responding to mail on the mail list, address your response
>>      *only* to the news group. thus, if you hit "reply all", remove all the
>>      recipients who are not gusdev-gusdev. This gets rid of the redundant
>>      mail traffic.
>>
>>   2. unless your response is really of no interest to the group, avoid
>>      providing private answers. this way, everybody will know that the
>>      question has been answered and won't spend their time providing another
>>      one, and the answer will be available on the archive for future inquiries.
>>
>>   3. don't be too shy about re-subjecting a thread if it has veered off
>>      the original topic
>>
>> steve
>
> _______________________________________________
> Gusdev-gusdev mailing list
> Gus...@li...
> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev |
From: Deborah F. P. <pi...@pc...> - 2003-05-14 18:12:02
|
Hi Chetna,

In the case of this particular plugin, you won't get duplicates created in the db by running it again. However, there is an option, --start, that lets you specify where you left off in a particular file and continue from there, saving you running time. You should have a log file with output from the plugin that looks something like this:

    STATUS: N=17 ACCS=AL671875 TOTAL_OBJECTS=123 TIME=Thu Feb 6 14:07:19 EST 2003
    INSERTED: AL671875; N=17

Use the value of N for the last completed entry for --start.

Debbie

On Wed, 14 May 2003, Chetna D. Warade wrote:

> Hey all,
>
> I am working with Michael to load GenBank stuff in gus 3.0.
> Right now our database is out of tablespace. My question is: Does repetitive
> use of plugin/GBParser resume where it left off or will it try to load
> everything from scratch?
>
> Situation here:
> Due to limited tablespace we could successfully load 18079 rows in the
> database (dots.ExternalNASequence and Dots.NAEntry). I am adding more
> tablespace and will then re-run the GBParser on the same GenBank file. At the
> least I expect the primary key failure error for the first 18079 rows and
> then GBParser should be able to load the remaining ones.
>
> It would be great if someone can give insight.
>
> Thanks,
> Chetna |
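To make the "use the value of N for the last completed entry" step concrete, here is a minimal sketch in Python (not part of GUS) that scans a plugin log for the highest-numbered INSERTED record. It assumes the log layout matches the sample above and that an INSERTED record is what marks an entry as completed; the function name `last_completed_n` is hypothetical.

```python
import re

# Assumed layout, taken from the sample log above: "INSERTED: <accession>; N=<number>".
# Using search() rather than match() so it works whether or not the INSERTED
# record shares a line with the preceding STATUS record.
INSERTED_RE = re.compile(r"INSERTED:\s*([^;\s]+);\s*N=(\d+)")


def last_completed_n(log_lines):
    """Return the N of the last INSERTED record in the log, or None if there is none."""
    last = None
    for line in log_lines:
        for _accession, n in INSERTED_RE.findall(line):
            last = int(n)
    return last


sample_log = [
    "STATUS: N=17 ACCS=AL671875 TOTAL_OBJECTS=123 TIME=Thu Feb 6 14:07:19 EST 2003",
    "INSERTED: AL671875; N=17",
]
print(last_completed_n(sample_log))
```

The number it reports would then be the value to pass to --start, as described above.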
From: Steve F. <st...@pc...> - 2003-05-14 18:09:51
|
The model I was going for is a controlled vocab, i.e., that gene names, symbols and synonyms are knowable without reference to a Gene object. The act of associating a Name with a Gene is "annotating" the Gene, and may be tentative. And there may be more than one Gene that tentatively lays claim to that name (e.g. across species?).

IF that is the model we are going for, then i don't think i agree that synonyms should reference a gene directly.

We have seen a similar problem with "reference" sequence, i.e., choosing one of a set to be representative.

Here is how i think it can work (my original w/ modifications as per this discussion and pending Joan's explanation of aliases). The GeneName has a boolean 'approved' attribute. If it is set, then that is the approved name. Otherwise, the GeneName is equal to its synonyms, but has been (arbitrarily) chosen as the representative. (The other way to do this is to lose GeneName.is_approved and allow GeneName.name and GeneName.symbol to be nullable, indicating that there is no approved name yet.)

1. GeneName table:
     gene_name_id
     name         -- the full name
     symbol       -- the symbol
     is_approved  -- boolean

2. GeneSynonym table:
     gene_synonym_id
     gene_name_id -- the GeneName it is a synonym for
     name         -- the full name of the synonym
     symbol       -- the symbol

3. GeneNameAssociation table -- a mapping between Gene and GeneName (better name for this??)
     gene_id
     gene_name_id
     review_status_id
     is_not
     gene_name_type_id -- points to a controlled vocab of gene name types such as mentioned by Arnaud.

probably adopt here an instance and evidence mechanism similar to GO association. |
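The three-table proposal above can be tried out as a rough sketch. The following uses Python with an in-memory SQLite database purely for illustration; GUS itself runs on Oracle, and the column types, the snake_case table names, the minimal `gene` table, and the synonym value 'Fzd4-alt' are all assumptions made for the example, not part of the proposal.

```python
import sqlite3

# Illustrative DDL only: types and constraints are guesses.
SCHEMA = """
CREATE TABLE gene (gene_id INTEGER PRIMARY KEY);
CREATE TABLE gene_name (
    gene_name_id INTEGER PRIMARY KEY,
    name         TEXT,                       -- the full name
    symbol       TEXT,                       -- the symbol
    is_approved  INTEGER NOT NULL DEFAULT 0  -- boolean
);
CREATE TABLE gene_synonym (
    gene_synonym_id INTEGER PRIMARY KEY,
    gene_name_id    INTEGER NOT NULL REFERENCES gene_name(gene_name_id),
    name            TEXT,
    symbol          TEXT
);
CREATE TABLE gene_name_association (
    gene_id           INTEGER NOT NULL REFERENCES gene(gene_id),
    gene_name_id      INTEGER NOT NULL REFERENCES gene_name(gene_name_id),
    review_status_id  INTEGER,
    is_not            INTEGER NOT NULL DEFAULT 0,
    gene_name_type_id INTEGER
);
"""


def symbols_for_gene(conn, gene_id):
    """All symbols reachable from a gene: the associated GeneName plus its synonyms."""
    rows = conn.execute(
        """
        SELECT gn.symbol FROM gene_name_association a
          JOIN gene_name gn ON gn.gene_name_id = a.gene_name_id
         WHERE a.gene_id = ? AND a.is_not = 0
        UNION
        SELECT gs.symbol FROM gene_name_association a
          JOIN gene_synonym gs ON gs.gene_name_id = a.gene_name_id
         WHERE a.gene_id = ? AND a.is_not = 0
        ORDER BY 1
        """,
        (gene_id, gene_id),
    ).fetchall()
    return [r[0] for r in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO gene (gene_id) VALUES (1)")
    conn.execute(
        "INSERT INTO gene_name VALUES (10, 'frizzled homolog 4 (Drosophila)', 'Fzd4', 1)"
    )
    # 'Fzd4-alt' is a made-up synonym used only to exercise the join.
    conn.execute("INSERT INTO gene_synonym VALUES (100, 10, NULL, 'Fzd4-alt')")
    conn.execute(
        "INSERT INTO gene_name_association (gene_id, gene_name_id) VALUES (1, 10)"
    )
    return symbols_for_gene(conn, 1)


print(demo())
```

Note that in this layout a synonym is only reachable from a gene through a GeneName row, which is the point under discussion: a gene with names but no approved name still needs some GeneName row chosen as the representative.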
From: Jonathan C. <cra...@pc...> - 2003-05-14 18:00:16
|
Hi Chetna-

Chetna D. Warade wrote:

> I am working with Michael to load GenBank stuff in gus 3.0.
> Right now our database is out of tablespace. My question is: Does repetitive
> use of plugin/GBParser resume where it left off or will it try to load
> everything from scratch?

This is something that has to be coded on a per-plugin basis, meaning that unless the authors of the GBParser plugin have explicitly given it the ability to restart cleanly, it won't. Or rather, most plugins will probably run a second time without complaint, but will likely create duplicate rows in the database.

Whether you get duplicates also depends on how the plugin handles commits (also a plugin-specific issue). Most of the plugins that load a large amount of data will commit on a periodic basis (e.g. every 1000 or 10000 entries or rows), so that if a crash occurs at 5500 entries, for example, you would end up with 5000 in the database, assuming a commit frequency of 1000.

And it also depends on whether the plugin checks for the presence of entries/rows before loading duplicates (a facility that may provide support for, but is not equivalent to, the ability to restart a plugin on the same input files).

> Situation here:
> Due to limited tablespace we could successfully load 18079 rows in the
> database (dots.ExternalNASequence and Dots.NAEntry). I am adding more
> tablespace and will then re-run the GBParser on the same GenBank file. At the
> least I expect the primary key failure error for the first 18079 rows and
> then GBParser should be able to load the remaining ones.

In general you are unlikely to get primary key errors, since the primary key values are autogenerated, and so the second time the plugin is run it will generate a whole new set of IDs (assuming that it has not been written to handle restarts and/or check whether entries being inserted are already in the database). Again, however, it's something that is plugin-specific. If a table has additional "unique" constraints, for example, and the plugin fails to check whether inserted rows are already in the database, then it is possible for constraint violations to occur when re-running a plugin.

Anyway, the bottom line is that it depends almost entirely on how the GBParser has been implemented, and so your questions are all best answered either by the people who wrote the plugin or by looking at the Perl code directly.

Jonathan |
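The commit arithmetic and the "no primary key error on rerun" point can be illustrated with a small simulation. This is a sketch in Python with SQLite, not GUS plugin code: the table `na_entry`, the loader `load_entries`, and the simulated failure are all hypothetical, and the loader deliberately has no restart or duplicate check.

```python
import sqlite3


def load_entries(conn, entries, commit_every=1000, fail_at=None):
    """Insert entries one at a time, committing every `commit_every` rows.

    `fail_at` simulates a crash (e.g. running out of tablespace) just before
    the entry with that 0-based index would be inserted.
    """
    cur = conn.cursor()
    for i, accession in enumerate(entries):
        if fail_at is not None and i == fail_at:
            conn.rollback()  # rows since the last commit are lost, as after a crash
            raise RuntimeError(f"simulated failure at entry {i}")
        cur.execute("INSERT INTO na_entry (source_id) VALUES (?)", (accession,))
        if (i + 1) % commit_every == 0:
            conn.commit()
    conn.commit()


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM na_entry").fetchone()[0]


def demo():
    conn = sqlite3.connect(":memory:")
    # The primary key is autogenerated, so a rerun gets fresh IDs.
    conn.execute(
        "CREATE TABLE na_entry ("
        "na_entry_id INTEGER PRIMARY KEY AUTOINCREMENT, source_id TEXT)"
    )
    entries = [f"ACC{i:05d}" for i in range(6000)]
    try:
        load_entries(conn, entries, fail_at=5500)
    except RuntimeError:
        pass
    after_crash = count_rows(conn)
    # Naive rerun on the same input: no existence check, no restart option.
    load_entries(conn, entries)
    after_rerun = count_rows(conn)
    return after_crash, after_rerun


print(demo())
```

With a commit frequency of 1000 and a failure at entry 5500, 5000 rows survive; the naive rerun then adds all 6000 again without any key error, leaving 5000 duplicated entries. Adding a UNIQUE constraint on `source_id` would instead turn the rerun into a constraint violation, which is the other case described above.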
From: Steve F. <st...@pc...> - 2003-05-14 17:56:59
|
Joan-

can you explain what a gene alias is as opposed to a gene synonym?

thanks,
steve |
From: Chetna D. W. <ch...@ug...> - 2003-05-14 17:47:01
|
Hey all,

I am working with Michael to load GenBank stuff in gus 3.0. Right now our database is out of tablespace. My question is: Does repetitive use of plugin/GBParser resume where it left off or will it try to load everything from scratch?

Situation here:
Due to limited tablespace we could successfully load 18079 rows in the database (dots.ExternalNASequence and Dots.NAEntry). I am adding more tablespace and will then re-run the GBParser on the same GenBank file. At the least I expect the primary key failure error for the first 18079 rows, and then GBParser should be able to load the remaining ones.

It would be great if someone can give insight.

Thanks,
Chetna |
From: Joan M. <ma...@pc...> - 2003-05-14 17:20:08
|
Hi all,

I thought the point of this discussion was to figure out how to integrate the gene information which we get from MGI/GeneCards sequence mappings into the tables which contain (or were created to contain) manual gene annotation assignments (although we may want to recreate these tables for this and/or if PSU has certain needs).

BTW, although a gene symbol is approved it can also change (MGI versions for instance, and also they have -pending), so this is another case where changes can occur.

As it stands now, in the gene table we have the attribute gene_symbol, where the approved human or mouse gene symbol is written for each gene when added by the annotator. Also, in the gene table there is name, where I envisioned that, using the new annotation tool, the approved gene name would be written.

    approved gene_symbol = Fzd4
    approved gene name = frizzled homolog 4 (Drosophila)

https://www.cbil.upenn.edu/cgi-bin/dotsgenes-curator/schemaBrowser.pl?db=GUSdev&table=DoTS::Gene&path=DoTS::Gene

Now for the current dots.GeneSynonym table, the annotator can add gene symbol synonyms for the gene and this is where they are written.

https://www.cbil.upenn.edu/cgi-bin/dotsgenes-curator/schemaBrowser.pl?db=GUSdev&table=DoTS::GeneSynonym&path=DoTS::GeneSynonym

I created GeneAlias for other (not approved) gene names for a gene, to be used by the new annotation tool.

https://www.cbil.upenn.edu/cgi-bin/dotsgenes-curator/schemaBrowser.pl?db=GUSdev&table=DoTS::GeneAlias&path=DoTS::GeneAlias

Genes can have gene symbol synonyms and also gene name aliases. It is not necessarily the case that every gene symbol synonym has a gene name alias which corresponds to it (as in the approved case above), or vice versa.

Joan |
From: Jonathan C. <cra...@pc...> - 2003-05-14 16:11:21
|
Steve Fischer wrote:

> right now in GUS, we have a bunch of tables and attribute that relate to
> gene symbols, names and aliases:
>
>   Dots::Gene.name
>   Dots::Gene.gene_symbol
>   Dots::GeneAlias
>   Sres::DbRef.gene_symbol (this is pretty clearly a hack. DbRef is
>   intended to store references to external database entries. it is
>   hackish to encode in the schema that we assume that such entries are
>   gene records. they could easily be proteins or journals, whatever)

Yes, this is definitely a hack; I added some columns to the DbRef table because I wanted to store 2-3 specific pieces of information for MGI and GeneCards entries, without creating another table. However, I disagree that I "encoded" in the schema the assumption that these DbRef entries are gene records; I think if you look more closely you will see that all of the newly-added columns (gene_symbol, chromosome, centimorgans) are NULLable. Therefore the only assumption I am making is that one or more of these columns *may* be applicable to certain DbRefs.

> 1. introduce a GeneName table:
>      GeneName.gene_name_id
>      GeneName.name --- the full name
>      GeneName.symbol -- the symbol
>
> 2. introduce a GeneSynonym table:
>      GeneSynonym.gene_name_id -- the GeneName it is a synonym for
>      GeneSynonym.name -- the full name of the synonym
>      GeneSynonym.symbol -- the symbol

Arnaud's point that a gene may have names, but no approved name is a good one. It suggests that GeneSynonym should reference Gene, not GeneName. We might also consider renaming "GeneName" to "ApprovedGeneName" and "GeneSynonym" to "GeneName".

Arnaud's second point, that there are potentially several different categories of names, suggests that we follow the example of the TaxonName table, and add a 'name_class' column to GeneSynonym. (This could also be a controlled vocabulary.) Then I think the only remaining question is whether we are sure that the only kinds of approved names we will ever have are "gene name" and "gene symbol".

Jonathan |
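For comparison, the alternative suggested in this message (GeneSynonym referencing Gene directly, with a 'name_class' column in the style of TaxonName) might look roughly like the sketch below. It is written in Python against SQLite for illustration only; the single `name` column, the free-text name_class values 'symbol' and 'full name', and the example names are all assumptions.

```python
import sqlite3
from collections import defaultdict


def names_by_class(conn, gene_id):
    """Group every recorded name of a gene by its name_class."""
    rows = conn.execute(
        "SELECT name_class, name FROM gene_synonym "
        "WHERE gene_id = ? ORDER BY name_class, name",
        (gene_id,),
    )
    grouped = defaultdict(list)
    for name_class, name in rows:
        grouped[name_class].append(name)
    return dict(grouped)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gene (gene_id INTEGER PRIMARY KEY);
        -- The synonym row points at the gene itself, so a gene can carry
        -- names even when no approved name exists.
        CREATE TABLE gene_synonym (
            gene_synonym_id INTEGER PRIMARY KEY,
            gene_id         INTEGER NOT NULL REFERENCES gene(gene_id),
            name            TEXT NOT NULL,
            name_class      TEXT NOT NULL  -- could instead reference a controlled vocabulary
        );
        """
    )
    conn.execute("INSERT INTO gene (gene_id) VALUES (1)")
    # Placeholder names, not real gene nomenclature.
    conn.executemany(
        "INSERT INTO gene_synonym (gene_id, name, name_class) VALUES (?, ?, ?)",
        [
            (1, "example-symbol-a", "symbol"),
            (1, "example full name", "full name"),
            (1, "example-symbol-b", "symbol"),
        ],
    )
    return names_by_class(conn, 1)


print(demo())
```

Here gene 1 has three names and no approved name at all, which is the situation Arnaud raised.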
From: Jonathan C. <cra...@pc...> - 2003-05-14 14:48:36
|
Michael, Steve-

MICHAEL LUCHTAN wrote:

> Here is the first occurrence of the dbdbihandle error:
> Inserting:
> Table: NAEntry
> created_date: [2003-5-03 00:00:00] <--------
> row_user_id: [6]

This is definitely a date formatting problem, and this is also a known problem, but one that I thought had been fixed a while ago. We use a nonstandard default date format (NLS_DATE_FORMAT) in our Oracle instance, namely 'YYYY-MM-DD HH24:MI:SS'. I believe the default Oracle date format is 'DD-MON-RR' instead. Clearly the Perl object layer is generating date literals of the form 'YYYY-MM-DD HH24:MI:SS'; since this does not match the default format of your Oracle instance (your NLS_DATE_FORMAT), an error results.

I described a simple fix for this problem when it was first identified, and I thought it had been applied to the code, but perhaps there was some complication that I'm forgetting. Anyway, one solution is to continue generating dates in a very specific format, but to use the Oracle TO_DATE function to explicitly tell Oracle what format you're using, rather than relying on the implicit string -> date conversion (which relies on the NLS_DATE_FORMAT). So for example, instead of doing the following:

    insert into sample_table1 values ('2003-5-03 00:00:00');

You'd use the following SQL instead:

    insert into sample_table1 values (to_date('2003-5-03 00:00:00', 'YYYY-MM-DD HH24:MI:SS'));

In the object layer this would probably mean replacing "?" in the insert/update statement with "to_date(?, 'YYYY-MM-DD HH24:MI:SS')". It may be that the object layer doesn't distinguish between DATEs and strings for the purposes of quoting, which might make this nontrivial to implement. If so, then an alternative would be to have the object layer run the following command at the beginning of each session:

    alter session set NLS_DATE_FORMAT='YYYY-MM-DD HH24:MI:SS';

Jonathan |
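The placeholder substitution described above can be sketched as follows. This is Python for illustration, not the GUS Perl object layer; `build_insert` and its `date_columns` argument are hypothetical, and the sketch assumes the caller knows which columns are DATEs, which is exactly the part noted above as possibly nontrivial.

```python
# The explicit format the generated date literals are assumed to follow.
ORACLE_DATE_FORMAT = "YYYY-MM-DD HH24:MI:SS"


def placeholder(is_date):
    """Return the bind placeholder for one column, wrapping DATE columns in to_date()."""
    return f"to_date(?, '{ORACLE_DATE_FORMAT}')" if is_date else "?"


def build_insert(table, columns, date_columns=()):
    """Build a parameterized INSERT that does not depend on the session's NLS_DATE_FORMAT."""
    date_columns = set(date_columns)
    column_list = ", ".join(columns)
    marks = ", ".join(placeholder(col in date_columns) for col in columns)
    return f"insert into {table} ({column_list}) values ({marks})"


statement = build_insert(
    "NAEntry", ["created_date", "row_user_id"], date_columns=["created_date"]
)
print(statement)
```

The bound value for created_date stays a plain string such as '2003-5-03 00:00:00'; only the SQL text changes, so Oracle no longer has to guess the format.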