From: Elisabetta M. <man...@pc...> - 2004-11-09 04:53:53
Hi Sucheta,

First a reminder, also for the benefit of others in the cc of this email, that general information on the RAD pipeline and on how to set RAD up is at http://www.cbil.upenn.edu/RAD/php/RAD-installation.php

In terms of the RAD schema, I'm not sure exactly what you are looking for. RAD is part of GUS, so if you install the GUS schema (instructions at http://www.gusdb.org/documentation.html) you'll have the RAD schema as well. I believe the plan is to make the next release of the schema (3.5?) available in the next few weeks. The documentation for the RAD schema is contained in that of the GUS schema at http://www.gusdb.org/cgi-bin/schemaBrowser.

All RAD plugins are available from the Sanger CVS, and documentation can also be viewed at http://www.gusdb.org/documentation/plugins.

In terms of sample files, configuration sample files for most of the plugins are currently stored in the CBIL RAD cvs repository. This repository will be made public (read-only) very shortly, possibly even tomorrow, and will be added to the repositories currently viewable at http://cvs.cbil.upenn.edu. So, if you check back at this website in a few days you should see our RAD repository. All available sample files (mostly samples of cfg files) are in the directory /DataLoad/config/ of this repository, and the name of the relevant cfg sample file for each plugin is specified in the plugin's documentation.

We don't currently store sample data files in cvs, but the format is typically explained in the plugin doc. If you need any further clarification, feel free to contact us with the specifics of the plugin you are interested in and we can provide additional info and examples.

Hope this helps,
Elisabetta

---

On Mon, 8 Nov 2004, Sucheta Tripathy wrote:

> Hi Elisabetta,
>
> How are you? It was nice visiting you all at CBIL.
>
> I have one group here working on a pipeline to upload data to the RAD
> module. For that I was wondering if you can point us to a place where we
> can get the RAD schema and also sample data files for all the plugins for
> RAD.
>
> That will be a great help.
>
> Thanks in advance.
>
> Have a great day.
>
> Sucheta
From: Elisabetta M. <man...@pc...> - 2004-11-08 20:49:08
Sorry, **important correction** regarding sample cfg files. Ignore the second-to-last comment on the CBIL RAD cvs for config files in my previous email. The CBIL RAD repository will indeed be made public and will contain other useful things, like the latest StudyAnnotator code, etc. But the sample cfg files are stored in **the Sanger GUS cvs**, in the repository directory GUS/RAD/config, and are *already available*. (There are config files in the CBIL RAD repository I cited below, but those are old and should not be used.) Again, the documentation of each plugin should point to the location of its sample config file *within the Sanger GUS cvs repository*.

Sorry for the confusion on this point, I got mixed up on our various cvs repositories...

Elisabetta
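A checkout of those sample files would look roughly like the following sketch. The CVSROOT here is a placeholder, since the actual connection details for the Sanger GUS repository are not given in this thread:

    # Placeholder CVSROOT -- substitute the real Sanger GUS repository details.
    export CVSROOT=:pserver:anonymous@cvs.example.org:/cvsroot/gus
    cvs login
    cvs checkout GUS/RAD/config    # sample .cfg files for the RAD plugins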
From: Fang, B. <fa...@vt...> - 2004-11-08 20:04:13
I just followed the steps for installing the WDK in your WDK Wiki. When I ran the "build" file, the result came out as follows:

    ant -f /usr/local/apache2/htdocs/fbing/WDK/install/build.xml install
        -Dproj=WDK -DtargetDir=/usr/local/apache2/htdocs/fbing/gushome -Dcomp=
        -DprojectsDir=/usr/local/apache2/htdocs/fbing/WDK -Dappend=true
        -logger org.apache.tools.ant.NoBannerLogger | grep ']'

    [echo] .
    [echo] Installing WDK/Model
    [javac] Compiling 18 source files to /usr/local/apache2/htdocs/fbing/WDK/WDK/Model/classes
    [javac] /usr/local/apache2/htdocs/fbing/WDK/WDK/Model/src/java/org/gusdb/wdk/model/implementation/Oracle.java:136: package oracle.jdbc.driver does not exist
    [javac]         DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());
    [javac]                                          ^
    [javac] /usr/local/apache2/htdocs/fbing/WDK/WDK/Model/src/java/org/gusdb/wdk/model/implementation/PostgreSQL.java:138: package oracle.jdbc.driver does not exist
    [javac]         DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());
    [javac]                                          ^
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
    [javac] 2 errors

    BUILD FAILED
    /usr/local/apache2/htdocs/fbing/WDK/install/build.xml:26: The following error occurred while executing this line:
    /usr/local/apache2/htdocs/fbing/WDK/WDK/build.xml:45: The following error occurred while executing this line:
    /usr/local/apache2/htdocs/fbing/WDK/install/build.xml:222: The following error occurred while executing this line:
    /usr/local/apache2/htdocs/fbing/WDK/install/build.xml:241: Compile failed; see the compiler error output for details.

    Total time: 8 seconds

I am not sure why this happened. I need your help. Thanks a lot.

Bing
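The "package oracle.jdbc.driver does not exist" errors mean the Oracle JDBC driver jar is not on the compile classpath; javac ships only the java.sql interfaces, not vendor drivers. A likely fix, sketched here under the assumption that the WDK build picks up jars from a lib directory (check build.xml for the exact location; the jar name varies by Oracle client version):

    # Sketch: copy Oracle's JDBC jar (ojdbc14.jar with 10g, classes12.jar
    # with 9i) into whatever directory build.xml adds to the compile classpath.
    cp $ORACLE_HOME/jdbc/lib/ojdbc14.jar /usr/local/apache2/htdocs/fbing/WDK/WDK/Model/lib/java/

Note that in this output PostgreSQL.java also registers the Oracle driver, so the jar is needed even for a Postgres-only build.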
From: Sucheta T. <su...@vb...> - 2004-11-08 19:57:05
Hi Elisabetta,

How are you? It was nice visiting you all at CBIL.

I have one group here working on a pipeline to upload data to the RAD module. For that I was wondering if you can point us to a place where we can get the RAD schema and also sample data files for all the plugins for RAD.

That will be a great help.

Thanks in advance.

Have a great day.

Sucheta

At 11:47 AM 9/15/2003 -0400, Elisabetta Manduchi wrote:

> I've cvs committed a change to GUS/PluginMgr/lib/perl/Plugin.pm, suggested
> by Steve, to fix a problem occurring when registering a new plugin written
> using the most recent documentation features.
> Elisabetta
From: <pi...@pc...> - 2004-11-08 19:45:13
Hi Ed,

Having some difficult scheduling problems today, but I (& Steve) will dedicate time to this tomorrow or later today.

-Debbie

Quoting Ed Robinson <ed_...@be...>:

> The first step in writing such an adapter needs to be a document, though,
> which shows what fields, in what formats, go where in GUS. One of the main
> problems with the parsers is that they have been developed without a common
> document saying what kind of information goes where.
>
> To this end, I have a simple analysis of where our TCruzi and Crypto data
> are being loaded by the different parsers. I am attaching copies of these
> two brief documents in MS Word format. Presently this analysis is in Open
> Office format. I would like to use these to start developing a
> data-destination document that we can use as a standard for all further
> parser development.
>
> Also, I am not sure that this solution is really necessary for GFF format.
> Writing a GFF adapter involves two steps: 1. querying the data and
> 2. passing it correctly to BioPerl. The solution we have so far is simply
> to put the formatting information in the SQL query (it's one step). Of
> course, this is a solution that is ignorant of the GUS object model. It
> would be nice to embed this process in an object which maps from GUS
> objects to BioPerl, for a number of reasons. But I also think it might be
> something to put off until later, since the formatted SQL query is a
> quick-and-dirty time saver.
>
> -Ed
From: Ed R. <ed_...@be...> - 2004-11-05 21:01:45
> We put Bioperl data into GUS and we've retrospectively documented what
> goes where so all developers understand how things work. It also
> highlights anything that is "missing".

Can you post a copy of any documents you have to the list? This is the main problem we are having right now: there isn't an agreed-upon, documented mapping of what goes where in GUS. Your document would be great for starting a discussion to create such a standard document.

I earlier attached a comparison of where the GBParser and the TIGRXMLParser put our data. Let me know if I should send it again. I assume you are all using an EMBL-based parser.

-ed
From: Paul M. <pj...@sa...> - 2004-11-05 20:46:43
On 5 Nov 2004, at 17:40, Steve Fischer wrote:

> about gbrowse. it is good that Haiming has put together a prototype
> for loading gus data into gbrowse.
>
> but, as Aaron points out, we will likely be putting sophisticated
> data into gbrowse. i would rather start on a strong foundation than
> invest resources into a solution that we will grow out of.
>
> it should not be hard to put gus data into bioperl.

We would welcome GUS to Bioperl software :)

Ed has a point: mapping GUS objects to bioperl objects and back again needs some thought. We put Bioperl data into GUS and we've retrospectively documented what goes where, so all developers understand how things work. It also highlights anything that is "missing".

I hope GFF output has improved in the latest CVS version of Bioperl; the stable 1.4 version was not up to scratch for me, so I just wrote my own :(
From: Steve F. <sfi...@pc...> - 2004-11-05 17:40:01
about gbrowse. it is good that Haiming has put together a prototype for loading gus data into gbrowse.

but, as Aaron points out, we will likely be putting sophisticated data into gbrowse. i would rather start on a strong foundation than invest resources into a solution that we will grow out of.

it should not be hard to put gus data into bioperl.

steve
From: Ed R. <ed_...@be...> - 2004-11-05 17:17:34
The first step in writing such an adapter needs to be a document, though, which shows what fields, in what formats, go where in GUS. One of the main problems with the parsers is that they have been developed without a common document saying what kind of information goes where.

To this end, I have a simple analysis of where our TCruzi and Crypto data are being loaded by the different parsers. I am attaching copies of these two brief documents in MS Word format. Presently this analysis is in Open Office format. I would like to use these to start developing a data-destination document that we can use as a standard for all further parser development.

Also, I am not sure that this solution is really necessary for GFF format. Writing a GFF adapter involves two steps: 1. querying the data and 2. passing it correctly to BioPerl. The solution we have so far is simply to put the formatting information in the SQL query (it's one step). Of course, this is a solution that is ignorant of the GUS object model. It would be nice to embed this process in an object which maps from GUS objects to BioPerl, for a number of reasons. But I also think it might be something to put off until later, since the formatted SQL query is a quick-and-dirty time saver.

-Ed

Ed Robinson
255 Deerfield Rd
Bogart, GA 30622
(706)425-9181

Sweet Caroline
good times never seemed so good.
I've been inclined
to believe they never would.
  --Neil Diamond

We're just a bunch of idiots.
  --Johnny Damon
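The "formatting information in the SQL query" approach Ed describes might look roughly like this hypothetical Oracle sketch; the table and column names are illustrative, not the actual query in use:

    -- Sketch of building GFF2 lines directly in SQL (Oracle syntax).
    -- Table/column names are illustrative, not the real GUS query.
    SELECT s.source_id
           || CHR(9) || 'GUS'
           || CHR(9) || 'gene'
           || CHR(9) || l.start_min
           || CHR(9) || l.end_max
           || CHR(9) || '.'
           || CHR(9) || DECODE(l.is_reversed, 1, '-', '+')
           || CHR(9) || '.'
           || CHR(9) || 'gene "' || f.source_id || '"' AS gff_line
    FROM   dots.genefeature f, dots.nalocation l, dots.nasequence s
    WHERE  l.na_feature_id  = f.na_feature_id
    AND    f.na_sequence_id = s.na_sequence_id;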
From: Steve F. <sfi...@pc...> - 2004-11-05 16:24:58
folks-

We should immediately explore a GUS <--> bioperl adaptor.

we would use it for:
 - Genbank and TIGR XML -> GUS
 - GUS -> GBrowse
 - possibly GUS -> Chado

Here is what Aaron has to say about parsing Genbank, etc:

Bio::SeqIO::GenBank is the BioPerl parser for GenBank; it parses and represents all of it (split between Bio::Seq [sequence, id, accession, etc], Bio::SeqFeature [everything found in the feature table] and Bio::Annotation [comments, references, etc] objects). Similar parsers exist for practically all common sequence formats (including TIGR-XML and other genome annotation-relevant formats).

Here is what Aaron has to say about GBrowse:

IMO, the "best" way to generate (valid) GFF is to use BioPerl's tools for GFF manipulation: Bio::Tools::GFF in older BioPerls, and Bio::Feature::IO in the latest development release (due out any day now, as soon as I stop reading my email; for now, you can get it from CVS).

To use these tools, you build Bio::SeqFeature objects that represent the items you wish to dump as GFF; thus you can build complicated hierarchies of gene models, exons, CDS, UTR, etc, adding deeply structured attributes/annotations to each, and let the BioPerl GFF code figure out how to represent it (in GFF2 or GFF3) so that other tools (including Gbrowse) can read it.
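As a concrete illustration of the pieces Aaron describes, here is a minimal round-trip sketch, assuming a BioPerl ~1.4 install and a hypothetical input file name:

    # Sketch: parse a GenBank file and re-emit its features as GFF2.
    # 'example.gbk' is a hypothetical input file.
    use strict;
    use Bio::SeqIO;
    use Bio::Tools::GFF;

    my $in  = Bio::SeqIO->new(-file => 'example.gbk', -format => 'genbank');
    my $gff = Bio::Tools::GFF->new(-file => '>example.gff', -gff_version => 2);

    while (my $seq = $in->next_seq) {
        # Each feature-table entry is returned as a Bio::SeqFeature object.
        foreach my $feat ($seq->get_SeqFeatures) {
            $gff->write_feature($feat);
        }
    }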
From: Michael S. <msa...@pc...> - 2004-11-05 15:37:14
core.tableinfo should be populated at the same time the database is created. The rows are in $GUS_HOME/schema/oracle/core-TableInfo-rows.sql, and this is called from the $GUS_HOME/schema/oracle/create-db.sh script. Best I know, this script must be called manually, but that's likely already been done if any GUS table already exists.

--Mike
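If the TableInfo rows do turn out to be missing, loading just that file by hand would look something like this (a sketch; the connect string is a placeholder, and create-db.sh should be inspected before re-running it against an instance that already holds data):

    # Sketch only -- placeholder credentials/SID.
    cd $GUS_HOME/schema/oracle
    sqlplus gususer/password@gusdb @core-TableInfo-rows.sql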
From: Dave B. <db...@pc...> - 2004-11-05 15:19:17
Hey Federica,

Can you do two things for me to help me discover the source of your problem:

    % cd /opt/GUS/projects/GUS/Model/src/java/org/gusdb/model/DoTS

and let me know what is in that directory.

Also, in the database itself, run this query:

    select count(*) from core.tableinfo where name = 'BLATAlignment';

and let me know if it returns 0 or 1.

Other GUS folks: I think the problem is that Federica's TableInfo is not populated, as the only objects that are getting compiled are the hand-edited objects. The rest all depend on having entries in TableInfo, and I don't think that is the case. I am not that familiar with the GUS installation process; does TableInfo come prepopulated, or is there a step one has to do in order to get it there?

Dave
From: <fed...@bi...> - 2004-11-05 14:17:38
Hi!

I'm new to linux and to GUS; I've got some problems with the installation. I modified build.xml and sample.prop as the protocol suggests, but while compiling the GUS/Model component javac doesn't find the BLATAlignment class. It tells me:

    Installing GUS/Model
    [copy] Copying 20 files to /opt/GUS/lib/perl/GUS/Model
    [javac] Compiling 3 source files to /opt/GUS/projects/GUS/Model/classes
    [javac] /opt/GUS/projects/GUS/Model/src/java/org/gusdb/model/DoTS/BLATAlignment.java:26: cannot resolve symbol
    [javac] symbol  : class BLATAlignment_Row
    [javac] location: class org.gusdb.model.DoTS.BLATAlignment
    [javac] public class BLATAlignment extends BLATAlignment_Row {

I got the same problems with other classes as well.

Can anyone help me? Thank you very much; I have been trying to solve this problem for a long time!

Federica
From: Dave B. <db...@pc...> - 2004-11-03 21:00:43
Hey folks,

A follow-up on my email from last week suggesting adding a --commit flag to the pipeline API to handle committing plugins called by the pipeline. Since I have heard two people say this is a good idea, and no one say it is a bad idea, I am going to make this change.

What this means for you is that after I put this into CVS (will do so soon), and you check it out, you will have to add the --commit flag to your command line in order to have any plugins in your pipeline run in commit mode.

I will make this backward compatible in pipeline code by having the method 'runPluginNoCommit()' continue to run a plugin without committing the results, even if the --commit flag is set on the command line. But I would suggest that you go through and change 'runPluginNoCommit' in your pipelines to 'runPlugin' and use the --commit flag only to restrict committing results.

I will let everyone know when this change is, um, committed to CVS.

Dave
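A rough sketch of how the flag might be wired in Manager.pm; the method bodies below are hypothetical, not the actual implementation:

    # Hypothetical sketch only -- not the actual Manager.pm code.
    sub runPlugin {
        my ($self, $signal, $plugin, $args) = @_;
        # Honor the new command-line flag: commit only when --commit was given.
        my $commit = $self->{commit} ? ' --commit' : '';
        $self->runCmd("ga $plugin $args$commit");
    }

    sub runPluginNoCommit {
        my ($self, $signal, $plugin, $args) = @_;
        # Never commits, regardless of the command-line flag.
        $self->runCmd("ga $plugin $args");
    }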
From: Dave B. <db...@pc...> - 2004-10-26 19:15:16
I was also thinking it would be nice to give pipelines another parameter (in addition to the properties file parameter they have now). The second one would be the --commit flag. Right now, committing plugins and scripts is done in the pipeline code itself, and the user has to change the code when they just want to test the pipeline and the plugins contained therein.

If we had a commit flag on the command line instead of in the code, then we could avoid having to change the code for testing, as well as force the user to explicitly state they want to commit, which guards against accidentally running in commit mode.

How does that sound? It would require some tweaking in the Manager (currently we have the explicit methods available, 'runPlugin' and 'runPluginNoCommit', which would have to be changed to take the flag given on the command line), as well as in existing pipelines which are using the Manager, but it would be only a small amount of work.

Dave
From: Dave B. <db...@pc...> - 2004-10-26 16:00:45
> Would it be possible for the "get new release" stage to store the
> release ID it created somewhere for later access, such as in a file
> whose name was specified in the config file?

Hmm, maybe, and then the pipeline could just read in the release file when it starts up? That would definitely be one way to do it. However, then you essentially have two properties files, because I'm sure there will be times when you know which release you want to use and would have to set it in the database release file yourself. So it is a trade-off between that and having the pipeline stop and tell the user to manually update the properties file when it creates a new release. Both require minimum effort, but since these come up a lot, your idea is probably worth looking into.

Dave
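The file-based variant under discussion might look like this minimal sketch, with hypothetical helper names:

    # Hypothetical helpers for the "store the release ID in a file" idea.
    sub saveReleaseId {
        my ($file, $releaseId) = @_;
        open(RELEASE, ">$file") or die "can't write $file: $!";
        print RELEASE "$releaseId\n";
        close(RELEASE);
    }

    sub readReleaseId {
        my ($file) = @_;
        open(RELEASE, "<$file") or die "can't read $file: $!";
        chomp(my $releaseId = <RELEASE>);
        close(RELEASE);
        return $releaseId;
    }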
From: John I. <io...@pc...> - 2004-10-26 15:55:26
Dave,

That's a good point -- suppose Debbie is partway through the mouse build and I start ApiDots; she wouldn't want to switch Prodom releases in mid-build.

Would it be possible for the "get new release" stage to store the release ID it created somewhere for later access, such as in a file whose name was specified in the config file?

John
From: Dave B. <db...@pc...> - 2004-10-26 15:42:43
|
Hey John--see below.

On Tue, 26 Oct 2004, John Iodice wrote:

> Adding a pipeline step for new external database release IDs is a great idea. I have noticed in the past that when people insert a record "manually" (that is, with UpdateGusFromXML or SubmitRow), they tend to supply only the minimum info, to the extent of populating required fields with the string "unknown". I've been guilty of this myself. Hopefully, records created programmatically will document the DB release better.

Me too, but what I had in mind wasn't explicit functionality to load new DB releases, just a generic step in the Manager that exits and prints information for the user. I do remember writing a simple plugin that loads a new release into ExternalDatabaseRelease; it is in Common, but I don't think it adds much over SubmitRow except to name ExternalDatabaseRelease attributes explicitly as parameters.

> Question: is there a way to avoid editing the properties file to include the new release ID?
>
> In many cases, we want the latest release of the given external database. Could we create a utility that takes an external database ID and finds the ID of its latest release?

The only thing I could think of is to query for the latest release for a particular database, but this is dangerous; there could well be new releases loaded that for one reason or another we don't want to use (I think that is the case with GO right now; there is a new release loaded since the last time we made rules against GO terms). Maybe the latest release could be the default, though, unless the user overrides it with a property.

I'd be interested to hear Steve's plans for loading 3rd party data and whether they include handling entries in the XDBR table.

Dave

> If we could do something like that, it would not only simplify a manual step, it would completely automate it.
>
> On Mon, 2004-10-25 at 17:39, Dave Barkan wrote:
> > Hey all,
> >
> > I noticed something that is a bit of a pain in the neck when running our pipeline; whenever we load external data, we have to make sure that there is a new entry in the ExternalDatabaseRelease table for that data. [...] |
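Dave's default-unless-overridden idea could look roughly like the following Perl sketch. The method names (getProperty, getDbHandle) and the getLatestReleaseId helper are hypothetical stand-ins for whatever accessors the Manager actually provides, not the real GUS API.

    # Sketch: prefer an explicit pipeline property; otherwise fall back to
    # the newest loaded release. The accessor names (getProperty,
    # getDbHandle) and getLatestReleaseId are assumed for illustration.
    sub resolveDbReleaseId {
        my ($mgr, $propertyName, $externalDatabaseId) = @_;

        # An explicit entry in the pipeline properties file always wins.
        my $releaseId = $mgr->getProperty($propertyName);
        return $releaseId if defined $releaseId && $releaseId ne '';

        # Otherwise take the latest release, accepting the risk noted
        # above: the newest loaded release may not be the intended one.
        return getLatestReleaseId($mgr->getDbHandle(), $externalDatabaseId);
    }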
From: <pi...@pc...> - 2004-10-26 15:31:29
|
Hi Ed,

The hierarchy for choosing the exemplar id is swiss-prot->pir->longest description (sorry for the etc. before, I had to go look in LoadNRDB to remember). This hierarchy was chosen because we preferred having links to swiss-prot; if that was not available, then to PIR; and if neither was available, we would use the entry with what we hoped was the most informative (longest) description. It was an internal decision, but I think a reasonable one.

Debbie

Quoting Ed Robinson <ed_...@be...>:

> I knew it was a unique sequence over multiple redundant sources, I didn't realize that they retained all the source ids with it.
>
> What is the full hierarchy for choosing the exemplar? Is this hierarchy a GUS internal hierarchy, or is it a hierarchy used by the larger community (e.g., we start with GenBank, and then the next two primary repositories, EMBL and DDBJ, etc.)?
>
> Thanks, this is really helping me understand the logic internal to this plugin.
>
> -Ed
>
> > [...] |
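As a rough illustration of the hierarchy Debbie describes (SWISS-PROT first, then PIR, then longest description), exemplar selection might look like the Perl sketch below. The record layout and source labels are assumptions for illustration, not LoadNRDB's actual internals.

    # Sketch: pick the exemplar from a list of candidate entries, each a
    # hash ref like { source => 'swiss-prot', description => '...' }.
    # The source labels and data layout are illustrative only.
    sub pickExemplar {
        my @candidates = @_;

        # Prefer SWISS-PROT, then PIR.
        for my $preferred ('swiss-prot', 'pir') {
            my ($hit) = grep { $_->{source} eq $preferred } @candidates;
            return $hit if $hit;
        }

        # Neither available: fall back to the entry with the longest
        # (hopefully most informative) description.
        my ($longest) = sort {
            length($b->{description}) <=> length($a->{description})
        } @candidates;
        return $longest;
    }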
From: John I. <io...@pc...> - 2004-10-26 15:18:35
|
Adding a pipeline step for new external database release IDs is a great idea. I have noticed in the past that when people insert a record "manually" (that is, with UpdateGusFromXML or SubmitRow), they tend to supply only the minimum info, to the extent of populating required fields with the string "unknown". I've been guilty of this myself. Hopefully, records created programmatically will document the DB release better.

Question: is there a way to avoid editing the properties file to include the new release ID?

In many cases, we want the latest release of the given external database. Could we create a utility that takes an external database ID and finds the ID of its latest release? If we could do something like that, it would not only simplify a manual step, it would completely automate it.

On Mon, 2004-10-25 at 17:39, Dave Barkan wrote:

> Hey all,
>
> I noticed something that is a bit of a pain in the neck when running our pipeline; whenever we load external data, we have to make sure that there is a new entry in the ExternalDatabaseRelease table for that data. The way I've always handled this is to create those entries by hand before the pipeline runs and set the database release IDs in the pipeline properties file.
>
> I think a better way would be to have a step in the Pipeline that does it for you for each database release you have to load. Values for the database release entry could either go in the properties file or be parsed from the file you are loading, depending on availability.
>
> This is easy enough to implement, but often we need to use the database release ids later in the pipeline. There isn't any way to automatically set these as internal properties; we really need to add them by hand to the pipeline properties file.
>
> I am going to add a method to Manager.pm that the pipeline can call, named "waitForUser" (or something similar). It will just exit the pipeline with a message, and in this case it can say "Please set the following property in the pipeline properties file: flyDB=XXXX". It is sort of a generic implementation of the 'exitToLiniac' method that we have already. Then the user can set this property and start the pipeline again.
>
> Dave |
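A utility along the lines John suggests could be a small DBI query against the release table. The sketch below assumes the SRes.ExternalDatabaseRelease table discussed in this thread and treats the highest external_database_release_id as the newest release; ordering by a version or date column may be more appropriate in practice, and as Dave's reply above points out, blindly taking the latest release can pick up a release you did not intend to use.

    use strict;
    use warnings;
    use DBI;

    # Sketch: look up the latest release id for an external database.
    # Assumes MAX(primary key) approximates "latest"; a version or
    # release-date column would be a safer ordering in practice.
    sub getLatestReleaseId {
        my ($dbh, $externalDatabaseId) = @_;

        my $sql = 'SELECT MAX(external_database_release_id) '
                . 'FROM SRes.ExternalDatabaseRelease '
                . 'WHERE external_database_id = ?';

        my ($releaseId) = $dbh->selectrow_array($sql, undef, $externalDatabaseId);
        return $releaseId;    # undef if no releases are loaded
    }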
From: <pi...@pc...> - 2004-10-26 14:52:54
|
Hi Ed,

The point of nrdb is that it is supposed to consolidate ids from multiple sources that represent the same sequence, thus creating a non-redundant database. The plugin was written to put each of the ids into dots.nrdbentry, where the multiple rows from a single record refer to a single sequence in dots.externalaasequence. One of the ids would be the exemplar in dots.externalaasequence, based on a hierarchy (swiss-prot->pir->etc.).

This is one of the reasons LoadNRDB is rather complicated: it updates these two tables. In addition, LoadNRDB attaches a taxon_id to each row of NRDBEntry; the taxon is missing from the nr records from NCBI but is provided via a tax_id in the protein-gi to ncbi_tax_id file.

Debbie

Quoting Ed Robinson <ed_...@be...>:

> Thanks. Everything is working fine then. I didn't realize that was part of the nrdb formatting.
>
> What exactly is the rule for using the escape header in these cases? Are they multiple ids associated with one sequence? And what exactly does NRDBLoad do with these? Does it just enter the last id number, or each of them?
>
> -ed
>
> > From: pi...@pc...
> > Date: 2004/10/25 Mon PM 04:01:06 EDT
> > To: Ed Robinson <ed_...@be...>
> > Subject: Re: Corrupt NRDB?
> >
> > Hi Ed,
> >
> > The last time I downloaded nrdb was on September 20 and the entry with source_id = AAN61313.1 looked fine:
> >
> > >gi|24430922|gb|AAN61313.1| cytochrome oxidase subunit III [Cicindela aureola]gi|24430920|gb|AAN61312.1| cytochrome oxidase subunit III [Cicindela hemichrysea]
> > GFFHSSLSPTVELGAMWPPAGISPFNPLQIPLLNTLILLTSGITVTWAHHGLMENNYTQALQGLFFTVILGIYFTALQAY
> > EYFESPFTIADSVYGSTFFMATGFHGLHVIIGTTFLLVCLMRHWMNHFSSIHHFGFEAAAWYWHFVDVVWLFLYISIYWW
> >
> > Of course, you can't see the ^A here that separates the two sources. I didn't have a problem with that file.
> >
> > I don't see the oddness you see, but in the past I have encountered entries that did not conform to the format, though not in the way you are describing. If you have downloaded multiple times, I would be suspicious of NCBI. The current version of LoadNRDB.pm handles failures by printing the sequence to STDERR. I printed the sequence because that seemed to be the only reliably present part of a record. The whole thing should fail if the number of failures is over 100. This wouldn't work well with what you are describing, but perhaps it could get you close enough to the record(s) to find the error(s).
> >
> > -Debbie
> >
> > Quoting Ed Robinson <ed_...@be...>:
> >
> > > Debbie,
> > >
> > > I have been taking a very close look at the NR database, and I have found a number of bad entries near the tail end of the file. Generally, these show up as entries which have no sequence and, instead of having a ">" to start the next entry, the bad entries run right into the next entry. In some editors, there are diamonds where the carriage return should be. I have downloaded the nr file a number of ways, but I am still finding these errors. Can you tell me if you find odd things with your NRDB also? A good set of IDs to look at are the following:
> > >
> > > GFFHSSLSPTVELGAMWPPAGISPFNPLQIPLLNTLILLTSGITVTWAHHGLMENNYTQALQGLFFTVILGIYFTALQAYEYFESPFTIADSVYGSTFFMATGFHGLHVIIGTTFLLVCLMRHWMNHFSSIHHFGFEAAAWYWHFV|146
> > > ||176382|37| cytochrome oxidase subunit III [Cicindela aureola]gi|24430922|AAN61313.1|ExternalAASequence|1|1|1|1|1|1|1|1|1|0|GFFHSSLSPTVELGAMWPPAGISPFNPLQIPLLNTLILLTSGITVTWAHHGLMENNYTQALQGLFFTVILGIYFTALQAYEYFESPFTIADSVYGSTFFMATGFHGLHVIIGTTFLLVCLMRHWMNHFSSIHHFGFEAAAWYWHFVDVVWLFLYISIYWWGS|162
> > >
> > > Let me know if you also have problems.
> > >
> > > thanks
> > >
> > > -Ed
> > >
> > > > From: pi...@pc...
> > > > Date: 2004/10/16 Sat AM 11:51:59 EDT
> > > > To: Ed Robinson <ed_...@be...>
> > > > CC: gus...@li...
> > > > Subject: Re: [Gusdev-gusdev] New LoadNRDB & Consolidated GUS install package
> > > >
> > > > I recently (within the last 3 weeks) loaded an entirely new version of nrdb and it took less than 24 hours. This should have been equivalent to a first load. I think that something else was wrong when you ran the plugin, possibly with the database (e.g. indexes missing, a need to update statistics).
> > > >
> > > > I agree that LoadNRDB needs an upgrade, but I think its poor performance in this case is due to some other problem.
> > > >
> > > > -Debbie
> > > >
> > > > Quoting Ed Robinson <ed_...@be...>:
> > > >
> > > > > As many of you know, we have been doing quite a few GUS installs down here, and this has pushed me to try and simplify this process as much as possible. I am now far enough along on a couple of things to bring them up on the list.
> > > > >
> > > > > First, installing NRDB the first time in GUS is a horribly painful process using the existing plugin, and this pain seems to be needless since it is an empty database. To this end, I have written a couple of scripts and a batch process for Oracle SQLLoader which accomplishes in about an hour what takes a few weeks with the plugin. However, to make this work, I have to reserve early rows in a number of SRes tables for meaningful entries in columns such as row_alg_invocation_id. Hence, my first discussion item: should we consider reserving early values in a number of the SRes tables to serve as standard values? We already require that some rows be entered in GUS early on to make some things work, such as LoadReviewType. It would seem that we should pre-populate some of these tables with basic values that we can then refer to as standard values for bootstrapping operations such as a bulk load of NRDB. Does anyone else see any value in this and, if so, what fields should we create standard entries for? Also, is there anything else that would be amenable to a batch process for bootstrapping? (Note: I do NOT think any organism-specific data is amenable to bootstrapping. That is what an (object based) pipeline is for. Also, this batch process is only good if you are using Oracle, but a similar process can be written for other databases too.)
> > > > >
> > > > > This also gets me to some of the other scripts we use to bootstrap GUS, such as the predefined set of ExternalDatabases we load. The XML which I use to load this is pretty messy, and not well documented. Does anyone mind if I clean it up? If the answer is yes, is there anything I should know about this file? It seems that the XML for this table load is a nice one to clean up and make standard for GUS installations all over, since it will push GUS to be standardized across installations. What else should we standardize?
> > > > >
> > > > > Which now brings me to the last item I want to open up, which is that I am close to completing a full GUS installation wrapper script which essentially makes a GUS installation a click-and-play operation. One of our deliverables is supposed to be an easy-to-install GUS package. Regardless of the state of GUS with regard to an official release, this script is going to make my life a whole lot easier. I figure it might be nice to package the whole kit-n-kaboodle up into one nice fat tarball with a simple set of instructions for download from someplace. Is anyone else interested in this?
> > > > >
> > > > > Finally, one quick question I have about the NRDB load is that working on it showed me that the description field in AASequenceImp is too short for many of the descriptions in NRDB. Do we want to up the description field size for dots.aasequenceimp?
> > > > >
> > > > > Anyway, any feedback on this would be appreciated.
> > > > >
> > > > > -Ed R
>
> Ed Robinson
> 255 Deerfield Rd
> Bogart, GA 30622
> (706)425-9181
>
> --Learn more about the face of your neighbor, and less about your own.
> -Sargent Shriver |
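For readers unfamiliar with the format Debbie refers to: nr packs several source ids for the same sequence into one FASTA defline, separated by a Control-A (\x01) character, the ^A she mentions. A parsing sketch in Perl follows; the gi|<number>|<db>|<accession>| layout is the standard NCBI defline convention, but treat the details as illustrative rather than as LoadNRDB's actual parser.

    # Sketch: split an nr defline into its per-source entries. Deflines like
    #   >gi|24430922|gb|AAN61313.1| desc one^Agi|24430920|gb|AAN61312.1| desc two
    # carry one id block per redundant source, separated by Control-A (\x01).
    sub parseNrDefline {
        my ($defline) = @_;
        $defline =~ s/^>//;

        my @entries;
        for my $chunk (split /\x01/, $defline) {
            # Standard NCBI layout: gi|<gi number>|<db tag>|<accession>| <description>
            if ($chunk =~ /^gi\|(\d+)\|(\w+)\|([^|]*)\|\s*(.*)$/) {
                push @entries, {
                    gi          => $1,
                    source_db   => $2,
                    source_id   => $3,
                    description => $4,
                };
            }
        }
        return @entries;    # one element per consolidated source id
    }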
From: Jeetendra S. <so...@vb...> - 2004-10-26 13:58:10
|
Hi all,

When we try to upload an existing sequence (and features) using the GBParser, it updates the features of that sequence, i.e. it removes the old ones and inserts the new ones. However, is there a way to just ADD features (of any type) to an existing sequence in the GUS DB?

Thanks a lot for your help,

Jeetendra. |
From: Steve F. <sfi...@pc...> - 2004-10-25 23:52:43
|
Dave- ok, but the good news is that we will be renovating this department, and we have in mind an improved facility for loading 3rd party data.

steve

Dave Barkan wrote:

> Hey all,
>
> I noticed something that is a bit of a pain in the neck when running our pipeline; whenever we load external data, we have to make sure that there is a new entry in the ExternalDatabaseRelease table for that data. The way I've always handled this is to create those entries by hand before the pipeline runs and set the database release IDs in the pipeline properties file.
>
> I think a better way would be to have a step in the Pipeline that does it for you for each database release you have to load. Values for the database release entry could either go in the properties file or be parsed from the file you are loading, depending on availability.
>
> This is easy enough to implement, but often we need to use the database release ids later in the pipeline. There isn't any way to automatically set these as internal properties; we really need to add them by hand to the pipeline properties file.
>
> I am going to add a method to Manager.pm that the pipeline can call, named "waitForUser" (or something similar). It will just exit the pipeline with a message, and in this case it can say "Please set the following property in the pipeline properties file: flyDB=XXXX". It is sort of a generic implementation of the 'exitToLiniac' method that we have already. Then the user can set this property and start the pipeline again.
>
> Dave |
From: Dave B. <db...@pc...> - 2004-10-25 21:39:19
|
Hey all,

I noticed something that is a bit of a pain in the neck when running our pipeline; whenever we load external data, we have to make sure that there is a new entry in the ExternalDatabaseRelease table for that data. The way I've always handled this is to create those entries by hand before the pipeline runs and set the database release IDs in the pipeline properties file.

I think a better way would be to have a step in the Pipeline that does it for you for each database release you have to load. Values for the database release entry could either go in the properties file or be parsed from the file you are loading, depending on availability.

This is easy enough to implement, but often we need to use the database release ids later in the pipeline. There isn't any way to automatically set these as internal properties; we really need to add them by hand to the pipeline properties file.

I am going to add a method to Manager.pm that the pipeline can call, named "waitForUser" (or something similar). It will just exit the pipeline with a message, and in this case it can say "Please set the following property in the pipeline properties file: flyDB=XXXX". It is sort of a generic implementation of the 'exitToLiniac' method that we have already. Then the user can set this property and start the pipeline again.

Dave |
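A minimal sketch of the method Dave describes might look like the following; the log() call and the restart behavior noted in the comments are assumptions for illustration, not the real Manager.pm internals.

    # Sketch: a generic "stop and wait for the user" step for Manager.pm.
    # The log() method and the exit convention are illustrative assumptions.
    sub waitForUser {
        my ($self, $message) = @_;

        $self->log("Pipeline paused: $message");
        $self->log("After making the change, restart the pipeline; "
                 . "steps already completed will be skipped.");

        # Exit cleanly: this is an expected stop, not an error.
        exit(0);
    }

A pipeline step would then call something like: $mgr->waitForUser('Please set the following property in the pipeline properties file: flyDB=XXXX');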
From: Michael S. <msa...@pc...> - 2004-10-21 16:36:49
|
Paul's comments are correct. The information for the new GUSDBA mailing list is at:

https://mail.pcbi.upenn.edu/mailman/listinfo/gusdba

--Mike

Paul Mooney wrote:

> Ed,
>
> I've increased a lot of our description fields to 2048, but committing them to CVS has no real effect. Angel and Co. dump the schema they have from the Oracle DB installed at CBIL. I think this issue has already been addressed on this mailing list - I can't find the recent email about a new install system for GUS that will address this...
>
> Paul/
>
> On 20 Oct 2004, at 19:46, Ed Robinson wrote:
>
>> NRDB has a number of descriptions which are longer than the varchar(255) field GUS allots for them. Presently these are being truncated when they are entered into the database. Would anyone be opposed to upping the description field size to fit these in?
>>
>> -Ed R
>>
>> Ed Robinson
>> 255 Deerfield Rd
>> Bogart, GA 30622
>> (706)425-9181
>>
>> --Learn more about the face of your neighbor, and less about your own.
>> -Sargent Shriver |
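For reference, the column change Ed and Paul discuss is a single statement on an Oracle-backed instance; a sketch via Perl DBI is below. The connect string and credentials are placeholders, and per Paul's point, the same change would also need to land in the schema definition that installs are dumped from, or it will be lost.

    use strict;
    use warnings;
    use DBI;

    # Sketch: widen the description column on an Oracle-backed GUS instance.
    # The connect string and credentials below are placeholders.
    my $dbh = DBI->connect('dbi:Oracle:GUSDB', 'gus_user', 'secret',
                           { RaiseError => 1, AutoCommit => 1 });

    # 2048 matches the size Paul mentions; adjust as needed.
    $dbh->do('ALTER TABLE dots.AASequenceImp MODIFY (description VARCHAR2(2048))');

    $dbh->disconnect;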