From: Michael S. <msa...@pc...> - 2005-07-18 16:17:32
|
Please note the following repository changes, currently being made, which were discussed and approved at the Workshop:

  GUS/trunk/Schema has been promoted to its own project, GusSchema.
  GUS has been renamed to GusAppFramework.

It will most likely be necessary to re-check out or switch to the new directories. This should not affect any local code.

In the GUS repository:

  svn switch https://www.cbil.upenn.edu/svn/gus/GusAppFramework/trunk

--Mike |
From: Steve F. <sfi...@pc...> - 2005-07-17 20:37:01
|
gus folks-

we are encountering a type of data that we haven't had to deal with yet, and I think the best way to handle it is a change to the schema. The change is:

  add: na_sequence_id to NALocation
  remove: na_sequence_id from NAFeature and all its subclasses

In other words, the location of a feature specifies what sequence it belongs to, rather than the feature specifying that directly itself. This enables a feature to exist on more than one sequence.

The data we have is scaffolds and a genetic map. We use the map to order and orient the scaffolds. We also submit the scaffolds to our analysis pipeline, which produces features on the scaffolds. We store the scaffolds as SequencePieces, and the chromosome as a VirtualSequence.

We would like our presentation layer, e.g. GBrowse, to be able to display the features on the chromosome as well as on the scaffolds, with correctly transformed locations. This means that we have to project the SequencePiece features onto the VirtualSequence.

We have considered many alternative ways of doing this projection (Aaron and I and others). It is now clear to me that the most elegant and practical approach is to allow NAFeatures to have NALocations on more than one Sequence. Given that schema, we can add a final analysis step to our pipeline that easily does the projection by creating a new set of NALocations that attach the NAFeatures from the SequencePieces to the VirtualSequence.

The downsides that I see to this approach are:
1. a change to the schema
2. in the case that a program wants to iterate across the features of a sequence without regard to their location, the query will have an additional join. I think this is probably a rare case.

I would propose this as a feature enhancement to GUS Schema 3.6.

Encouragements? Objections?

thanks,
steve |
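The projection step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not GUS code: the class and field names (`virtual_start`, `length`, `strand`) are hypothetical stand-ins for whatever SequencePiece and NALocation actually store, and coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequencePiece:
    """Placement of one scaffold on a virtual sequence (hypothetical fields)."""
    piece_sequence_id: int    # the scaffold
    virtual_sequence_id: int  # the chromosome
    virtual_start: int        # 1-based virtual position of the piece's leftmost base
    length: int               # scaffold length in bases
    strand: int               # +1 if placed forward, -1 if reverse-complemented


@dataclass(frozen=True)
class NALocation:
    """A location that names its own sequence, as in the proposed schema."""
    na_feature_id: int
    na_sequence_id: int
    start: int                # 1-based, inclusive
    end: int                  # 1-based, inclusive
    strand: int               # +1 or -1


def project(loc: NALocation, piece: SequencePiece) -> NALocation:
    """Return a new location for the same feature on the virtual sequence."""
    if loc.na_sequence_id != piece.piece_sequence_id:
        raise ValueError("location is not on this sequence piece")
    if piece.strand == 1:
        start = piece.virtual_start + loc.start - 1
        end = piece.virtual_start + loc.end - 1
    else:
        # A reversed piece mirrors coordinates, so start and end swap roles.
        start = piece.virtual_start + piece.length - loc.end
        end = piece.virtual_start + piece.length - loc.start
    return NALocation(loc.na_feature_id, piece.virtual_sequence_id,
                      start, end, loc.strand * piece.strand)
```

Because the projected row keeps the same `na_feature_id`, the feature ends up with one location per sequence, which is the property the schema change is meant to allow.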
From: Elisabetta M. <man...@pc...> - 2005-07-16 20:28:21
|
Apologies if you receive this more than once. I will submit this to the tracker as soon as I solve a problem accessing the latter. In any case, it appears that the StudyBioMaterial table should belong to Study, not to RAD, because it's a linking table between two tables in Study. (The tables StudyAssay and AssayBioMaterial should, however, stay in RAD where they currently are.) Elisabetta |
From: <sfi...@pc...> - 2005-07-16 15:07:39
|
features have other relations besides those to themselves. for example, they have relations to GeneInstance and AnalysisAlgorithm and ReviewStatus, etc.

I'll have to look into it more. But my concern is that the cloning of a feature is not that straightforward.

steve

Quoting "Aaron J. Mackey" <am...@pc...>:

> We solved it by just copying the feature trees directly at the
> NAFeatureImp table ... 5 or so lines of code, and no big deal.
>
> Remember that we may want to actually do this coordinate mapping
> along multiple alternative coordinate systems, so the post-processing
> is really more of an entirely separate InsertNewCoordinateSystem.pm
> plugin that takes (source, target) tuples that identify a given
> mapping found in SequencePiece. Better plugin name suggestions
> welcome.
>
> -Aaron
>
> steve wrote:
>> i agree that we need to project copies of the scaffold features onto
>> the virtual chromosome.
>>
>> but, i want to point out that this may be a bit tricky if done as a
>> post-process. the reason is that a "feature" spans multiple
>> tables. so, the copying of a feature means the traversal of a tree
>> of its child objects. how does the post-process program know what
>> that tree is?
>>
>> one way is for the programmer of that program to use human knowledge
>> of the schema to produce the possible tree, and traverse it.
>>
>> another way is to use schema information to generate the tree.
>>
>> an alternative approach would be:
>> 1. create the virtual sequence as a pre-process that does not write
>> features.
>> 2. any plugin that writes features has the option to take a virtual
>> sequence. if given that, it would read all the virtual sequence's
>> pieces to determine their offset. it would use their source_id to
>> correlate them with the input, and as the features are created
>> project them simultaneously on both the piece and the virtual
>> sequence.
>>
>> that sounds kind of complicated, so probably the post-process is better.
>>
>> it's kind of late and i'm kind of foggy...
>>
>> steve
>>
>> Chris Stoeckert wrote:
>>
>>> No, these are different features because they are spans on
>>> different sequences (one scaffold and one virtual) so you won't
>>> get two locations based on this for the same na_feature_id.
>>> NAFeature has the na_sequence_id which tells you whether it is the
>>> scaffold or virtual sequence. If these are Gene, RNA, or Protein
>>> features then you can say that they are the same conceptual
>>> feature through the central dogma and instance tables. If they are
>>> features like Exon, then you could infer this as you say by
>>> parent_id, source_id, etc.
>>>
>>> Chris
>>>
>>> On Jul 14, 2005, at 5:52 PM, Aaron J. Mackey wrote:
>>>
>>>> Exactly. No logic is required, because we simply copy any and all
>>>> NALocation objects attached to the sequences and generate new
>>>> NALocation objects that point to the virtual sequence, with new
>>>> coordinate/strand, but all other foreign keys remain the same
>>>> (i.e. children of the same feature).
>>>>
>>>> Hmm, that means that if you blindly pull locations for a given
>>>> feature, you will get two locations, not just one (so you'll need
>>>> to specify which reference sequence you wish to obtain the
>>>> location on).
>>>>
>>>> -Aaron
>>>>
>>>> On Jul 14, 2005, at 5:41 PM, Chris Stoeckert wrote:
>>>>
>>>>> Let's see if I understand your proposal. Generate features and
>>>>> locations based on the static scaffold sequence coordinates. Then
>>>>> at the end of the pipeline generate the same (conceptual)
>>>>> features with locations based on the virtual sequence
>>>>> coordinates. That makes sense to me. The advantage is that you
>>>>> have both, one that is stable (scaffold) and one that can be
>>>>> regenerated as needed (virtual) but stored for convenience. I
>>>>> don't really see a disadvantage - sure it's twice as many rows
>>>>> but if you materialize a view you are adding these anyway.
>>>>>
>>>>> Chris
>>>>>
>>>>> On Jul 14, 2005, at 3:50 PM, Aaron J. Mackey wrote:
>>>>>
>>>>>> As we struggle to use GUS the "right way", this is throwing us
>>>>>> for a loop. On the one hand, our GUS client applications want
>>>>>> to see features in the coordinate system of the assembly (i.e.
>>>>>> the virtual sequence) -- on the other hand, it makes sense from
>>>>>> a data integrity viewpoint to only load/store feature
>>>>>> coordinates with respect to the static underlying scaffold
>>>>>> coordinates, since the scaffold-to-chromosome mapping (as
>>>>>> defined by DoTS.SequencePiece) may change over time.
>>>>>>
>>>>>> One option is to instantiate a read-only materialized view of
>>>>>> the NALocation for clients to use.
>>>>>>
>>>>>> A second option (which we've just discussed, and people seem to
>>>>>> like) is for the InsertVirtualSequenceFromMapping plugin we just
>>>>>> wrote to (re)generate duplicate versions of all NALocations
>>>>>> attached to a given SequencePiece in the new coordinate system
>>>>>> (requiring the virtual sequence building to be the last step in
>>>>>> our pipeline, instead of the first).
>>>>>>
>>>>>> -Aaron
>>>>>>
>>>>>> On Jul 14, 2005, at 2:53 PM, Chris Stoeckert wrote:
>>>>>>
>>>>>>> Hi Aaron,
>>>>>>> I don't have a strong argument for either way. In terms of
>>>>>>> coordinate mapping utilities, I'm not aware of one so certainly
>>>>>>> would welcome yours (but if others know of ones please speak
>>>>>>> up).
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On Jul 14, 2005, at 11:13 AM, Aaron J. Mackey wrote:
>>>>>>>
>>>>>>>> Thanks Chris, I got it.
>>>>>>>>
>>>>>>>> If we are going to start hanging features off these, should we
>>>>>>>> hang them off the virtual chromosome sequence entries, or the
>>>>>>>> scaffold entries in externalnasequence? Would it make sense
>>>>>>>> to "codify" this usage with associated PL/SQL code to
>>>>>>>> reconstruct virtual sequence and associated features in the
>>>>>>>> virtual coordinate space? I guess one way to do this would
>>>>>>>> be to have Virtual*Feature read-only views (and thus target
>>>>>>>> everything to the "real" coordinate system such that future
>>>>>>>> rebuilds of the virtual sequence would not require
>>>>>>>> recalculation of feature locations)?
>>>>>>>>
>>>>>>>> Relatedly, is there coordinate mapping code already in some
>>>>>>>> GUS utility module (if not, I'm happy to contribute mine,
>>>>>>>> based on BioPerl's powerful Bio::Coordinate::Map framework)?
>>>>>>>>
>>>>>>>> -Aaron
>>>>>>>>
>>>>>>>> On Jul 14, 2005, at 11:05 AM, Chris Stoeckert wrote:
>>>>>>>>
>>>>>>>>> Hi Aaron,
>>>>>>>>>
>>>>>>>>>> 1) VirtualSequence has a required sequence_version attribute
>>>>>>>>>> - what is this for? Is this redundant to
>>>>>>>>>> external_database_release_id?
>>>>>>>>>
>>>>>>>>> This is a superclass attribute inherited by all NASequence
>>>>>>>>> views. My recollection is that individual GenBank sequence
>>>>>>>>> entries have version tags at the end of accessions as in
>>>>>>>>> "DQ094190.1" for Toxoplasma gondii ATP-binding cassette
>>>>>>>>> protein subfamily B member 3 (found in VERSION field).
>>>>>>>>>
>>>>>>>>>> 2) VirtualSequence has a clob for storing the assembled
>>>>>>>>>> sequence (I suspect), but the Perl object layer doesn't use
>>>>>>>>>> this slot, instead rebuilding the sequence from the sequence
>>>>>>>>>> pieces. Am I correct in this usage, or should I not, in
>>>>>>>>>> fact, be storing the assembled sequence in VirtualSequence?
>>>>>>>>>
>>>>>>>>> Again this is a superclass attribute. I think using it is
>>>>>>>>> optional. Reasons not to use it are that the virtual sequence
>>>>>>>>> is hard to represent as a single entity (e.g., contains
>>>>>>>>> gaps) or is very large and has a significant overhead cost
>>>>>>>>> of storing what can be easily regenerated (and avoid
>>>>>>>>> denormalization). Reasons to use are for convenience and
>>>>>>>>> efficiency of retrieving the sequence without the need to
>>>>>>>>> rebuild.
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> -Aaron
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Aaron J. Mackey, Ph.D.
>>>>>>>>>> Project Manager, ApiDB Bioinformatics Resource Center
>>>>>>>>>> Penn Genomics Institute, University of Pennsylvania
>>>>>>>>>> email: am...@pc...
>>>>>>>>>> office: 215-898-1205
>>>>>>>>>> fax: 215-746-6697
>>>>>>>>>> postal: Penn Genomics Institute
>>>>>>>>>> Goddard Labs 212
>>>>>>>>>> 415 S. University Avenue
>>>>>>>>>> Philadelphia, PA 19104-6017 |
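To make the concern about cloning concrete, here is a minimal sketch of copying a feature tree by walking parent/child links. It is an assumption-laden illustration, not the NAFeatureImp approach mentioned in the thread: the `Feature` fields are simplified, and `clone_tree` is a hypothetical helper. Note that it copies only the feature rows themselves; rows in other tables that point at the original features (for example GeneInstance) are not handled, which is exactly the open question raised above.

```python
from dataclasses import dataclass, replace
from itertools import count
from typing import Iterator, List, Optional


@dataclass(frozen=True)
class Feature:
    """Simplified stand-in for an NAFeature row (hypothetical fields)."""
    na_feature_id: int
    parent_id: Optional[int]
    name: str
    na_sequence_id: int


def clone_tree(features: List[Feature], root_id: int,
               new_sequence_id: int, new_ids: Iterator[int]) -> List[Feature]:
    """Copy the root feature and all its descendants onto another sequence."""
    by_id = {f.na_feature_id: f for f in features}
    children = {}
    for f in features:
        children.setdefault(f.parent_id, []).append(f)

    copies = []

    def visit(feature: Feature, new_parent_id: Optional[int]) -> None:
        new_id = next(new_ids)
        copies.append(replace(feature, na_feature_id=new_id,
                              parent_id=new_parent_id,
                              na_sequence_id=new_sequence_id))
        # Children of the original become children of the copy.
        for child in children.get(feature.na_feature_id, []):
            visit(child, new_id)

    visit(by_id[root_id], None)  # the copied root is given no parent
    return copies
```

A caller would pass something like `count(100)` as `new_ids`; in a real database the new primary keys would come from a sequence instead.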
From: Aaron J. M. <am...@pc...> - 2005-07-16 14:38:00
|
We solved it by just copying the feature trees directly at the NAFeatureImp table ... 5 or so lines of code, and no big deal.

Remember that we may want to actually do this coordinate mapping along multiple alternative coordinate systems, so the post-processing is really more of an entirely separate InsertNewCoordinateSystem.pm plugin that takes (source, target) tuples that identify a given mapping found in SequencePiece. Better plugin name suggestions welcome.

-Aaron |
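The idea of driving the projection from (source, target) tuples might look something like the sketch below. Everything here is hypothetical: the `Piece` fields, the tuple layout of a location row, and the function names are assumptions made for illustration, and the real plugin would read its mappings from SequencePiece. Coordinates are assumed to be 1-based and inclusive.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Piece(NamedTuple):
    """How a source sequence sits on a target sequence (hypothetical)."""
    offset: int   # 1-based target position of the source's leftmost base
    length: int   # length of the source sequence
    strand: int   # +1 forward, -1 reverse-complemented


# (na_feature_id, na_sequence_id, start, end, strand)
LocationRow = Tuple[int, int, int, int, int]


def project_span(start: int, end: int, strand: int,
                 piece: Piece) -> Tuple[int, int, int]:
    """Map a span on the source into the target's coordinates."""
    if piece.strand == 1:
        return (piece.offset + start - 1, piece.offset + end - 1, strand)
    # Reversed placement mirrors the span and flips its strand.
    return (piece.offset + piece.length - end,
            piece.offset + piece.length - start,
            -strand)


def insert_new_coordinate_system(
        locations: List[LocationRow],
        pieces: Dict[Tuple[int, int], Piece],
        mappings: Iterable[Tuple[int, int]]) -> List[LocationRow]:
    """Build new location rows for each requested (source, target) mapping."""
    new_rows = []
    for source, target in mappings:
        piece = pieces[(source, target)]
        for feature_id, seq_id, start, end, strand in locations:
            if seq_id == source:
                new_rows.append((feature_id, target)
                                + project_span(start, end, strand, piece))
    return new_rows
```

Keying the pieces by (source, target) lets the same scaffold be projected onto several alternative coordinate systems in one pass, without touching the original scaffold-based rows.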
From: steve <sfi...@pc...> - 2005-07-16 03:39:07
|
i agree that we need to project copies of the scaffold features onto the virtual chromosome.

but, i want to point out that this may be a bit tricky if done as a post-process. the reason is that a "feature" spans multiple tables. so, the copying of a feature means the traversal of a tree of its child objects. how does the post-process program know what that tree is?

one way is for the programmer of that program to use human knowledge of the schema to produce the possible tree, and traverse it.

another way is to use schema information to generate the tree.

an alternative approach would be:
1. create the virtual sequence as a pre-process that does not write features.
2. any plugin that writes features has the option to take a virtual sequence. if given that, it would read all the virtual sequence's pieces to determine their offset. it would use their source_id to correlate them with the input, and as the features are created project them simultaneously on both the piece and the virtual sequence.

that sounds kind of complicated, so probably the post-process is better.

it's kind of late and i'm kind of foggy...

steve

Chris Stoeckert wrote: > No, these are different features because they are spans on different > sequences (one scaffold and one virtual) so you won't get two > locations based on this for the same na_feature_id. NAFeature has the > na_sequence_id which tells you whether it is the scaffold or virtual > sequence. If these are Gene, RNA, or Protein features then you can > say that they are the same conceptual feature through the central > dogma and instance tables. If they are features like Exon, then you > could infer this as you say by parent_id, source_id, etc. > > Chris > > On Jul 14, 2005, at 5:52 PM, Aaron J. Mackey wrote: > >> >> Exactly.
No logic is required, because we simply copy any and all >> NALocation objects attached to the sequences and generate new >> NALocation objects that point to the virtual sequence, with new >> coordinate/strand, but all other foreign keys remain the same (i.e. >> children of the same feature). >> >> Hmm, that means that if you blindly pull locations for a given >> feature, you will get two locations, not just one (so you'll need to >> specify which reference sequence you wish to obtain the location on). >> >> -Aaron >> >> On Jul 14, 2005, at 5:41 PM, Chris Stoeckert wrote: >> >> >>> Let's see if I understand your proposal. Generate features and >>> locations based on the static scaffold sequence coordinates. Then >>> at the end of the pipeline generate the same (conceptual) features >>> with locations based on the virtual sequence coordinates. That >>> makes sense to me. The advantage is that you have both, one that is >>> stable (scaffold) and one that can be regenerated as needed >>> (virtual) but stored for convenience. I don't really see a >>> disadvantage - sure it's twice as many rows but if you materialize >>> a view you adding these anyway. >>> >>> Chris >>> >>> On Jul 14, 2005, at 3:50 PM, Aaron J. Mackey wrote: >>> >>> >>> >>>> >>>> As we struggle to use GUS the "right way", this is throwing us for >>>> a loop. On the one hand, our GUS client applications want to see >>>> features in the coordinate system of the assembly (i.e. the >>>> virtual sequence) -- on the other hand, it makes sense from a data >>>> integrity viewpoint to only load/store feature coordinates with >>>> respect to the static underlying scaffold coordinates, since the >>>> scaffold-to-chromosome mapping (as defined by DoTS.SequencePiece) >>>> may change over time. >>>> >>>> One option is to instantiate a read-only materialized view of the >>>> NALocation for clients to use. 
>>>> >>>> A second option (which we've just discussed, and people seem to >>>> like) is for the InsertVirtualSequenceFromMapping plugin we just >>>> wrote to (re)generate duplicate versions of all NALocations >>>> attached to a given SequencePiece in the new coordinate system >>>> (requiring the virtual sequence building to be the last step in >>>> our pipeline, instead of the first). >>>> >>>> -Aaron >>>> >>>> On Jul 14, 2005, at 2:53 PM, Chris Stoeckert wrote: >>>> >>>> >>>> >>>> >>>>> Hi Aaron, >>>>> I don't have a strong argument for either way. In terms of >>>>> coordinate mapping utilities, I'm not aware of one so certainly >>>>> would welcome yours (but if others know of ones please speak up). >>>>> >>>>> Chris >>>>> >>>>> On Jul 14, 2005, at 11:13 AM, Aaron J. Mackey wrote: >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> Thanks Chris, I got it. >>>>>> >>>>>> If we are going to start hanging features off these, should we >>>>>> hang them off the virtual chromosome sequence entries, or the >>>>>> scaffold entries in externalnasequence? Would it make sense to >>>>>> "codify" this usage with associate PL/SQL code to reconstruct >>>>>> virtual sequence and associated features in the virtual >>>>>> coordinate space? I guess one way to do this would be to have >>>>>> Virtual*Feature read-only views (and thus target everything to >>>>>> the "real" coordinate system such that future rebuilds of the >>>>>> virtual sequence would not require recalculation of feature >>>>>> locations)? >>>>>> >>>>>> Relatedly, is there coordinate mapping code already in some GUS >>>>>> utility module (if not, I'm happy to contribute mine, based on >>>>>> BioPerl's powerful Bio::Coordinate::Map framework)? 
>>>>>> >>>>>> -Aaron >>>>>> >>>>>> On Jul 14, 2005, at 11:05 AM, Chris Stoeckert wrote: >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> Hi Aaron, >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> 1) VirtualSequence has a required sequence_version attribute - >>>>>>>> what is this for? Is this redundant to >>>>>>>> external_database_release_id? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> This is a superclass attribute inherited by all NASequence >>>>>>> views. My recollection is that individual GenBank sequence >>>>>>> entries have version tags at the end of accessions as in >>>>>>> "DQ094190.1" for Toxoplasma gondii ATP-binding cassette protein >>>>>>> subfamily B member 3 (found in VERSION field). >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> 2) VirtualSequence has a clob for storing the assembled >>>>>>>> sequence (I suspect), but the Perl object layer doesn't use >>>>>>>> this slot, instead rebuilding the sequence from the sequence >>>>>>>> pieces. Am I correct in this usage, or should I not, in fact, >>>>>>>> be storing the assembled sequence in VirtualSequence? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> Again this is a superclass attribute. I think using it is >>>>>>> optional. Reasons not to use it are that the virtual sequence >>>>>>> is hard to represent as a single entity (e.g., contains gaps) >>>>>>> or is very large and has a significant overhead cost of storing >>>>>>> what can be easily regenerated (and avoid denormalization). >>>>>>> Reasons to use are for convenience and efficiency of retrieving >>>>>>> the sequence without the need to rebuild. >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> -Aaron >>>>>>>> >>>>>>>> -- >>>>>>>> Aaron J. Mackey, Ph.D. 
>>>>>>>> Project Manager, ApiDB Bioinformatics Resource Center >>>>>>>> Penn Genomics Institute, University of Pennsylvania >>>>>>>> email: am...@pc... >>>>>>>> office: 215-898-1205 >>>>>>>> fax: 215-746-6697 >>>>>>>> postal: Penn Genomics Institute >>>>>>>> Goddard Labs 212 >>>>>>>> 415 S. University Avenue >>>>>>>> Philadelphia, PA 19104-6017 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> ------------------------------------------------------- >>>>>>>> This SF.Net email is sponsored by the 'Do More With Dual!' >>>>>>>> webinar happening >>>>>>>> July 14 at 8am PDT/11am EDT. We invite you to explore the >>>>>>>> latest in dual >>>>>>>> core and dual graphics technology at this free one hour event >>>>>>>> hosted by HP,AMD, and NVIDIA. To register visit http:// >>>>>>>> www.hp.com/go/dualwebinar >>>>>>>> _______________________________________________ >>>>>>>> Gusdev-gusdev mailing list >>>>>>>> Gus...@li... >>>>>>>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> Aaron J. Mackey, Ph.D. >>>>>> Project Manager, ApiDB Bioinformatics Resource Center >>>>>> Penn Genomics Institute, University of Pennsylvania >>>>>> email: am...@pc... >>>>>> office: 215-898-1205 >>>>>> fax: 215-746-6697 >>>>>> postal: Penn Genomics Institute >>>>>> Goddard Labs 212 >>>>>> 415 S. University Avenue >>>>>> Philadelphia, PA 19104-6017 >>>>>> >>>>>> >>>>>> >>>>>> ------------------------------------------------------- >>>>>> SF.Net email is sponsored by: Discover Easy Linux Migration >>>>>> Strategies >>>>>> from IBM. Find simple to follow Roadmaps, straightforward articles, >>>>>> informative Webcasts and more! Get everything you need to get up to >>>>>> speed, fast. http://ads.osdn.com/? >>>>>> ad_id=7477&alloc_id=16492&op=click >>>>>> _______________________________________________ >>>>>> Gusdev-gusdev mailing list >>>>>> Gus...@li... 
>>>>>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> Aaron J. Mackey, Ph.D. >>>> Project Manager, ApiDB Bioinformatics Resource Center >>>> Penn Genomics Institute, University of Pennsylvania >>>> email: am...@pc... >>>> office: 215-898-1205 >>>> fax: 215-746-6697 >>>> postal: Penn Genomics Institute >>>> Goddard Labs 212 >>>> 415 S. University Avenue >>>> Philadelphia, PA 19104-6017 >>>> >>>> >>>> >>> >>> >>> >>> ------------------------------------------------------- >>> SF.Net email is sponsored by: Discover Easy Linux Migration Strategies >>> from IBM. Find simple to follow Roadmaps, straightforward articles, >>> informative Webcasts and more! Get everything you need to get up to >>> speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click >>> _______________________________________________ >>> Gusdev-gusdev mailing list >>> Gus...@li... >>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev >>> >>> >> >> -- >> Aaron J. Mackey, Ph.D. >> Project Manager, ApiDB Bioinformatics Resource Center >> Penn Genomics Institute, University of Pennsylvania >> email: am...@pc... >> office: 215-898-1205 >> fax: 215-746-6697 >> postal: Penn Genomics Institute >> Goddard Labs 212 >> 415 S. University Avenue >> Philadelphia, PA 19104-6017 >> > > > > ------------------------------------------------------- > SF.Net email is sponsored by: Discover Easy Linux Migration Strategies > from IBM. Find simple to follow Roadmaps, straightforward articles, > informative Webcasts and more! Get everything you need to get up to > speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click > _______________________________________________ > Gusdev-gusdev mailing list > Gus...@li... > https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev |
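The alternative Steve outlines above (a plugin that writes each feature on its piece and, when a virtual sequence is supplied, on the virtual sequence too) might look roughly like the sketch below. It is only an illustration: the `Feature` class, its field names, and the `piece_offsets` lookup are hypothetical stand-ins rather than GUS objects, and every piece is assumed to be placed in forward orientation so that projection is a plain shift.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List


@dataclass
class Feature:
    """Hypothetical in-memory stand-in for a feature and its child objects."""
    source_id: str      # identifier of the feature itself
    sequence_id: str    # source_id of the sequence the coordinates refer to
    start: int          # 1-based, inclusive
    end: int
    children: List["Feature"] = field(default_factory=list)


def project_tree(feature: Feature, virtual_id: str, offset: int) -> Feature:
    """Copy a feature and, recursively, its children onto the virtual sequence.

    Assumption: the piece is placed forward, so each coordinate is simply
    shifted by `offset`, the number of virtual-sequence bases before the piece.
    """
    return replace(
        feature,
        sequence_id=virtual_id,
        start=feature.start + offset,
        end=feature.end + offset,
        children=[project_tree(c, virtual_id, offset) for c in feature.children],
    )


def write_on_both(feature: Feature, virtual_id: str,
                  piece_offsets: Dict[str, int]) -> List[Feature]:
    """Return the feature on its piece plus, if the piece is part of the
    virtual sequence, a projected copy of the whole tree."""
    written = [feature]
    offset = piece_offsets.get(feature.sequence_id)
    if offset is not None:
        written.append(project_tree(feature, virtual_id, offset))
    return written


gene = Feature("gene1", "scaffold_7", 101, 900,
               children=[Feature("gene1.exon1", "scaffold_7", 101, 300),
                         Feature("gene1.exon2", "scaffold_7", 600, 900)])
on_piece, on_virtual = write_on_both(gene, "chrI", {"scaffold_7": 50_000})
print(on_virtual.sequence_id, on_virtual.start, on_virtual.end)
```

Because the plugin already holds the feature tree in memory at write time, it does not need schema knowledge to rediscover the tree, which is the difficulty raised for the post-process route.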
From: Aaron J. M. <am...@pc...> - 2005-07-15 13:33:39
|
Got it, thanks. Something else to keep in mind while we implement central dogma handling: two instances of a Gene may in fact be the same physical instance represented in two coordinate spaces.

Note that we will ultimately have at least three coordinate spaces (contig <-> scaffold <-> chromosome), and possibly central-dogma-related coordinate mappings (protein <-> mRNA <-> DNA).

-Aaron

On Jul 14, 2005, at 6:13 PM, Chris Stoeckert wrote:

> No, these are different features because they are spans on
> different sequences (one scaffold and one virtual) so you won't get
> two locations based on this for the same na_feature_id. NAFeature
> has the na_sequence_id which tells you whether it is the scaffold
> or virtual sequence. If these are Gene, RNA, or Protein features
> then you can say that they are the same conceptual feature through
> the central dogma and instance tables. If they are features like
> Exon, then you could infer this as you say by parent_id, source_id,
> etc.
>
> Chris

--
Aaron J. Mackey, Ph.D.
Project Manager, ApiDB Bioinformatics Resource Center
Penn Genomics Institute, University of Pennsylvania
email: am...@pc...
office: 215-898-1205
fax: 215-746-6697
postal: Penn Genomics Institute
        Goddard Labs 212
        415 S. University Avenue
        Philadelphia, PA 19104-6017 |
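With three nested coordinate spaces (contig, scaffold, chromosome), a span can be carried outward one placement at a time. The following is a minimal sketch under stated assumptions: `Placement` and its fields are invented for illustration and are not part of GUS or BioPerl, coordinates are 1-based and inclusive, and a reverse placement mirrors the span and flips its strand.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Placement:
    """Where a child sequence sits on its parent (hypothetical fields)."""
    offset: int   # parent bases that precede the child's first base
    length: int   # length of the child sequence
    strand: int   # +1 if placed forward, -1 if reverse-complemented

    def to_parent(self, start: int, end: int, strand: int) -> Tuple[int, int, int]:
        """Map a 1-based inclusive span from child to parent coordinates."""
        if self.strand == 1:
            return self.offset + start, self.offset + end, strand
        # Reverse placement: mirror the span within the child and flip strand.
        return (self.offset + self.length - end + 1,
                self.offset + self.length - start + 1,
                -strand)


def map_through(chain: Iterable[Placement], start: int, end: int,
                strand: int) -> Tuple[int, int, int]:
    """Apply placements innermost first: contig->scaffold, then scaffold->chromosome."""
    for placement in chain:
        start, end, strand = placement.to_parent(start, end, strand)
    return start, end, strand


contig_on_scaffold = Placement(offset=2_000, length=1_000, strand=-1)
scaffold_on_chromosome = Placement(offset=100_000, length=50_000, strand=1)
print(map_through([contig_on_scaffold, scaffold_on_chromosome], 10, 20, 1))
```

Keeping each placement as its own object means that a change to one level of the assembly only requires replacing that one element of the chain.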
From: Chris S. <sto...@pc...> - 2005-07-14 22:13:18
|
No, these are different features because they are spans on different sequences (one scaffold and one virtual) so you won't get two locations based on this for the same na_feature_id. NAFeature has the na_sequence_id which tells you whether it is the scaffold or virtual sequence. If these are Gene, RNA, or Protein features then you can say that they are the same conceptual feature through the central dogma and instance tables. If they are features like Exon, then you could infer this as you say by parent_id, source_id, etc.

Chris

On Jul 14, 2005, at 5:52 PM, Aaron J. Mackey wrote:

> Exactly. No logic is required, because we simply copy any and all
> NALocation objects attached to the sequences and generate new
> NALocation objects that point to the virtual sequence, with new
> coordinate/strand, but all other foreign keys remain the same (i.e.
> children of the same feature).
>
> Hmm, that means that if you blindly pull locations for a given
> feature, you will get two locations, not just one (so you'll need
> to specify which reference sequence you wish to obtain the location
> on).
>
> -Aaron |
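Chris's point, that the scaffold copy and the virtual copy are separate feature rows told apart by na_sequence_id, can be illustrated with a toy query. This is a sketch only: the two tables below are a drastically simplified, assumed stand-in for the real schema, the ID values are made up, and matching the two copies by a shared source_id is just one of the inference routes mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE na_feature (
        na_feature_id  INTEGER PRIMARY KEY,
        na_sequence_id INTEGER NOT NULL,   -- scaffold or virtual sequence
        source_id      TEXT NOT NULL       -- assumed shared by both copies
    );
    CREATE TABLE na_location (
        na_feature_id INTEGER NOT NULL REFERENCES na_feature (na_feature_id),
        start_min     INTEGER NOT NULL,
        end_max       INTEGER NOT NULL,
        is_reversed   INTEGER NOT NULL
    );
""")

SCAFFOLD, VIRTUAL = 1, 2   # made-up na_sequence_id values
conn.executemany("INSERT INTO na_feature VALUES (?, ?, ?)",
                 [(10, SCAFFOLD, "exon_A"), (11, VIRTUAL, "exon_A")])
conn.executemany("INSERT INTO na_location VALUES (?, ?, ?, ?)",
                 [(10, 101, 300, 0), (11, 50101, 50300, 0)])


def locations_on(source_id, na_sequence_id):
    """Locations of one conceptual feature on one chosen reference sequence."""
    return conn.execute(
        """SELECT l.start_min, l.end_max, l.is_reversed
             FROM na_feature f
             JOIN na_location l ON l.na_feature_id = f.na_feature_id
            WHERE f.source_id = ? AND f.na_sequence_id = ?""",
        (source_id, na_sequence_id),
    ).fetchall()


print(locations_on("exon_A", SCAFFOLD))
print(locations_on("exon_A", VIRTUAL))
```

Each na_feature_id still has exactly one location here; a client chooses the coordinate space by choosing the sequence, not by filtering duplicate locations on a single feature.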
From: Aaron J. M. <am...@pc...> - 2005-07-14 21:52:23
|
Exactly. No logic is required, because we simply copy any and all NALocation objects attached to the sequences and generate new NALocation objects that point to the virtual sequence, with new coordinate/strand, but all other foreign keys remain the same (i.e. children of the same feature).

Hmm, that means that if you blindly pull locations for a given feature, you will get two locations, not just one (so you'll need to specify which reference sequence you wish to obtain the location on).

-Aaron

On Jul 14, 2005, at 5:41 PM, Chris Stoeckert wrote:

> Let's see if I understand your proposal. Generate features and
> locations based on the static scaffold sequence coordinates. Then
> at the end of the pipeline generate the same (conceptual) features
> with locations based on the virtual sequence coordinates. That
> makes sense to me. The advantage is that you have both, one that is
> stable (scaffold) and one that can be regenerated as needed
> (virtual) but stored for convenience. I don't really see a
> disadvantage - sure it's twice as many rows but if you materialize
> a view you adding these anyway.
>
> Chris
>
> On Jul 14, 2005, at 3:50 PM, Aaron J. Mackey wrote:
>
>> As we struggle to use GUS the "right way", this is throwing us for
>> a loop. On the one hand, our GUS client applications want to see
>> features in the coordinate system of the assembly (i.e. the
>> virtual sequence) -- on the other hand, it makes sense from a data
>> integrity viewpoint to only load/store feature coordinates with
>> respect to the static underlying scaffold coordinates, since the
>> scaffold-to-chromosome mapping (as defined by DoTS.SequencePiece)
>> may change over time.
>>
>> One option is to instantiate a read-only materialized view of the
>> NALocation for clients to use.
>>
>> A second option (which we've just discussed, and people seem to
>> like) is for the InsertVirtualSequenceFromMapping plugin we just
>> wrote to (re)generate duplicate versions of all NALocations
>> attached to a given SequencePiece in the new coordinate system
>> (requiring the virtual sequence building to be the last step in
>> our pipeline, instead of the first).
>>
>> -Aaron
>>
>> On Jul 14, 2005, at 2:53 PM, Chris Stoeckert wrote:
>>
>>> Hi Aaron,
>>> I don't have a strong argument for either way. In terms of
>>> coordinate mapping utilities, I'm not aware of one so certainly
>>> would welcome yours (but if others know of ones please speak up).
>>>
>>> Chris
>>>
>>> On Jul 14, 2005, at 11:13 AM, Aaron J. Mackey wrote:
>>>
>>>> Thanks Chris, I got it.
>>>>
>>>> If we are going to start hanging features off these, should we
>>>> hang them off the virtual chromosome sequence entries, or the
>>>> scaffold entries in externalnasequence? Would it make sense to
>>>> "codify" this usage with associate PL/SQL code to reconstruct
>>>> virtual sequence and associated features in the virtual
>>>> coordinate space? I guess one way to do this would be to have
>>>> Virtual*Feature read-only views (and thus target everything to
>>>> the "real" coordinate system such that future rebuilds of the
>>>> virtual sequence would not require recalculation of feature
>>>> locations)?
>>>>
>>>> Relatedly, is there coordinate mapping code already in some GUS
>>>> utility module (if not, I'm happy to contribute mine, based on
>>>> BioPerl's powerful Bio::Coordinate::Map framework)?
>>>>
>>>> -Aaron
>>>>
>>>> On Jul 14, 2005, at 11:05 AM, Chris Stoeckert wrote:
>>>>
>>>>> Hi Aaron,
>>>>>
>>>>>> 1) VirtualSequence has a required sequence_version attribute -
>>>>>> what is this for? Is this redundant to
>>>>>> external_database_release_id?
>>>>>
>>>>> This is a superclass attribute inherited by all NASequence
>>>>> views. My recollection is that individual GenBank sequence
>>>>> entries have version tags at the end of accessions as in
>>>>> "DQ094190.1" for Toxoplasma gondii ATP-binding cassette protein
>>>>> subfamily B member 3 (found in VERSION field).
>>>>>
>>>>>> 2) VirtualSequence has a clob for storing the assembled
>>>>>> sequence (I suspect), but the Perl object layer doesn't use
>>>>>> this slot, instead rebuilding the sequence from the sequence
>>>>>> pieces. Am I correct in this usage, or should I not, in fact,
>>>>>> be storing the assembled sequence in VirtualSequence?
>>>>>
>>>>> Again this is a superclass attribute. I think using it is
>>>>> optional. Reasons not to use it are that the virtual sequence
>>>>> is hard to represent as a single entity (e.g., contains gaps)
>>>>> or is very large and has a significant overhead cost of storing
>>>>> what can be easily regenerated (and avoid denormalization).
>>>>> Reasons to use are for convenience and efficiency of retrieving
>>>>> the sequence without the need to rebuild.
>>>>>
>>>>> Chris
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -Aaron
>>>>>>
>>>>>> _______________________________________________
>>>>>> Gusdev-gusdev mailing list
>>>>>> Gus...@li...
>>>>>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev

--
Aaron J. Mackey, Ph.D.
Project Manager, ApiDB Bioinformatics Resource Center
Penn Genomics Institute, University of Pennsylvania
email: am...@pc...
office: 215-898-1205
fax: 215-746-6697
postal: Penn Genomics Institute
        Goddard Labs 212
        415 S. University Avenue
        Philadelphia, PA 19104-6017 |
From: Chris S. <sto...@pc...> - 2005-07-14 21:41:33
|
Let's see if I understand your proposal. Generate features and locations based on the static scaffold sequence coordinates. Then, at the end of the pipeline, generate the same (conceptual) features with locations based on the virtual sequence coordinates. That makes sense to me. The advantage is that you have both: one that is stable (scaffold) and one that can be regenerated as needed (virtual) but is stored for convenience. I don't really see a disadvantage. Sure, it's twice as many rows, but if you materialize a view you are adding these anyway.

Chris |
From: Aaron J. M. <am...@pc...> - 2005-07-14 19:50:09
|
As we struggle to use GUS the "right way", this is throwing us for a loop. On the one hand, our GUS client applications want to see features in the coordinate system of the assembly (i.e. the virtual sequence). On the other hand, it makes sense from a data integrity viewpoint to only load/store feature coordinates with respect to the static underlying scaffold coordinates, since the scaffold-to-chromosome mapping (as defined by DoTS.SequencePiece) may change over time.

One option is to instantiate a read-only materialized view of the NALocation for clients to use.

A second option (which we've just discussed, and people seem to like) is for the InsertVirtualSequenceFromMapping plugin we just wrote to (re)generate duplicate versions of all NALocations attached to a given SequencePiece in the new coordinate system (requiring the virtual sequence building to be the last step in our pipeline, instead of the first).

-Aaron |
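For readers following the thread, the second option above comes down to translating each scaffold-based location through the placement of its piece. The Python sketch below illustrates only that arithmetic. It is not the InsertVirtualSequenceFromMapping plugin (whose code does not appear in this thread), and the `Piece` fields `virtual_start`, `length`, and `reverse` are hypothetical names, not actual DoTS.SequencePiece columns. Coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Piece:
    """Placement of one scaffold on a virtual sequence (hypothetical fields)."""
    virtual_start: int      # 1-based virtual coordinate of the piece's first base
    length: int             # scaffold length in bases
    reverse: bool = False   # True if the scaffold is placed reverse-complemented


def to_virtual(piece, start, end, strand=1):
    """Map a 1-based inclusive scaffold interval into virtual coordinates."""
    if not 1 <= start <= end <= piece.length:
        raise ValueError("interval lies outside the scaffold")
    if piece.reverse:
        # The scaffold's last base comes first on the virtual sequence,
        # so the interval is mirrored and the strand is flipped.
        v_start = piece.virtual_start + (piece.length - end)
        v_end = piece.virtual_start + (piece.length - start)
        return v_start, v_end, -strand
    offset = piece.virtual_start - 1
    return start + offset, end + offset, strand


forward = Piece(virtual_start=1001, length=500)
flipped = Piece(virtual_start=1001, length=500, reverse=True)
print(to_virtual(forward, 1, 10))   # forward placement: a simple offset
print(to_virtual(flipped, 1, 10))   # reversed placement: mirrored interval
```

Because the scaffold coordinates are never modified, rerunning such a mapping after the scaffold-to-chromosome layout changes would regenerate the virtual locations without touching the stored scaffold-based ones.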
From: Chris S. <sto...@pc...> - 2005-07-14 18:53:02
|
Hi Aaron,

I don't have a strong argument either way. In terms of coordinate mapping utilities, I'm not aware of one, so I would certainly welcome yours (but if others know of any, please speak up).

Chris |
From: Aaron J. M. <am...@pc...> - 2005-07-14 15:13:47
|
Thanks Chris, I got it.

If we are going to start hanging features off these, should we hang them off the virtual chromosome sequence entries, or the scaffold entries in externalnasequence? Would it make sense to "codify" this usage with associated PL/SQL code to reconstruct the virtual sequence and associated features in the virtual coordinate space? I guess one way to do this would be to have Virtual*Feature read-only views (and thus target everything to the "real" coordinate system such that future rebuilds of the virtual sequence would not require recalculation of feature locations)?

Relatedly, is there coordinate mapping code already in some GUS utility module (if not, I'm happy to contribute mine, based on BioPerl's powerful Bio::Coordinate::Map framework)?

-Aaron |
From: Chris S. <sto...@pc...> - 2005-07-14 15:05:14
|
Hi Aaron,

> 1) VirtualSequence has a required sequence_version attribute - what
> is this for? Is this redundant to external_database_release_id?

This is a superclass attribute inherited by all NASequence views. My recollection is that individual GenBank sequence entries have version tags at the end of accessions, as in "DQ094190.1" for Toxoplasma gondii ATP-binding cassette protein subfamily B member 3 (found in the VERSION field).

> 2) VirtualSequence has a clob for storing the assembled sequence (I
> suspect), but the Perl object layer doesn't use this slot, instead
> rebuilding the sequence from the sequence pieces. Am I correct in
> this usage, or should I not, in fact, be storing the assembled
> sequence in VirtualSequence?

Again, this is a superclass attribute. I think using it is optional. Reasons not to use it are that the virtual sequence is hard to represent as a single entity (e.g., contains gaps) or is very large and has a significant overhead cost of storing what can be easily regenerated (not storing it also avoids denormalization). Reasons to use it are convenience and efficiency of retrieving the sequence without the need to rebuild.

Chris |
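To make the store-versus-rebuild trade-off concrete, here is a minimal Python sketch of what "rebuilding the sequence from the sequence pieces" could look like. It is an illustration under stated assumptions, not the GUS Perl object layer: pieces are modeled as hypothetical `(scaffold_id, virtual_start, reverse)` tuples with 1-based starts, and gaps between pieces are filled with `N`.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string; characters other than ACGT are kept as-is."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def assemble(pieces, scaffolds, gap_char="N"):
    """Rebuild a virtual sequence from (scaffold_id, virtual_start, reverse) tuples.

    `scaffolds` maps a scaffold id to its sequence string (hypothetical layout).
    """
    parts = []
    cursor = 1  # next unfilled 1-based position on the virtual sequence
    for scaffold_id, virtual_start, reverse in sorted(pieces, key=lambda p: p[1]):
        if virtual_start < cursor:
            raise ValueError(f"piece {scaffold_id} overlaps the previous piece")
        parts.append(gap_char * (virtual_start - cursor))  # pad the gap, if any
        seq = scaffolds[scaffold_id]
        parts.append(reverse_complement(seq) if reverse else seq)
        cursor = virtual_start + len(seq)
    return "".join(parts)


scaffolds = {"s1": "ACGT", "s2": "AAC"}
pieces = [("s2", 8, True), ("s1", 1, False)]
print(assemble(pieces, scaffolds))
```

The rebuild cost grows with the total length of the scaffolds, which is the overhead that storing the assembled sequence in the clob would avoid, at the price of keeping a second copy in sync with the pieces.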
From: Aaron J. M. <am...@pc...> - 2005-07-14 14:12:51
|
We're using (perhaps incorrectly) the VirtualSequence and SequencePiece tables to represent chromosomal assemblies of scaffolds (stored in ExternalNASequence).

1) VirtualSequence has a required sequence_version attribute - what is this for? Is this redundant to external_database_release_id?

2) VirtualSequence has a clob for storing the assembled sequence (I suspect), but the Perl object layer doesn't use this slot, instead rebuilding the sequence from the sequence pieces. Am I correct in this usage, or should I not, in fact, be storing the assembled sequence in VirtualSequence?

Thanks,

-Aaron

-- Aaron J. Mackey, Ph.D. Project Manager, ApiDB Bioinformatics Resource Center Penn Genomics Institute, University of Pennsylvania email: am...@pc... office: 215-898-1205 fax: 215-746-6697 postal: Penn Genomics Institute Goddard Labs 212 415 S. University Avenue Philadelphia, PA 19104-6017 |
From: Michael S. <msa...@pc...> - 2005-07-13 18:19:34
|
-- GUS Workshop Discussion Review

* Documentation and Usability

Several additional types of documentation were identified as a means of making the schema more accessible as well as moving towards a common semantic understanding. These types of documentation include an "SQL Cookbook" for common queries, ER diagrams, and improvements to the Schema Browser that would allow for web-based updating and modification of table- and attribute-level documentation, as well as use cases, in a wiki-type style. The existing wiki was mentioned as a very good location for user-submitted comments, notes, etc. It is anticipated that documentation would flow from the schema browser, wiki, and other sources to the official User's Guide, making that resource increasingly valuable as a single point of reference for using GUS.

Action Item: Mike will be improving the schema browser to support working with the documentation.
Action Item: The hand-edited object documentation needs to be improved.
Action Item: Mike will move the Plugin API documentation (Brian Brunk's document) into the Developer's Guide.

* 3.5 Migration

Three major approaches to upgrading to 3.5 were discussed: an in-place upgrade using SQL scripts, a migrate-and-transform process from a 3.0 database to a new 3.5 database, and starting over with a new 3.5 database and repopulating it from source data. Almost all groups identified the last option as unfeasible. The group agreed that the ultimate decision between the first two approaches will depend on the magnitude and type of changes, and that there was not yet a good grasp on those. Further, some groups may choose one approach while others choose another.

Action Item: Mike will review existing tools that compare schemas and provide reports. (Done: I looked at TOAD and Oracle's OEM, and ultimately decided that the functionality GUS can provide should be sufficient and that we'll additionally benefit from being able to address GUS specifics.)
Action Item: Mike will publish the list of 3.0->3.5 changes.

* Project Management

The group discussed and agreed on the importance of using the "Bugzilla" issue tracking system for managing bugs and GUS changes. The development flow then starts with an issue being created in the issue tracker. This issue may have been preceded (in the mailing list) or be followed (through the issue tracker comments) by a discussion on the merits of the change. The change will ultimately be accepted or rejected by a single individual who "owns" the component that the issue affects. (This ownership will be assigned automatically when the issue is created in the tracker.) Once approved or rejected, the appropriate changes will be made in the source repository (applying the patch or confirming the patch for approved changes, or doing nothing or removing the patch for rejected changes that were committed to svn).

GUS releases will occur on a more frequent basis, perhaps as regularly as monthly, with clear upgrade instructions for groups that prefer to upgrade less frequently.

The group discussed and agreed to a proposal to manage the GUS Schema and application framework as separate projects (i.e. with separate version numbers) to simplify dependencies and upgrading.

Action Item: Move the GUS schema to a separate project.
Action Item: Done: A list of schema and component owners has been compiled and will be sent to the list.
Action Item: Mike will modify the tracker to support emailing reminders and email issue submissions.

* Schema Discussions

The importance and wisdom of the placement of the housekeeping columns was discussed, with possible solutions being changing the requirement (really, auditing code to ensure that the order isn't assumed) or using a view layer.

The use of global identifiers was debated, without a clear resolution, but with a general consensus that more discussion was needed.

A variety of schema clarifications were discussed, primarily focusing on DoTS and SRes. These can be used to frame use cases for documentation purposes.

* Resources Repository

The resources repository was discussed, particularly whether it should be shared among the BRC groups and/or with the GUS community.

* Tools and Application adapters

Several tools and applications were identified that the community would like to use with GUS: Microarray Tools, Manatee, Apollo, Artemis, manual curation tools, tools for working with clinical data, and proteomic data tools (Wastling).

* Hibernate

A small working group had a side discussion to get Hibernate running with GUS. The first milestone will be generation of mapping files using the xml schema. Once complete, a proof-of-concept application will be developed, which should also help to identify what GUS functionality Hibernate will need to provide. |
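As an illustration of the kind of report a schema comparison for the 3.0->3.5 migration might produce, here is a small Python sketch. It is not the GUS tooling mentioned in the action item; it assumes each schema has already been reduced to a plain mapping of table name to column names, and the function name `diff_schemas` and the sample table and column names are made up for the example.

```python
def diff_schemas(old, new):
    """Compare two {table: [columns]} mappings and report the differences."""
    report = {
        "tables_added": sorted(new.keys() - old.keys()),
        "tables_removed": sorted(old.keys() - new.keys()),
        "columns_changed": {},
    }
    for table in sorted(old.keys() & new.keys()):
        added = sorted(set(new[table]) - set(old[table]))
        removed = sorted(set(old[table]) - set(new[table]))
        if added or removed:
            report["columns_changed"][table] = {"added": added, "removed": removed}
    return report


# Made-up schemas, for illustration only.
schema_old = {"Example.TableA": ["id", "name"], "Example.TableB": ["id"]}
schema_new = {"Example.TableA": ["id", "label"], "Example.TableC": ["id"]}
print(diff_schemas(schema_old, schema_new))
```

A report like this only shows the magnitude and type of changes; whether an in-place upgrade or a migrate-and-transform process is the better fit would still be a per-group decision, as noted above.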
From: Angel P. <an...@ma...> - 2005-07-13 15:05:41
|
Hello Folks,

Trying to see if this would work on a Windows system with Cygwin and PostgreSQL installed gives the highly informative error message:

    angel@gort ~/project_home/GUS
    $ build GUS install -append -installDBSchema
    Build failed

I am positive I can connect to my DB instance using Perl and the same parameters I used in the $PROJECT_HOME/install/gus.config file. All dependencies have been installed and run as expected.

Note that I don't think this is an issue we should solve (the only usage I saw was to configure a GUS system on a laptop running Windows for development purposes). I do think that we should mention in the install docs that this combination does not currently work. Maybe provide a table of successfully installed configurations?

-angel |
From: Kumar, S. (Contr) <San...@ng...> - 2005-07-13 14:32:40
|
Here is the output I got after running "which generateGusObjects":

/home/oragus35/GUS/gus_home/bin/generateGusObjects

Thanks
Sanjeev

-----Original Message-----
From: Michael Saffitz [mailto:msa...@pc...]
Sent: Wednesday, July 13, 2005 10:24 AM
To: Kumar, Sanjeev (Contr); Gusdev gusdev-gusdev
Subject: Re: [GUSDEV] FW: gus 3.5 installation error

Hi Sanjeev,

On 7/13/05 10:19 AM, "Kumar, Sanjeev (Contr)" <San...@ng...> wrote:

> Hi Mike,
> Good morning!
> Pl. find the answer to your questions:
> 1. Which generateGusObjects:
> I am not able to figure out the objectName, can you pl. tell me
> where to find?

Just use the "which generateGusObjects" command. It should look like this:

msaffitz:~ msaffitz$ which generateGusObjects
/Users/msaffitz/cvswork/gushome35/bin/generateGusObjects

Please provide the full output. Also, I don't know if GUS has ever been installed on perl 5.8.7-- I use 5.8.6, so that may be a cause for further investigation as well.

--Thanks,
Mike

> 2. echo $PROJECT_HOME:
> /home/oragus35/GUS/project_home
> 3. echo $GUS_HOME:
> /home/oragus35/GUS/gus_home
> 4. echo $PATH:
> /home/oragus35/GUS/gus_home/bin:/home/oragus35/GUS/project_home/install/bin:/home/oragus35/apache-ant-1.6.5/bin
> :/home/oragus35/j2sdk1.4.2_08/bin:/home/oragus35/perl:/home/oragus35/perl-5.8.7
> :/home/oragus35/perl/bin
> :/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/oragus35/bin
> 5. echo $GUS_CONFIG_FILE
> /home/oragus35/GUS/gus.properties
>
> Thanks
> Sanjeev
>
> -----Original Message-----
> From: Michael Saffitz [mailto:msa...@pc...]
> Sent: Tuesday, July 12, 2005 8:22 PM
> To: Kumar, Sanjeev (Contr); Gusdev gusdev-gusdev
> Subject: Re: [GUSDEV] FW: gus 3.5 installation error
>
> Hi Sanjeev,
>
> Can you provide the following:
>
> Results of:
>
> $ which generateGusObjects
> $ echo $PROJECT_HOME
> $ echo $GUS_HOME
> $ echo $PATH
> $ echo $GUS_CONFIG_FILE
>
> (don't type the $ sign)
>
> --Mike
>
> On 7/12/05 7:08 PM, "Kumar, Sanjeev (Contr)" <San...@ng...> wrote:
>
>> -----Original Message-----
>> From: Kumar, Sanjeev (Contr)
>> Sent: Tuesday, July 12, 2005 7:08 PM
>> To: 'gus...@li...'
>> Subject: RE: gus 3.5 installation error
>>
>> Hi,
>> I am getting following error while doing the GUS3.5 installation.
>> It has created schema object successfully , but at the time of creation of
>> perl object it is throwing error message.
>> Can anyone help me ?
>>
>> Thanks
>> Sanjeev
>>
>> [oragus35@rdevse02 oragus35]$ build GUS install -append
>> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writeGusProp
>> [WritePropertyFiles] INFO: Skipping creation of GUS_CONFIG_FILE /home/oragus35/GUS/gus.properties -- already exists
>> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writeInstallProp
>> [WritePropertyFiles] INFO: Recreating install.prop file
>> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writePluginProp
>> [WritePropertyFiles] INFO: Recreating GUS-PluginMgr.prop file
>> [copy] Copying 1 file to /home/oragus35/GUS/gus_home/config
>> [copy] Copying 1 file to /home/oragus35/GUS/gus_home/config
>> [echo] .
>> [echo] Installing CBIL/Bio
>> [echo] .
>> [echo] Installing CBIL/CSP
>> [echo] .
>> [echo] Installing CBIL/Util
>> [echo] .
>> [echo] Installing CBIL/HQ
>> [echo] .
>> [echo] Installing CBIL/ObjectMapper
>> [concat] No existing files and no nested text, doing nothing
>> [echo] .
>> [echo] Installing GUS/Supported
>> [echo] .
>> [echo] Installing GUS/Community
>> [echo] .
>> [echo] Installing GUS/DBAdmin
>> [echo] .
>> [echo] Installing GUS/GOPredict
>> [echo] .
>> [echo] Installing GUS/ObjRelP
>> [echo] generating Perl Objects
>>
>> BUILD FAILED
>> /home/oragus35/GUS/project_home/install/build.xml:28: The following error occurred while executing this line:
>> /home/oragus35/GUS/project_home/GUS/build.xml:225: exec returned: -1
>>
>> Total time: 7 seconds
>>
>> __________________________________________________
>> Do You Yahoo!?
>> Tired of spam? Yahoo! Mail has the best spam protection around
>> http://mail.yahoo.com
>>
>> -------------------------------------------------------
>> This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
>> July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
>> core and dual graphics technology at this free one hour event hosted by HP,
>> AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
>> _______________________________________________
>> Gusdev-gusdev mailing list
>> Gus...@li...
>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
|
From: Michael S. <msa...@pc...> - 2005-07-13 14:24:05
|
Hi Sanjeev,

On 7/13/05 10:19 AM, "Kumar, Sanjeev (Contr)" <San...@ng...> wrote:

> Hi Mike,
> Good morning!
> Pl. find the answer to your questions:
> 1. Which generateGusObjects:
> I am not able to figure out the objectName, can you pl. tell me
> where to find?

Just use the "which generateGusObjects" command. It should look like this:

msaffitz:~ msaffitz$ which generateGusObjects
/Users/msaffitz/cvswork/gushome35/bin/generateGusObjects

Please provide the full output. Also, I don't know if GUS has ever been installed on perl 5.8.7-- I use 5.8.6, so that may be a cause for further investigation as well.

--Thanks,
Mike

> 2. echo $PROJECT_HOME:
> /home/oragus35/GUS/project_home
> 3. echo $GUS_HOME:
> /home/oragus35/GUS/gus_home
> 4. echo $PATH:
> /home/oragus35/GUS/gus_home/bin:/home/oragus35/GUS/project_home/install/bin:/home/oragus35/apache-ant-1.6.5/bin
> :/home/oragus35/j2sdk1.4.2_08/bin:/home/oragus35/perl:/home/oragus35/perl-5.8.7
> :/home/oragus35/perl/bin
> :/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/oragus35/bin
> 5. echo $GUS_CONFIG_FILE
> /home/oragus35/GUS/gus.properties
>
> Thanks
> Sanjeev
>
> -----Original Message-----
> From: Michael Saffitz [mailto:msa...@pc...]
> Sent: Tuesday, July 12, 2005 8:22 PM
> To: Kumar, Sanjeev (Contr); Gusdev gusdev-gusdev
> Subject: Re: [GUSDEV] FW: gus 3.5 installation error
>
> Hi Sanjeev,
>
> Can you provide the following:
>
> Results of:
>
> $ which generateGusObjects
> $ echo $PROJECT_HOME
> $ echo $GUS_HOME
> $ echo $PATH
> $ echo $GUS_CONFIG_FILE
>
> (don't type the $ sign)
>
> --Mike
>
> On 7/12/05 7:08 PM, "Kumar, Sanjeev (Contr)" <San...@ng...> wrote:
>
>> -----Original Message-----
>> From: Kumar, Sanjeev (Contr)
>> Sent: Tuesday, July 12, 2005 7:08 PM
>> To: 'gus...@li...'
>> Subject: RE: gus 3.5 installation error
>>
>> Hi,
>> I am getting following error while doing the GUS3.5 installation.
>> It has created schema object successfully , but at the time of creation of
>> perl object it is throwing error message.
>> Can anyone help me ?
>>
>> Thanks
>> Sanjeev
>>
>> [oragus35@rdevse02 oragus35]$ build GUS install -append
>> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writeGusProp
>> [WritePropertyFiles] INFO: Skipping creation of GUS_CONFIG_FILE /home/oragus35/GUS/gus.properties -- already exists
>> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writeInstallProp
>> [WritePropertyFiles] INFO: Recreating install.prop file
>> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writePluginProp
>> [WritePropertyFiles] INFO: Recreating GUS-PluginMgr.prop file
>> [copy] Copying 1 file to /home/oragus35/GUS/gus_home/config
>> [copy] Copying 1 file to /home/oragus35/GUS/gus_home/config
>> [echo] .
>> [echo] Installing CBIL/Bio
>> [echo] .
>> [echo] Installing CBIL/CSP
>> [echo] .
>> [echo] Installing CBIL/Util
>> [echo] .
>> [echo] Installing CBIL/HQ
>> [echo] .
>> [echo] Installing CBIL/ObjectMapper
>> [concat] No existing files and no nested text, doing nothing
>> [echo] .
>> [echo] Installing GUS/Supported
>> [echo] .
>> [echo] Installing GUS/Community
>> [echo] .
>> [echo] Installing GUS/DBAdmin
>> [echo] .
>> [echo] Installing GUS/GOPredict
>> [echo] .
>> [echo] Installing GUS/ObjRelP
>> [echo] generating Perl Objects
>>
>> BUILD FAILED
>> /home/oragus35/GUS/project_home/install/build.xml:28: The following error occurred while executing this line:
>> /home/oragus35/GUS/project_home/GUS/build.xml:225: exec returned: -1
>>
>> Total time: 7 seconds
|
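Since the PATH quoted above lists several Perl-related directories and the Perl version is raised as a possible cause, it may help to see every `perl` on the PATH in search order. The following is a minimal Python sketch under that assumption; it is not part of GUS, and the helper name `find_executables` is hypothetical.

```python
import os


def find_executables(name, path_value):
    """Return every executable called `name` on `path_value`, in search order."""
    matches = []
    for entry in path_value.split(os.pathsep):
        # Empty entries are skipped here (POSIX shells would treat them
        # as the current directory).
        if not entry:
            continue
        candidate = os.path.join(entry, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    hits = find_executables("perl", os.environ.get("PATH", ""))
    if not hits:
        print("no perl found on PATH")
    # The first hit is the one a plain `perl` invocation would run.
    for position, hit in enumerate(hits, start=1):
        print(f"{position}. {hit}")
```

Running the first hit with `-v` would then show which Perl version the build is likely to pick up.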
From: Kumar, S. (Contr) <San...@ng...> - 2005-07-13 14:19:32
|
Hi Mike,
Good morning!
Pl. find the answer to your questions:

1. Which generateGusObjects:
   I am not able to figure out the objectName, can you pl. tell me where to find?
2. echo $PROJECT_HOME:
   /home/oragus35/GUS/project_home
3. echo $GUS_HOME:
   /home/oragus35/GUS/gus_home
4. echo $PATH:
   /home/oragus35/GUS/gus_home/bin:/home/oragus35/GUS/project_home/install/bin:/home/oragus35/apache-ant-1.6.5/bin
   :/home/oragus35/j2sdk1.4.2_08/bin:/home/oragus35/perl:/home/oragus35/perl-5.8.7
   :/home/oragus35/perl/bin
   :/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/oragus35/bin
5. echo $GUS_CONFIG_FILE
   /home/oragus35/GUS/gus.properties

Thanks
Sanjeev

-----Original Message-----
From: Michael Saffitz [mailto:msa...@pc...]
Sent: Tuesday, July 12, 2005 8:22 PM
To: Kumar, Sanjeev (Contr); Gusdev gusdev-gusdev
Subject: Re: [GUSDEV] FW: gus 3.5 installation error

Hi Sanjeev,

Can you provide the following:

Results of:

$ which generateGusObjects
$ echo $PROJECT_HOME
$ echo $GUS_HOME
$ echo $PATH
$ echo $GUS_CONFIG_FILE

(don't type the $ sign)

--Mike

On 7/12/05 7:08 PM, "Kumar, Sanjeev (Contr)" <San...@ng...> wrote:

> -----Original Message-----
> From: Kumar, Sanjeev (Contr)
> Sent: Tuesday, July 12, 2005 7:08 PM
> To: 'gus...@li...'
> Subject: RE: gus 3.5 installation error
>
> Hi,
> I am getting following error while doing the GUS3.5 installation.
> It has created schema object successfully , but at the time of creation of
> perl object it is throwing error message.
> Can anyone help me ?
>
> Thanks
> Sanjeev
>
> [oragus35@rdevse02 oragus35]$ build GUS install -append
> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writeGusProp
> [WritePropertyFiles] INFO: Skipping creation of GUS_CONFIG_FILE /home/oragus35/GUS/gus.properties -- already exists
> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writeInstallProp
> [WritePropertyFiles] INFO: Recreating install.prop file
> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writePluginProp
> [WritePropertyFiles] INFO: Recreating GUS-PluginMgr.prop file
> [copy] Copying 1 file to /home/oragus35/GUS/gus_home/config
> [copy] Copying 1 file to /home/oragus35/GUS/gus_home/config
> [echo] .
> [echo] Installing CBIL/Bio
> [echo] .
> [echo] Installing CBIL/CSP
> [echo] .
> [echo] Installing CBIL/Util
> [echo] .
> [echo] Installing CBIL/HQ
> [echo] .
> [echo] Installing CBIL/ObjectMapper
> [concat] No existing files and no nested text, doing nothing
> [echo] .
> [echo] Installing GUS/Supported
> [echo] .
> [echo] Installing GUS/Community
> [echo] .
> [echo] Installing GUS/DBAdmin
> [echo] .
> [echo] Installing GUS/GOPredict
> [echo] .
> [echo] Installing GUS/ObjRelP
> [echo] generating Perl Objects
>
> BUILD FAILED
> /home/oragus35/GUS/project_home/install/build.xml:28: The following error occurred while executing this line:
> /home/oragus35/GUS/project_home/GUS/build.xml:225: exec returned: -1
>
> Total time: 7 seconds
|
From: Steve F. <sfi...@pc...> - 2005-07-13 02:01:05
|
Fabricio-

Please look in $PROJECT_HOME/GUS/Supported/plugins/perl for the set of supported plugins.

I think you will find the answer there. If you have any trouble at all write back.

Steve

Fabrício wrote:

>Spam detection software, running on the system "recreio.de9.ime.eb.br", has
>identified this incoming email as possible spam. The original message
>has been attached to this so you can view it (if it isn't spam) or block
>similar future email. If you have any questions, see
>the administrator of that system for details.
>
>Content preview: Hello all, We would like to thank all of you for GUS
> Installation help. Now we have a 3.5 version installed ok! [...]
>
>Content analysis details: (5.2 points, 5.0 required)
>
> pts rule name              description
>---- ---------------------- --------------------------------------------------
> 0.5 BR_RECEIVED_SPAMMER    Received from a DSL or dial-up address of spammers
> 0.1 HTML_60_70             BODY: Message is 60% to 70% HTML
> 0.0 HTML_MESSAGE           BODY: HTML included in message
> 1.1 RCVD_IN_SORBS_SOCKS    RBL: SORBS: sender is open SOCKS proxy server
>                            [200.165.172.249 listed in dnsbl.sorbs.net]
> 1.1 RCVD_IN_SORBS_HTTP     RBL: SORBS: sender is open HTTP proxy server
>                            [200.165.172.249 listed in dnsbl.sorbs.net]
> 1.1 RCVD_IN_DSBL           RBL: Received via a relay in list.dsbl.org
>                            [<http://dsbl.org/listing?200.165.172.249>]
> 1.1 RCVD_IN_NJABL_PROXY    RBL: NJABL: sender is an open proxy
>                            [200.165.172.249 listed in dnsbl.njabl.org]
> 0.1 RCVD_IN_SORBS          RBL: SORBS: sender is listed in SORBS
>                            [200.165.172.249 listed in dnsbl.sorbs.net]
> 0.1 RCVD_IN_NJABL          RBL: Received via a relay in dnsbl.njabl.org
>                            [200.165.172.249 listed in dnsbl.njabl.org]
>
>The original message was not completely plain text, and may be unsafe to
>open with some email clients; in particular, it may contain a virus,
>or confirm that your address can receive spam. If you wish to view
>it, it may be safer to save it to a file and open it with an editor.
>
> ------------------------------------------------------------------------
>
> Subject: New plugins
> From: Fabrício <fab...@de...>
> Date: Tue, 12 Jul 2005 18:35:19 -0300
> To: <gus...@li...>
>
> Hello all,
>
> We would like to thank all of you for GUS Installation help. Now we
> have a 3.5 version installed ok!
>
> Our doubts now are about the Plugins. As we noticed some plugins to
> load data seem to be modified. For instance, we would like to load
> ExternalDatabase.xml and ExternalDatabaseRelease.xml files into the
> respective tables and we noticed that the UpdateGusFromXML Plugin
> demonstrated in the Bootstrap data wiki isn't present.
>
> Now we would like to ask you what is the plugin responsible to do this
> data load in ExternalDatabase and ExternalDatabaseRelease tables.
>
> Do you have any documentation about the plugins modifications or about
> new plugins usage similar as present in Bootstrap data wiki?
>
> Thanks a lot again!
>
> Fabrício
|
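To browse that directory quickly, something like the following minimal Python sketch could be used. It is not part of GUS: the function name `list_supported_plugins` is hypothetical, and it assumes (without confirmation from this thread) that each plugin is a Perl module file ending in `.pm` directly inside that directory.

```python
import os
from pathlib import Path


def list_supported_plugins(project_home):
    """Return sorted plugin names found under GUS/Supported/plugins/perl."""
    plugin_dir = Path(project_home) / "GUS" / "Supported" / "plugins" / "perl"
    if not plugin_dir.is_dir():
        return []
    # Assumption: each plugin is a Perl module file ending in ".pm".
    return sorted(p.stem for p in plugin_dir.glob("*.pm"))


if __name__ == "__main__":
    home = os.environ.get("PROJECT_HOME", "")
    names = list_supported_plugins(home)
    if not names:
        print("no plugin modules found; is PROJECT_HOME set?")
    for name in names:
        print(name)
```

Looking through the resulting names for one that mentions XML loading would be a reasonable starting point for the ExternalDatabase question.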
From: Michael S. <msa...@pc...> - 2005-07-13 00:22:50
|
Hi Sanjeev,

Can you provide the following:

Results of:

$ which generateGusObjects
$ echo $PROJECT_HOME
$ echo $GUS_HOME
$ echo $PATH
$ echo $GUS_CONFIG_FILE

(don't type the $ sign)

--Mike

On 7/12/05 7:08 PM, "Kumar, Sanjeev (Contr)" <San...@ng...> wrote:

> -----Original Message-----
> From: Kumar, Sanjeev (Contr)
> Sent: Tuesday, July 12, 2005 7:08 PM
> To: 'gus...@li...'
> Subject: RE: gus 3.5 installation error
>
> Hi,
> I am getting following error while doing the GUS3.5 installation.
> It has created schema object successfully , but at the time of creation of
> perl object it is throwing error message.
> Can anyone help me ?
>
> Thanks
> Sanjeev
>
> [oragus35@rdevse02 oragus35]$ build GUS install -append
> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writeGusProp
> [WritePropertyFiles] INFO: Skipping creation of GUS_CONFIG_FILE /home/oragus35/GUS/gus.properties -- already exists
> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writeInstallProp
> [WritePropertyFiles] INFO: Recreating install.prop file
> [WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writePluginProp
> [WritePropertyFiles] INFO: Recreating GUS-PluginMgr.prop file
> [copy] Copying 1 file to /home/oragus35/GUS/gus_home/config
> [copy] Copying 1 file to /home/oragus35/GUS/gus_home/config
> [echo] .
> [echo] Installing CBIL/Bio
> [echo] .
> [echo] Installing CBIL/CSP
> [echo] .
> [echo] Installing CBIL/Util
> [echo] .
> [echo] Installing CBIL/HQ
> [echo] .
> [echo] Installing CBIL/ObjectMapper
> [concat] No existing files and no nested text, doing nothing
> [echo] .
> [echo] Installing GUS/Supported
> [echo] .
> [echo] Installing GUS/Community
> [echo] .
> [echo] Installing GUS/DBAdmin
> [echo] .
> [echo] Installing GUS/GOPredict
> [echo] .
> [echo] Installing GUS/ObjRelP
> [echo] generating Perl Objects
>
> BUILD FAILED
> /home/oragus35/GUS/project_home/install/build.xml:28: The following error occurred while executing this line:
> /home/oragus35/GUS/project_home/GUS/build.xml:225: exec returned: -1
>
> Total time: 7 seconds
|
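The five checks requested above can also be gathered in one pass. The following is a minimal Python sketch, not a GUS tool: the function name `collect_gus_diagnostics` is hypothetical, and it reads only the command name and environment variables that appear in this thread.

```python
import os
import shutil

# Names taken from the thread; everything else here is illustrative.
COMMAND = "generateGusObjects"
VARIABLES = ["PROJECT_HOME", "GUS_HOME", "PATH", "GUS_CONFIG_FILE"]


def collect_gus_diagnostics(environ=None):
    """Return the requested values as an ordered dict of label -> string."""
    env = os.environ if environ is None else environ
    report = {}
    # Equivalent of `which generateGusObjects`: search the PATH for the command.
    found = shutil.which(COMMAND, path=env.get("PATH"))
    report["which " + COMMAND] = found or "(not found on PATH)"
    # Equivalent of `echo $VAR`, with unset variables flagged explicitly.
    for name in VARIABLES:
        report["echo $" + name] = env.get(name, "(not set)")
    return report


if __name__ == "__main__":
    for label, value in collect_gus_diagnostics().items():
        print(f"{label}: {value}")
```

Pasting the full output into a reply avoids the back-and-forth over which command produces which line.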
From: Kumar, S. (Contr) <San...@ng...> - 2005-07-12 23:09:07
|
-----Original Message-----
From: Kumar, Sanjeev (Contr)
Sent: Tuesday, July 12, 2005 7:08 PM
To: 'gus...@li...'
Subject: RE: gus 3.5 installation error

Hi,
I am getting following error while doing the GUS3.5 installation.
It has created schema object successfully , but at the time of creation of perl object it is throwing error message.
Can anyone help me ?

Thanks
Sanjeev

[oragus35@rdevse02 oragus35]$ build GUS install -append
[WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writeGusProp
[WritePropertyFiles] INFO: Skipping creation of GUS_CONFIG_FILE /home/oragus35/GUS/gus.properties -- already exists
[WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writeInstallProp
[WritePropertyFiles] INFO: Recreating install.prop file
[WritePropertyFiles] Jul 12, 2005 7:10:43 PM org.gusdb.install.WritePropertyFileTask writePluginProp
[WritePropertyFiles] INFO: Recreating GUS-PluginMgr.prop file
[copy] Copying 1 file to /home/oragus35/GUS/gus_home/config
[copy] Copying 1 file to /home/oragus35/GUS/gus_home/config
[echo] .
[echo] Installing CBIL/Bio
[echo] .
[echo] Installing CBIL/CSP
[echo] .
[echo] Installing CBIL/Util
[echo] .
[echo] Installing CBIL/HQ
[echo] .
[echo] Installing CBIL/ObjectMapper
[concat] No existing files and no nested text, doing nothing
[echo] .
[echo] Installing GUS/Supported
[echo] .
[echo] Installing GUS/Community
[echo] .
[echo] Installing GUS/DBAdmin
[echo] .
[echo] Installing GUS/GOPredict
[echo] .
[echo] Installing GUS/ObjRelP
[echo] generating Perl Objects

BUILD FAILED
/home/oragus35/GUS/project_home/install/build.xml:28: The following error occurred while executing this line:
/home/oragus35/GUS/project_home/GUS/build.xml:225: exec returned: -1

Total time: 7 seconds
|
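When reading a log like the one above, the useful part is the chain of build-file locations after BUILD FAILED; the last one is closest to the task that actually failed. The following is a minimal Python sketch for pulling those out. The function name `failure_locations` is hypothetical, and it assumes each location sits on a single line, as in unwrapped Ant output (a mail client may have wrapped them).

```python
import re

# Matches lines such as "/path/to/build.xml:225: exec returned: -1".
LOCATION = re.compile(r"^\s*(?P<file>\S+\.xml):(?P<line>\d+):\s*(?P<message>.*)$")


def failure_locations(ant_output):
    """Return (file, line, message) tuples that follow "BUILD FAILED"."""
    locations = []
    seen_failure = False
    for raw in ant_output.splitlines():
        if raw.strip() == "BUILD FAILED":
            seen_failure = True
            continue
        if seen_failure:
            match = LOCATION.match(raw)
            if match:
                locations.append(
                    (match["file"], int(match["line"]), match["message"].strip())
                )
    return locations


if __name__ == "__main__":
    import sys

    # Usage: pipe the saved build output into this script.
    for file_name, line_no, message in failure_locations(sys.stdin.read()):
        print(f"{file_name} line {line_no}: {message}")
```

For this log the innermost entry is GUS/build.xml line 225 with "exec returned: -1", which only says that an external command failed. Since the last step echoed is "generating Perl Objects", running that command by hand is presumably the quickest way to see the underlying error.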
From: <fab...@de...> - 2005-07-12 21:35:44
|
Spam detection software, running on the system "recreio.de9.ime.eb.br", has
identified this incoming email as possible spam. The original message
has been attached to this so you can view it (if it isn't spam) or block
similar future email. If you have any questions, see
the administrator of that system for details.

Content preview: Hello all, We would like to thank all of you for GUS
Installation help. Now we have a 3.5 version installed ok! [...]

Content analysis details: (5.2 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.5 BR_RECEIVED_SPAMMER    Received from a DSL or dial-up address of spammers
 0.1 HTML_60_70             BODY: Message is 60% to 70% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 1.1 RCVD_IN_SORBS_SOCKS    RBL: SORBS: sender is open SOCKS proxy server
                            [200.165.172.249 listed in dnsbl.sorbs.net]
 1.1 RCVD_IN_SORBS_HTTP     RBL: SORBS: sender is open HTTP proxy server
                            [200.165.172.249 listed in dnsbl.sorbs.net]
 1.1 RCVD_IN_DSBL           RBL: Received via a relay in list.dsbl.org
                            [<http://dsbl.org/listing?200.165.172.249>]
 1.1 RCVD_IN_NJABL_PROXY    RBL: NJABL: sender is an open proxy
                            [200.165.172.249 listed in dnsbl.njabl.org]
 0.1 RCVD_IN_SORBS          RBL: SORBS: sender is listed in SORBS
                            [200.165.172.249 listed in dnsbl.sorbs.net]
 0.1 RCVD_IN_NJABL          RBL: Received via a relay in dnsbl.njabl.org
                            [200.165.172.249 listed in dnsbl.njabl.org]

The original message was not completely plain text, and may be unsafe to
open with some email clients; in particular, it may contain a virus,
or confirm that your address can receive spam. If you wish to view
it, it may be safer to save it to a file and open it with an editor.
|