From: <pi...@pc...> - 2004-11-08 19:45:13
Hi Ed,

Having some difficult scheduling problems today, but I (& Steve) will dedicate
time to this tomorrow or later today.

-Debbie

Quoting Ed Robinson <ed_...@be...>:

> The first step in writing such an adapter needs to be a document, though,
> which shows what fields, in what formats, go where in GUS. One of the main
> problems with the parsers is that they have been developed without a common
> document saying what kind of information goes where.
>
> To this end, I have a simple analysis of where our TCruzi and Crypto data are
> being loaded by the different parsers. I am attaching copies of these two
> brief documents in MS Word format. Presently this analysis is in Open Office
> format. I would like to use these to start developing a data-destination
> document that we can use as a standard for all further parser development.
>
> Also, I am not sure that this solution is really necessary for GFF format.
> Writing a GFF adapter involves two steps: 1. querying the data and 2. passing
> it correctly to BioPerl. The solution we have so far is simply to put the
> formatting information in the SQL query (it's one step). Of course this is a
> solution that is ignorant of the GUS object model. It would be nice to embed
> this process in an object which maps from GUS objects to BioPerl for a number
> of reasons. But I also think it might be something to put off until later
> since the formatted SQL query is a quick-and-dirty time saver.
>
> -Ed
>
> > From: Steve Fischer <sfi...@pc...>
> > Date: 2004/11/05 Fri AM 11:25:32 EST
> > To: gusdev-gusdev <gus...@li...>,
> > "Aaron J. Mackey" <am...@pc...>
> > Subject: [Gusdev-gusdev] GUS & bioperl
> >
> > folks-
> >
> > We should immediately explore a GUS <--> bioperl adaptor.
> >
> > we would use it for:
> > - Genbank and TIGR XML -> GUS
> > - GUS -> GBrowse
> > - possibly GUS -> Chado
> >
> > Here is what Aaron has to say about parsing Genbank, etc:
> >
> > Bio::SeqIO::GenBank is the BioPerl parser for GenBank; it parses and
> > represents all of it (split between Bio::Seq [sequence, id, accession,
> > etc], Bio::SeqFeature [everything found in the feature table] and
> > Bio::Annotation [comments, references, etc] objects). Similar parsers
> > exist for practically all common sequence formats (including TIGR-XML
> > and other genome annotation-relevant formats).
> >
> > Here is what Aaron has to say about GBrowse:
> >
> > IMO, the "best" way to generate (valid) GFF is to use BioPerl's tools
> > for GFF manipulation: Bio::Tools::GFF in older BioPerl releases, and
> > Bio::Feature::IO in the latest development release (due out any day
> > now, as soon as I stop reading my email; for now, you can get it from
> > CVS).
> >
> > To use these tools, you build Bio::SeqFeature objects that represent
> > the items you wish to dump as GFF; thus you can build complicated
> > hierarchies of gene models, exons, CDS, UTR, etc, adding deeply
> > structured attributes/annotations to each, and let the BioPerl GFF code
> > figure out how to represent it (in GFF2 or GFF3) so that other tools
> > (including GBrowse) can read it.
> >
> > _______________________________________________
> > Gusdev-gusdev mailing list
> > Gus...@li...
> > https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
>
> Ed Robinson
> 255 Deerfield Rd
> Bogart, GA 30622
> (706)425-9181
>
> Sweet Caroline
> good times never seemed so good.
> I've been inclined
> to believe they never would.
> --Neil Diamond
>
> We're just a bunch of idiots.
> --Johnny Damon
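
For readers following the thread: the "formatting information in the SQL query" approach that Ed describes above can be sketched roughly as follows. This is only an illustration in Python against an in-memory SQLite database. The `feature` table, its column names, and the sample rows are all hypothetical and are not the GUS schema; the attribute column uses a GFF3-style `ID=` tag as an assumption. The query itself returns finished nine-column GFF lines, which is why it is a single step and also why it knows nothing about the GUS object model.

```python
import sqlite3

# Hypothetical, simplified feature table for illustration only.
# This is NOT the GUS schema.
SCHEMA = """
CREATE TABLE feature (
    seq_id    TEXT,
    source    TEXT,
    type      TEXT,
    start_pos INTEGER,
    end_pos   INTEGER,
    strand    TEXT,
    name      TEXT
)
"""

# The GFF formatting lives in the SQL itself: each row comes back as one
# tab-delimited, nine-column line (score and phase are left as '.').
# char(9) is SQLite's way of producing a tab character.
GFF_QUERY = """
SELECT seq_id    || char(9) || source  || char(9) || type || char(9) ||
       start_pos || char(9) || end_pos || char(9) || '.'  || char(9) ||
       strand    || char(9) || '.'     || char(9) || 'ID=' || name
FROM feature
ORDER BY seq_id, start_pos
"""


def dump_gff(conn):
    """Run the formatted query and return one GFF line per feature row."""
    return [row[0] for row in conn.execute(GFF_QUERY)]


def demo():
    """Load two made-up features and dump them as GFF lines."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO feature VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            ("chr1", "example", "gene", 100, 900, "+", "gene1"),
            ("chr1", "example", "exon", 150, 400, "+", "exon1"),
        ],
    )
    return dump_gff(conn)


if __name__ == "__main__":
    print("\n".join(demo()))
```

The trade-off is the one raised in the thread: this is quick to write, but parent/child relationships (gene, exon, CDS) and structured attributes would have to be hand-built in SQL, which is the part an object-mapping adapter would handle instead.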