Re: [Gusdev-gusdev] GUS & bioperl

On 5 Nov 2004, at 17:40, Steve Fischer wrote:

>  about gbrowse.  it is good that Haiming has put together a prototype
> for loading gus data into gbrowse.
>
>  but, as Aaron points out, we will likely be putting sophisticated
> data into gbrowse.  i would rather start on a strong foundation than
> invest resources into a solution that we will grow out of.
>
>  it should not be hard to put gus data into bioperl.
>

We would welcome GUS to Bioperl software :)

Ed has a point: mapping GUS objects to bioperl objects and back again
needs some thought. We put Bioperl data into GUS and we've
retrospectively documented what goes where so all developers understand
how things work. It also highlights anything that is "missing".

I hope GFF output has improved in the latest CVS version of Bioperl,
the stable 1.4 version was not up to scratch for me so I just wrote my
own :(
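
For illustration only (this is not the script mentioned above): a minimal
sketch, in Python rather than Perl, of writing GFF3 lines by hand. The
feature dictionary layout, the names escape() and gff3_line(), and the
sample gene are all assumptions made for this example; only the nine
tab-separated columns and the percent-escaping of reserved characters in
the attributes column come from the GFF3 format itself.

```python
# Hypothetical helpers for illustration; not part of GUS or BioPerl.
RESERVED = "%;=&,\t\n\r"  # characters that must be escaped in column 9


def escape(value):
    """Percent-encode characters that are reserved in GFF3 attributes."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in RESERVED else ch
        for ch in str(value)
    )


def gff3_line(feature):
    """Format one feature dict as a single nine-column GFF3 line."""
    attributes = ";".join(
        "{}={}".format(key, escape(val))
        for key, val in feature.get("attributes", {}).items()
    )
    columns = [
        feature["seqid"],
        feature.get("source", "."),
        feature["type"],
        str(feature["start"]),
        str(feature["end"]),
        str(feature.get("score", ".")),
        feature.get("strand", "."),
        str(feature.get("phase", ".")),
        attributes or ".",
    ]
    return "\t".join(columns)


# Made-up sample feature; the keys are this sketch's own convention.
gene = {
    "seqid": "chr1",
    "source": "GUS",
    "type": "gene",
    "start": 100,
    "end": 900,
    "strand": "+",
    "attributes": {"ID": "gene1", "Name": "a;b"},
}

print("##gff-version 3")
print(gff3_line(gene))
```

Missing score, strand and phase values fall back to ".", and the ";" in
the sample Name is escaped so it cannot be mistaken for an attribute
separator.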

>  steve
>
>  Ed Robinson wrote:
>
> The first step in writing such an adapter needs to be a document,
> though, which shows what fields, in what formats go where in GUS.  One
> of the main problems with the parsers is that they have been developed
> without a common document saying what kind of information goes where.
>
> To this end, I have a simple analysis of where our TCruzi and Crypto
> data are being loaded by the different parsers.  I am attaching copies
> of these two brief documents in MS Word format. Presently this analysis
> is in Open Office format.  I would like to use these to start
> developing a data-destination document that we can use as a standard
> for all further parser development.
>
> Also, I am not sure that this solution is really necessary for GFF
> format.  Writing a GFF adapter involves two steps: 1. querying the data
> and 2. passing it correctly to BioPerl.  The solution we have so far
> is simply to put the formatting information in the SQL query (it's one
> step).  Of course this is a solution that is ignorant of the GUS
> object model. It would be nice to embed this process in an object
> which maps from GUS objects to BioPerl for a number of reasons.  But I
> also think it might be something to put off until later since the
> formatted SQL query is a quick-and-dirty time saver.
>
> -Ed
>
>
>
>
>
> From: Steve Fischer <sfi...@pc...>
> Date: 2004/11/05 Fri AM 11:25:32 EST
> To: gusdev-gusdev <gus...@li...>,
>         "Aaron J. Mackey" <am...@pc...>
> Subject: [Gusdev-gusdev] GUS & bioperl
>
> folks-
>
> We should immediately explore a GUS <--> bioperl adaptor.
>
> we would use it for:
>    - Genbank and TIGR XML -> GUS
>    - GUS -> GBrowse
>    - possibly GUS -> Chado
>
> Here is what Aaron has to say about parsing Genbank, etc:
>
> Bio::SeqIO::GenBank is the BioPerl parser for GenBank; it parses and
> represents all of it (split between Bio::Seq [sequence, id, accession,
> etc], Bio::SeqFeature [everything found in the feature table] and
> Bio::Annotation [comments, references, etc] objects).  Similar parsers
> exist for practically all common sequence formats (including TIGR-XML
> and other genome annotation-relevant formats).
>
> Here is what Aaron has to say about GBrowse:
>
> IMO, the "best" way to generate (valid) GFF is to use BioPerl's tools
> for GFF manipulation: Bio::Tools::GFF in older BioPerl releases, and
> Bio::FeatureIO in the latest development release (due out any day
> now, as soon as I stop reading my email; for now, you can get it from
> CVS).
>
> To use these tools, you build Bio::SeqFeature objects that represent
> the items you wish to dump as GFF; thus you can build complicated
> hierarchies of gene models, exons, CDS, UTR, etc, adding deeply
> structured attributes/annotations to each, and let the BioPerl GFF code
> figure out how to represent it (in GFF2 or GFF3) so that other tools
> (including Gbrowse) can read it.
>
>
> _______________________________________________
> Gusdev-gusdev mailing list
> Gus...@li...
> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
>
>
> Ed Robinson
> 255 Deerfield Rd
> Bogart, GA 30622
> (706)425-9181
> Sweet Caroline
>
> good times never seemed so good.
> I've been inclined
> to believe they never would.
>      --Neil Diamond
>
>
> We're just a bunch of idiots.
>       --Johnny Damon