Thread: [Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Don G. <gil...@bi...> - 2005-03-07 20:54:16

Todd, others,

Sounds like there is general support.  For specifics, my quick thought is that it
would be easier, maybe better, to start with one top-level folder and hang the
options off of that, with some "MUST" options and many "MAY" options.  I would probably
prefer this sort of organization:

  my.org/genome{s}/    
    -- return information/help page with links, MUST offer
   { plurals are tricky - do people remember them? }
  my.org/genome/dna  == return full genome dna in fasta format  
         genome/genome == alias to /dna
  my.org/genome/protein{s} == return " proteins in fasta
         genome/proteome  == alias to proteins
  my.org/genome/transcript{s} ==  transcripts in fasta
         genome/transcriptome  == alias to transcripts
  my.org/genome/feature{s} ==  features in GFF
         genome/gff  == alias to features

   -- other options as people want to support them
       genome/versions/ ...
       genome/species/ ..
   -- and/or use CGI options for these "?species=x&version=y"

But we should decide on good common terms for the paths above and for the option names.
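
The alias idea above can be sketched as a small lookup table. This is only an illustration: the function name canonical_endpoint and the choice of singular spellings as the canonical form are assumptions, since the proposal leaves the plural question open.

```python
# Aliases from the proposal above, mapped to one canonical endpoint name.
# Using the singular form as canonical is an assumption, not part of the proposal.
ALIASES = {
    "genome": "dna",
    "proteome": "protein",
    "proteins": "protein",
    "transcriptome": "transcript",
    "transcripts": "transcript",
    "gff": "feature",
    "features": "feature",
}
CANONICAL = {"dna", "protein", "transcript", "feature"}


def canonical_endpoint(name):
    """Return the canonical endpoint for a path segment, or None if unknown."""
    name = name.strip("/").lower()
    if name in CANONICAL:
        return name
    return ALIASES.get(name)
```

A server could run each request segment after genome/ through this lookup, so that genome/proteome and genome/proteins both land on the same handler.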

-- Don

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Don G. <gil...@bi...> - 2005-03-16 07:39:44

Here is a flybase example
http://preview.flybase.net/genome/

How close is this to what other MODs think should be a common
genome URL?

- Don
-- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
-- gil...@in...--http://marmot.bio.indiana.edu/

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Kara D. <ka...@ge...> - 2005-03-16 15:14:44

Works for me!

Is the idea we'd have a page like Don's 
(http://preview.flybase.net/genome/) that would serve as an 
index/description page, then have subdirectories that directly go to 
the data file of interest?

For eg:
genome/dna/
directly to download page for chromosomes in fasta format

genome/feature/
(and/or genome/gff/)
directly download GFF

genome/protein/
directly download protein sequences in fasta format

genome/gene/
(and/or genome/transcript/
and/or genome/orf/)
directly download DNA sequences of all genes in fasta format

If this works for everyone, let's implement!

-Kara

On Mar 16, 2005, at 2:39 AM, Don Gilbert wrote:

> Here is a flybase example
> http://preview.flybase.net/genome/
>
> How close is this to what other MODs think should be a common
> genome URL?
>
> - Don
> -- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
> -- gil...@in...--http://marmot.bio.indiana.edu/

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Lincoln S. <ls...@cs...> - 2005-03-22 20:45:09

Looks great, but I think we do need to have the species in the URL; also we've 
started keeping track of non-coding RNAs, and I think everyone will want to 
soon.  So how about the following elaborations?

 http://your.site/genome/
  leads to index page for species

 http://your.site/genome/Binomial_name/
  leads to index for releases for species
  Binomial_name

 http://your.site/genome/Binomial_name/release_name/
  leads to index for named release

 http://your.site/genome/Binomial_name/current/
  leads to index for current release

 http://your.site/genome/Binomial_name/current/dna
  leads directly to FASTA file containing big DNA fragments
  (e.g. chromosomes)

 http://your.site/genome/Binomial_name/current/mrna
  leads directly to FASTA file containing spliced
  mRNA transcript sequences

 http://your.site/genome/Binomial_name/current/ncrna
  leads directly to FASTA file containing non-coding
  RNA sequences

 http://your.site/genome/Binomial_name/current/protein
  leads directly to FASTA file for protein downloads

 http://your.site/genome/Binomial_name/current/feature
  leads directly to GFF3 file for feature downloads
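
As a minimal sketch of the layout above, a client might assemble these URLs as follows. The function name genome_url and its parameter names are hypothetical; only the path segments come from the list above. Index pages get a trailing slash and data endpoints do not, mirroring the examples.

```python
# Data endpoints named in the list above.
DATA_TYPES = ("dna", "mrna", "ncrna", "protein", "feature")


def genome_url(site, species=None, release=None, data_type=None):
    """Build an index or data URL following the proposed layout."""
    if release is not None and species is None:
        raise ValueError("a release needs a species")
    if data_type is not None and release is None:
        raise ValueError("a data type needs a release (e.g. 'current')")
    if data_type is not None and data_type not in DATA_TYPES:
        raise ValueError("unknown data type: %r" % data_type)
    segments = [s for s in (species, release, data_type) if s is not None]
    url = "/".join([site.rstrip("/"), "genome"] + segments)
    # Index pages end in a slash; data endpoints point straight at a file.
    return url if data_type is not None else url + "/"
```

For example, genome_url("http://your.site", "Binomial_name", "current", "dna") gives the FASTA endpoint for the current release.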

Lincoln


On Wednesday 16 March 2005 10:14 am, Kara Dolinski wrote:
> Works for me!
>
> Is the idea we'd have a page like Don's
> (http://preview.flybase.net/genome/) that would serve as an
> index/description page, then have subdirectories that directly go to
> the data file of interest?
>
> For eg:
> genome/dna/
> directly to download page for chromosomes in fasta format
>
> genome/feature/
> (and/or genome/gff/)
> directly download GFF
>
> genome/protein/
> directly download protein sequences in fasta format
>
> genome/gene/
> (and/or genome/transcript/
> and/or genome/orf/)
> directly download DNA sequences of all genes in fasta format
>
> If this works for everyone, let's implement!
>
> -Kara
>

-- 
Lincoln Stein
ls...@cs...
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Kara D. <ka...@ge...> - 2005-03-22 21:05:54

These elaborations look great to me.
Unless there are objections expressed in the next couple of days, I'll 
put this implementation on the SGD to-do list.

On Mar 22, 2005, at 3:44 PM, Lincoln Stein wrote:

> Looks great, but I think we do need to have the species in the URL; 
> also we've
> started keeping track of non-coding RNAs, and I think everyone will 
> want to
> soon.  So how about the following elaborations?
>
>  http://your.site/genome/
>   leads to index page for species
>
>  http://your.site/genome/Binomial_name/
>   leads to index for releases for species
>   Binomial_name
>
>  http://your.site/genome/Binomial_name/release_name/
>   leads to index for named release
>
>  http://your.site/genome/Binomial_name/current/
>   leads to index for current release
>
>  http://your.site/genome/Binomial_name/current/dna
>   leads directly to FASTA file containing big DNA fragments
>   (e.g. chromosomes)
>
>   http://your.site/genome/Binomial_name/current/mrna
>     leads directly to FASTA file containing spliced
>   mRNA transcript sequences
>
>  http://your.site/genome/Binomial_name/current/ncrna
>       leads directly to FASTA file containing non-coding
>     RNA sequences
>
>  http://your.site/genome/Binomial_name/current/protein
>   leads directly to FASTA file for protein downloads
>
>  http://your.site/genome/Binomial_name/current/feature
>   leads directly to GFF3 file for feature downloads
>
> Lincoln
>
>

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Aaron J. M. <am...@pc...> - 2005-03-22 21:13:25

Looks great, count us (ApiDB.org) in too ...

Do the index pages have a specified (i.e. parsable) format?  Perhaps 
/format/ can be a generic trailing argument for any url to specify 
/xml/ or /fasta/ or /genbank/, etc?
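
A rough sketch of how such a trailing format segment could be parsed, assuming the format names listed above and the species/release/data-type layout from the earlier proposal. The function name parse_genome_path and the returned keys are hypothetical.

```python
# Trailing format segments suggested above; the set is illustrative only.
FORMATS = {"xml", "fasta", "genbank"}


def parse_genome_path(path):
    """Split /genome/<species>/<release>/<data_type>[/<format>] into fields."""
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0] != "genome":
        raise ValueError("not a genome URL path: %r" % path)
    fields = segments[1:]
    fmt = None
    # A recognised format name in last position is treated as the elaborator.
    if fields and fields[-1] in FORMATS:
        fmt = fields.pop()
    if len(fields) > 3:
        raise ValueError("too many path segments: %r" % path)
    fields += [None] * (3 - len(fields))
    species, release, data_type = fields
    return {"species": species, "release": release,
            "data_type": data_type, "format": fmt}
```

One consequence of this design is that a species or release literally named "xml", "fasta" or "genbank" would be misread as a format, so the reserved words would need to be agreed on.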

-Aaron

On Mar 22, 2005, at 3:44 PM, Lincoln Stein wrote:

> Looks great, but I think we do need to have the species in the URL; 
> also we've
> started keeping track of non-coding RNAs, and I think everyone will 
> want to
> soon.  So how about the following elaborations?
>
>  http://your.site/genome/
>   leads to index page for species
>
>  http://your.site/genome/Binomial_name/
>   leads to index for releases for species
>   Binomial_name
>
>  http://your.site/genome/Binomial_name/release_name/
>   leads to index for named release
>
>  http://your.site/genome/Binomial_name/current/
>   leads to index for current release
>
>  http://your.site/genome/Binomial_name/current/dna
>   leads directly to FASTA file containing big DNA fragments
>   (e.g. chromosomes)
>
>   http://your.site/genome/Binomial_name/current/mrna
>     leads directly to FASTA file containing spliced
>   mRNA transcript sequences
>
>  http://your.site/genome/Binomial_name/current/ncrna
>       leads directly to FASTA file containing non-coding
>     RNA sequences
>
>  http://your.site/genome/Binomial_name/current/protein
>   leads directly to FASTA file for protein downloads
>
>  http://your.site/genome/Binomial_name/current/feature
>   leads directly to GFF3 file for feature downloads
>
> Lincoln
>
>
--
Aaron J. Mackey, Ph.D.
Dept. of Biology, Goddard 212
University of Pennsylvania       email:  am...@pc...
415 S. University Avenue         office: 215-898-1205
Philadelphia, PA  19104-6017     fax:    215-746-6697

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Lincoln S. <ls...@cs...> - 2005-03-23 00:22:15

I like the idea of elaborators, but let's keep it simple for the time being.  
I want people to implement this, and I know that if it gets much more 
complicated, I'll be one of the laggards.

Another thing to think about is whether we can use RSS format as an 
alternative to the index pages. This will let people receive updates on their 
iPods when new genomic data is available.

Lincoln

On Tuesday 22 March 2005 04:13 pm, Aaron J. Mackey wrote:
> Looks great, count us (ApiDB.org) in too ...
>
> Do the index pages have a specified (i.e. parsable) format?  Perhaps
> /format/ can be a generic trailing argument for any url to specify
> /xml/ or /fasta/ or /genbank/, etc?
>
> -Aaron
>

-- 
Lincoln Stein
ls...@cs...
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Aaron J. M. <am...@pc...> - 2005-03-23 13:35:15

I agree completely.

But I thought that "compliance" with any of this spec was purely 
optional; that is, nobody guarantees to support all 
datatypes/indices/formats (punishable by ... ?)  Thus, I was merely 
trying to flesh out the likely areas of feature creep.  Any unsupported 
URLs should probably return 404 Not Found.

And yes, RSS for the index is exactly what I had in mind.

For ApiDB-umbrella databases, Don's default genome/dna url may be 
difficult to do right (e.g. for PlasmoDB, there are 5 species; you 
might argue falciparum is the "default", but in other cases this might 
not be so clear).  I advocate that the "spec" should specify the 
binomial name (as defined by the NCBI Taxonomy scientific name, e.g. 
Giardia intestinalis, not Giardia lamblia or Giardia duodenalis).

Talk about feature creep, if /xml/ was a valid elaborator for features, 
you could then imagine further appended XPath queries to filter the 
dataset, leading to a technologically cheap queryable data warehouse 
(given an agreeable DTD, presumably ChadoXML ??)

-Aaron

On Mar 22, 2005, at 7:21 PM, Lincoln Stein wrote:

> I like the idea of elaborators, but let's keep it simple for the time 
> being.
> I want people to implement this, and I know that if it gets much more
> complicated, I'll be one of the laggards.
>
> Another thing to think about is whether we can use RSS format as an
> alternative to the index pages. This will let people receive updates 
> on their
> iPods when new genomic data is available.
>
> Lincoln
>
--
Aaron J. Mackey, Ph.D.
Dept. of Biology, Goddard 212
University of Pennsylvania       email:  am...@pc...
415 S. University Avenue         office: 215-898-1205
Philadelphia, PA  19104-6017     fax:    215-746-6697

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Scott C. <ca...@cs...> - 2005-03-23 13:50:10

On Tue, 2005-03-22 at 19:21 -0500, Lincoln Stein wrote:
> This will let people receive updates on their 
> iPods when new genomic data is available.
> 
Excellent Idea :-)
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         ca...@cs...
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Lincoln S. <ls...@cs...> - 2005-03-23 15:19:38

Attachments: standard_urls.txt

Ok, how does this look as the basis for a document that we can post at 
the Sequence Ontology and/or GMOD web sites?

Lincoln

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724

NOTE: Please copy Sandra Michelsen <mic...@cs...> on
all emails regarding scheduling and other time-critical topics.

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Kara D. <ka...@ge...> - 2005-03-23 16:04:19

Looks great to me, and I like the B_name idea (vs. Binomial_name).  
Less tricky typing is good.

One suggestion on the doc, from a previous point on this thread:  we 
might want to explicitly have a note for MODs that do not yet provide 
all file types, something like:

If your MOD does not yet provide a particular file type in this 
specification, then the URL for the unsupported file type should return 
a 404 Not Found
(or text that states "This file is not yet available for this MOD" or 
whatever behavior people think is most useful)....

I think this will be a common question that will come up.

On Mar 23, 2005, at 10:19 AM, Lincoln Stein wrote:

> Ok, how does this look as the basis for a document that we can post at
> the Sequence Ontology and/or GMOD web sites?
>
> Lincoln
>
> -- 
> Lincoln D. Stein
> Cold Spring Harbor Laboratory
> 1 Bungtown Road
> Cold Spring Harbor, NY 11724
>
> NOTE: Please copy Sandra Michelsen <mic...@cs...> on
> all emails regarding scheduling and other time-critical topics.
> <standard_urls.txt>

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Aaron J. M. <am...@pc...> - 2005-03-23 16:13:00

I like B_name too, but I'd suggest it might be a synonym for the 
Binomial_name; NCBI Taxonomy IDs might also be valid synonyms.

Also, we're going to run into subspecies/isolate issues, e.g. 
Plasmodium_falciparum_3D7 vs. Plasmodium_falciparum_DD2.  In this case, 
I'd think that

    http://your.site/genome/P_falciparum/releases/

might provide the index/RSS for all isolates, pointing to fully-named 
directories:

    http://your.site/genome/P_falciparum_3D7/v4.4/dna

And perhaps 501 Not Implemented is better than 404 Not Found?
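
To make the two options concrete, here is a small sketch of how a handler might pick the status code. The function name status_for and its parameters are hypothetical; the data-type names are the ones proposed earlier in the thread, and which code to use for unsupported types is left as a switch because it is still undecided.

```python
from http import HTTPStatus

# Data types named in the proposed layout earlier in this thread.
SPEC_TYPES = {"dna", "mrna", "ncrna", "protein", "feature"}


def status_for(data_type, provided, unsupported=HTTPStatus.NOT_FOUND):
    """Pick a status for a request, given the types this MOD provides.

    `unsupported` is the code returned for a type that is in the spec but
    not offered by this site: 404 by default, or 501 if that is preferred.
    """
    if data_type not in SPEC_TYPES:
        # Not part of the scheme at all.
        return HTTPStatus.NOT_FOUND
    if data_type in provided:
        return HTTPStatus.OK
    return unsupported
```

With 501, a client could tell "this site does not offer ncRNA yet" apart from "this URL is simply wrong", which is not possible if both cases return 404.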

Todd, are you planning on implementing these handlers as part of the 
Bio::GMOD toolkit?

-Aaron

On Mar 23, 2005, at 11:02 AM, Kara Dolinski wrote:

> Looks great to me, and I like the B_name idea (vs. Binomial_name).  
> Less tricky typing is good.
>
> One suggestion on the doc, from a previous point on this thread:  we 
> might want to explicitly have a note for MODs that do not yet provide 
> all file types, something like:
>
> If your MOD does not yet provide a particular file type in this 
> specification, then the URL for the unsupported file type should 
> return a 404 Not Found
> (or text that states "This file is not yet available for this MOD" or 
> whatever behavior people think is most useful)....
>
> I think this will be a common question that will come up.
>
>
>
--
Aaron J. Mackey, Ph.D.
Dept. of Biology, Goddard 212
University of Pennsylvania       email:  am...@pc...
415 S. University Avenue         office: 215-898-1205
Philadelphia, PA  19104-6017     fax:    215-746-6697

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Adrian R. T. <ar...@sa...> - 2005-03-23 19:46:41

Lincoln Stein wrote:

>Ok, how does this look as the basis for a document that we can post at 
>the Sequence Ontology and/or GMOD web sites?
>
>Lincoln
>  
>
Hi,

We'd be happy to implement this for GeneDB. A few comments:

a) I'd agree with Aaron J. Mackey that we need to be able to distinguish 
between different strains/isolates.

b) Presumably binomial name extends to trinomial name as necessary, i.e. we 
need to distinguish Trypanosoma_brucei_gambiense from our other 
Trypanosoma species.

c) I'll admit to no detailed knowledge of writing RSS but it sounds a 
confusing world (The myth of RSS compatibility, 
http://diveintomark.org/archives/2004/02/04/incompatible-rss)! Is it 
worth specifying a version sooner rather than later, or are the 
differences not great in practice?

Adrian

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Lincoln S. <ls...@cs...> - 2005-04-05 16:49:42

> b) Presumably binomial name extends to trinomial name as necessary
> ie we need to distinguish Trypanosoma_brucei_gambiense from our
> other trypanosoma species.

Sure, let's extend the convention to subspecies, strains and isolates.
I still like shortening the genus name.  What do you think?

	T_brucei_gambiense_gambiense_isolate0392
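
A minimal sketch of the shortening rule, assuming the input is a scientific name with parts separated by spaces or underscores. The function name short_name is hypothetical. Only the genus is abbreviated; species, subspecies, strain and isolate parts are kept as given.

```python
import re


def short_name(scientific_name):
    """Abbreviate the genus to its initial and join the parts with '_'."""
    parts = [p for p in re.split(r"[\s_]+", scientific_name.strip()) if p]
    if len(parts) < 2:
        raise ValueError("need at least genus and species: %r" % scientific_name)
    genus, rest = parts[0], parts[1:]
    return "_".join([genus[0].upper()] + rest)
```

Note that two genera sharing an initial and a species epithet would collide under this rule, so a site hosting both would still need the full name as a fallback.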

> c) I'll admit to no detailed knowledge of writing RSS but it sounds
> a confusing world (The myth of RSS compatibility,
> http://diveintomark.org/archives/2004/02/04/incompatible-rss)! Is
> it worth specifying a version sooner rather than later, or are the
> differences not great in practice?

RSS version 1 is easier to deal with and seems to be more broadly
supported.
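
For illustration, an RSS 1.0 release index could be generated along these lines using Python's standard library. This is a sketch under assumptions: the function name release_feed, the description text and the shape of the releases argument are made up, and nothing here has been agreed as the feed layout.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"


def release_feed(index_url, title, releases):
    """Build an RSS 1.0 document; `releases` is a list of (name, url) pairs."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("", RSS)  # serialize RSS elements without a prefix
    root = ET.Element("{%s}RDF" % RDF)
    channel = ET.SubElement(root, "{%s}channel" % RSS,
                            {"{%s}about" % RDF: index_url})
    ET.SubElement(channel, "{%s}title" % RSS).text = title
    ET.SubElement(channel, "{%s}link" % RSS).text = index_url
    ET.SubElement(channel, "{%s}description" % RSS).text = "Genome data releases"
    items = ET.SubElement(channel, "{%s}items" % RSS)
    seq = ET.SubElement(items, "{%s}Seq" % RDF)
    for name, url in releases:
        # The channel lists each item; the item itself hangs off the root.
        ET.SubElement(seq, "{%s}li" % RDF, {"{%s}resource" % RDF: url})
        item = ET.SubElement(root, "{%s}item" % RSS, {"{%s}about" % RDF: url})
        ET.SubElement(item, "{%s}title" % RSS).text = name
        ET.SubElement(item, "{%s}link" % RSS).text = url
    return ET.tostring(root, encoding="unicode")
```

Each new release would then show up as one item whose link is the release index URL.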

Lincoln


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724

NOTE: Please copy Sandra Michelsen <mic...@cs...> on
all emails regarding scheduling and other time-critical topics.

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Adrian R. T. <ar...@sa...> - 2005-04-11 13:40:04

Lincoln Stein wrote:

>>b) Presumably binomial name extends to trinomial name as necessary
>>ie we need to distinguish Trypanosoma_brucei_gambiense from our
>>other trypanosoma species.
>>    
>>
>
>Sure, let's extend the convention to subspecies, strains and isolates.  
>I still like shortening the genus name.  What do you think?
>
>	T_brucei_gambiense_gambiense_isolate0392
>
>  
>
Sorry, I've been on holiday. Yep, sounds fine to us.

Adrian

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Scott C. <ca...@cs...> - 2005-04-11 14:36:26

Hi All,

I'm glad Adrian sent this because it reminded me: would anybody be
willing to summarize what has been agreed to and what outstanding issues
there are with this for the GMOD meeting next month?

Thanks,
Scott


On Mon, 2005-04-11 at 14:39 +0100, Adrian Roy Tivey wrote:
> Lincoln Stein wrote:
> 
> >>b) Presumably binomial name extends to trinomial name as necessary
> >>ie we need to distinguish Trypanosoma_brucei_gambiense from our
> >>other trypanosoma species.
> >>    
> >>
> >
> >Sure, let's extend the convention to subspecies, strains and isolates.  
> >I still like shortening the genus name.  What do you think?
> >
> >	T_brucei_gambiense_gambiense_isolate0392
> >
> >  
> >
> Sorry, I've been on holiday. Yep, sounds fine to us.
> 
> Adrian
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         ca...@cs...
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Don G. <gil...@bi...> - 2005-03-23 02:27:34

> >  http://your.site/genome/Binomial_name/current/dna
> >   leads directly to FASTA file

To be clear, I take this to mean it should always be an HTTP
stream of FASTA data, not a redirect to an ftp:// file or another
http: path, for programmability.  Likewise for the other
endpoints. I assume gzip compression is allowed when the
browser/client accepts it.
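
A minimal client-side sketch of the gzip point, assuming the server marks a compressed body with the standard Content-Encoding header. The function names and the your.site host are placeholders for illustration, not part of any agreed spec.

```python
import gzip
import urllib.request
from typing import Optional


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request that tells the server gzip is acceptable."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def decode_body(raw: bytes, content_encoding: Optional[str]) -> str:
    """Return the body as text, decompressing only if the server used gzip."""
    if content_encoding and content_encoding.strip().lower() == "gzip":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")


def fetch_fasta(url: str) -> str:
    """Fetch one endpoint over HTTP and return the FASTA text."""
    with urllib.request.urlopen(build_request(url)) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)
```

A call such as fetch_fasta("http://your.site/genome/dna") would then return plain FASTA text whether or not the server chose to compress it.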

This sounds good to me. I'd also like to ask if we should
also support the simpler case which will work for many
people who want the primary species, most current data:

http://your.site/genome/dna

Not as critical if everyone supports a /genome/ index page.
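
As a rough illustration of the two URL forms discussed here, the sketch below builds either the full path or the short primary-species form. The helper name genome_url and its parameters are hypothetical; only the path layout comes from the proposal above.

```python
from typing import Optional


def genome_url(site: str, datatype: str, species: Optional[str] = None,
               version: str = "current") -> str:
    """Build a genome data URL in the proposed layout.

    With a species:    http://<site>/genome/<species>/<version>/<datatype>
    Without a species: http://<site>/genome/<datatype>  (primary species,
    most current data)
    """
    base = f"http://{site}/genome"
    if species is None:
        return f"{base}/{datatype}"
    return f"{base}/{species}/{version}/{datatype}"
```

For example, genome_url("your.site", "dna") gives the short form, and genome_url("your.site", "dna", species="Binomial_name") gives the full one.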

- Don

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Lincoln S. <ls...@cs...> - 2005-03-23 15:00:35

Hi,

Actually, I don't see any reason that it can't be a redirect to an
ftp:// URL.  Most Java and Perl client libraries that I'm aware of
will automatically follow the redirects.

Lincoln


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724

NOTE: Please copy Sandra Michelsen <mic...@cs...> on
all emails regarding scheduling and other time-critical topics.

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Lincoln S. <ls...@cs...> - 2005-03-23 15:04:31

One last question before we "go gold" on the first phase of this
scheme.  What do people think about replacing full Binomial_name with
B_name, e.g.:

	D_melanogaster
	G_intestinalis

I bring this up only because my favorite organism, Caenorhabditis, is
hard to spell!
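
A small sketch of how the B_name form could be derived from a full name, assuming the rule is simply "first letter of the genus, then the remaining parts joined by underscores". The function name abbreviate is hypothetical.

```python
def abbreviate(name: str) -> str:
    """Shorten 'Genus species [subspecies ...]' to 'G_species[_subspecies ...]'.

    Accepts either spaces or underscores between the parts of the name.
    """
    parts = name.replace("_", " ").split()
    if len(parts) < 2:
        raise ValueError(f"expected at least genus and species, got {name!r}")
    genus, rest = parts[0], parts[1:]
    # Keep only the capitalized genus initial; leave the other parts untouched.
    return "_".join([genus[0].upper(), *rest])
```

Under this rule "Drosophila melanogaster" becomes D_melanogaster, and a trinomial such as Trypanosoma_brucei_gambiense becomes T_brucei_gambiense. The shortening is lossy: the full genus cannot be recovered from the short form alone.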

Lincoln


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724

NOTE: Please copy Sandra Michelsen <mic...@cs...> on
all emails regarding scheduling and other time-critical topics.

[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Aaron J. M. <am...@pc...> - 2005-03-07 21:34:02

In your example URLs, "my.org/genome(s)/..."

Is this just to denote the special "data dump" section of the website? 
And if so, isn't there a more specific word to be used (e.g. "data" or 
"access" or "download" or such)?

Also, I think the whole must/may distinction is moot.  Everything is "may", 
there's nothing I "must" do ;)

Some additional paths to possibly provide:

/help    # Don's suggested default "index" page
/version # Todd's suggestion of current release version
/listing # a tab-delim list of available data, in various versions,
          # species and formats

My vote for common options:
   version
   species (by scientific name, by taxon id, ?)
   format (default FASTA or GFF, as appropriate)

And I realize my URL path syntax for specifying options is just candy 
coating over the CGI GET syntax, but I think that cleaner URLs are 
generally easier for people to memorize and to publish.  We'll 
follow the community "spec" (whatever it is), but will likely implement 
the URL path syntax also (if necessary).
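
To make the "candy coating" point concrete, here is a sketch that rewrites the CGI form into a path form. The path layout /genome/<species>/<version>/<datatype>, the "current" default and the function name are assumptions for illustration only; the thread had not settled on a layout at this point.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit, urlunsplit


def query_to_path(url: str, default_version: str = "current") -> str:
    """Rewrite ...?species=x&version=y into an equivalent path-style URL.

    URLs without a species option are returned unchanged. Any other
    options (for example format) stay in the query string.
    """
    parts = urlsplit(url)
    options = parse_qs(parts.query)
    if "species" not in options:
        return url
    species = options["species"][0]
    version = options.get("version", [default_version])[0]
    # Split /genome/dna into the prefix (/genome) and the data type (dna).
    prefix, _, datatype = parts.path.rstrip("/").rpartition("/")
    new_path = "/".join(
        [prefix, quote(species, safe=""), quote(version, safe=""), datatype]
    )
    remaining = {k: v for k, v in options.items()
                 if k not in ("species", "version")}
    query = urlencode(remaining, doseq=True)
    return urlunsplit((parts.scheme, parts.netloc, new_path, query, ""))
```

With this mapping, http://my.org/genome/dna?species=x&version=y and http://my.org/genome/x/y/dna name the same data, so a site could accept both and serve them from one handler.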

I also agree that we should provide both singular and plural forms of 
URL paths for convenience.

-Aaron


[Gmod-resources] Re: [GMOD-devel] Common MOD URL for genome data?

From: Lincoln S. <ls...@cs...> - 2005-03-07 23:03:37

Attachments: standard_urls.txt

Hi Folks,

Enclosed are my ideas about standard MOD URLs -- I think I wrote this 
way back in September 2004 and distributed it to the wormbase and 
gmod lists.

I like the various proposed extensions under genome/ (feature/dna, 
etc.).  I'm not so happy with providing both singular and plural 
forms, though.  It's a good principle to enforce a direct 
correspondence between a URL-based ID and the data content.

Lincoln


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724

NOTE: Please copy Sandra Michelsen <mic...@cs...> on
all emails regarding scheduling and other time-critical topics.