Re: [Treebase-devel] [PhyloWS] PhyloWS, CQL, NeXML on TreeBASE2


Sorry, the PhyloWS URL of the search example is (at present):

http://8ball.sdsc.edu:6666/treebase-web/phylows/study/find?query=dcterms.identifier=S2484&format=rss1&recordSchema=tree
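
For readers who want to see which part of that URL does what, here is a minimal sketch in Python. It is not TreeBASE code: the function name `parse_phylows_url` is made up for illustration, and the sketch assumes the query is a single `index=term` CQL clause, as in the example above.

```python
from urllib.parse import urlsplit, parse_qs


def parse_phylows_url(url):
    """Split a PhyloWS-style search URL into its parts (illustrative only)."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)

    # Path segments after "phylows", e.g. ["study", "find"].
    segments = [s for s in parts.path.split("/") if s]
    resource_path = segments[segments.index("phylows") + 1:]

    # Assumption: the CQL query is one "index=term" clause, so splitting
    # on the first "=" is enough. Real CQL needs a proper parser.
    index, _, term = params["query"][0].partition("=")

    return {
        "resource_path": resource_path,
        "cql_index": index,
        "cql_term": term,
        "format": params.get("format", [None])[0],
        "recordSchema": params.get("recordSchema", [None])[0],
    }


if __name__ == "__main__":
    example = ("http://8ball.sdsc.edu:6666/treebase-web/phylows/study/find"
               "?query=dcterms.identifier=S2484&format=rss1&recordSchema=tree")
    for key, value in parse_phylows_url(example).items():
        print(key, "=", value)
```

The path names the domain being searched (study), the `query` parameter carries the CQL predicate, and `format` and `recordSchema` describe how the results come back.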

On Tue, Jul 7, 2009 at 5:17 PM, Rutger Vos<rut...@gm...> wrote:
> Hi Hilmar, all,
>
> thanks for your comments!
>
>>> I notice that this departs a bit from the phylows that is proposed here.
>>>  For example, the proposed phylows puts "/find/" before "/tree/", whereas
>>> you have it the other way.
>>
>> Right, this is not in compliance with the spec. find/ comes first as it
>> changes the resource from a record and its URI to a finder.
>
> Right, switching that around is fairly trivial, so I'll do that.
>
>> Also, find/taxon/ would imply that you are finding (and returning) taxa,
>> which if I understand correctly is not the case - rather it seems you have
>> one query parameter in the URI path (namely that you are searching by
>> taxon?) and one in the query string. So if this is searching trees, it needs
>> to be find/tree/, and if you are matching against taxon names, the query
>> parameter needs to be tb.taxon.name or whatever the blessed metadata term
>> for this purpose is.
>>
>> Third, recordSchema=tree means that you want records back in the tree
>> schema. Unless you have invented that schema meanwhile, this is in all
>> likelihood not what you want. Rather, the value should be nexml I suppose.
>> find/tree already implies that you are finding (and returning) trees, so
>> there is no point in expressing that redundantly in the query string. You
>> might want to specify that you only want the tree and not also the matrix,
>> but that would be a separate query parameter and should not be confounded
>> with the return format.
>
> Mmmmm... I think this warrants a little more discussion. It's probably
> true that for most implementors their searches can be conveniently
> decomposed into several domains (tree search/matrix search/taxon
> search/etc.) and that for each domain the metaphor is that of
> searching a single table where the CQL indices are that table's
> columns.
>
> Then, within each domain there is a limited number of concerns: how to
> search on the provided indices and how to format the results. For
> example, for a search like
> http://8ball.sdsc.edu:6666/treebase-web/search/studySearch.html?query=dcterms.identifier=S2484&format=rss1&recordSchema=tree
> the implementation is thus:
>
> * there is a self-contained study searcher
> * the searcher knows how predicates map onto columns in the study
> table (e.g. dcterms.identifier is the same as study.id)
> * the searcher knows how to unpack a study object and get the trees out
>
> If instead we had phylows/tree/find?query=study.identifier=S2484,
> the implementation would be something like:
>
> * there is a tree searcher
> * the tree searcher needs to know not just about the tree table but
> also about how all other predicates map onto all other tables, and how
> they join with the tree table
> * the tree searcher needs to know how to traverse study objects and
> where trees are inside the study object
> * (and similar overlap of concerns becomes necessary if we want the
> trees for a given matrix, or for a taxon, or what have you)
>
> To me that seems like bad design. We'll lose any separation of concerns
> and might end up with a lot of redundancy between searchers, and a
> lot more code (and bugs) to write. I realize that I'm overloading the
> "recordSchema" token (and should fix that), but some way of saying
> "search THIS domain and project the results into THAT domain" seems
> very, very handy, especially because CQL doesn't have a notion of
> joins.
>
> Rutger
>
> --
> Dr. Rutger A. Vos
> Department of zoology
> University of British Columbia
> http://www.nexml.org
> http://rutgervos.blogspot.com
>
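
To make the "search THIS domain and project the results into THAT domain" idea from the quoted message concrete, here is a rough Python sketch. Everything in it is hypothetical: the classes, the in-memory `STUDIES` list, the tree IDs and Newick strings, and the `project_to` argument are stand-ins for illustration, not the TreeBASE2 implementation or anything in the PhyloWS spec.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    tree_id: str
    newick: str


@dataclass
class Study:
    study_id: str
    trees: list = field(default_factory=list)


# Hypothetical in-memory stand-in for the study table (made-up data).
STUDIES = [
    Study("S2484", [Tree("T1", "(A,(B,C));"), Tree("T2", "((A,B),C);")]),
    Study("S1000", [Tree("T3", "(X,Y);")]),
]


class StudySearcher:
    """Self-contained searcher: only knows how CQL indices map onto
    columns (here: attributes) of the study record."""

    INDEX_TO_ATTR = {"dcterms.identifier": "study_id"}

    def find(self, index, term):
        attr = self.INDEX_TO_ATTR[index]
        return [s for s in STUDIES if getattr(s, attr) == term]


# Projections live apart from the searching: each one only knows how to
# unpack study hits into another domain.
PROJECTIONS = {
    "study": lambda studies: studies,
    "tree": lambda studies: [t for s in studies for t in s.trees],
}


def search_and_project(index, term, project_to="study"):
    """Search the study domain, then project the hits into `project_to`."""
    hits = StudySearcher().find(index, term)
    return PROJECTIONS[project_to](hits)


if __name__ == "__main__":
    for tree in search_and_project("dcterms.identifier", "S2484", "tree"):
        print(tree.tree_id, tree.newick)
```

In this sketch the searcher never needs to know about joins to other tables. Adding a matrix projection would mean one more entry in `PROJECTIONS` rather than changes to every searcher.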

-- 
Dr. Rutger A. Vos
Department of zoology
University of British Columbia
http://www.nexml.org
http://rutgervos.blogspot.com