From: Rutger V. <rut...@gm...> - 2009-07-08 04:09:37
Sorry, the PhyloWS URL of the search example is (at present):
http://8ball.sdsc.edu:6666/treebase-web/phylows/study/find?query=dcterms.identifier=S2484&format=rss1&recordSchema=tree

On Tue, Jul 7, 2009 at 5:17 PM, Rutger Vos <rut...@gm...> wrote:
> Hi Hilmar, all,
>
> thanks for your comments!
>
>>> I notice that this departs a bit from the phylows that is proposed here.
>>> For example, the proposed phylows puts "/find/" before "/tree/", whereas
>>> you have it the other way.
>>
>> Right, this is not in compliance with the spec. find/ comes first as it
>> changes the resource from a record and its URI to a finder.
>
> Right, switching that around is fairly trivial, so I'll do that.
>
>> Also, find/taxon/ would imply that you are finding (and returning) taxa,
>> which if I understand correctly is not the case - rather it seems you have
>> one query parameter in the URI path (namely that you are searching by
>> taxon?) and one in the query string. So if this is searching trees, it needs
>> to be find/tree/, and if you are matching against taxon names, the query
>> parameter needs to be tb.taxon.name or whatever the blessed metadata term
>> for this purpose is.
>>
>> Third, recordSchema=tree means that you want records back in the tree
>> schema. Unless you have invented that schema meanwhile, this is in all
>> likelihood not what you want. Rather, the value should be nexml I suppose.
>> find/tree already implies that you are finding (and returning) trees, so
>> there is no point in expressing that redundantly in the query string. You
>> might want to specify that you only want the tree and not also the matrix,
>> but that would be a separate query parameter and should not be confounded
>> with the return format.
>
> Mmmmm... I think this warrants a little more discussion. It's probably
> true that for most implementors their searches can be conveniently
> decomposed into several domains (tree search/matrix search/taxon
> search/etc.) and that for each domain the metaphor is that of
> searching a single table where the CQL indices are that table's
> columns.
>
> Then, within each domain there is a limited number of concerns: how to
> search on the provided indices and how to format the results. For
> example, for a search like
> http://8ball.sdsc.edu:6666/treebase-web/search/studySearch.html?query=dcterms.identifier=S2484&format=rss1&recordSchema=tree
> the implementation is thus:
>
> * there is a self-contained study searcher
> * the searcher knows how predicates map onto columns in the study
>   table (e.g. dcterms.identifier is the same as study.id)
> * the searcher knows how to unpack a study object and get the trees out
>
> if instead we'd have phylows/tree/find?query=study.identifier=S2484,
> the implementation would be something like:
>
> * there is a tree searcher
> * the tree searcher needs to know not just about the tree table but
>   also about how all other predicates map onto all other tables, and how
>   they join with the tree table
> * the tree searcher needs to know how to traverse study objects and
>   where trees are inside the study object
> * (and similar overlap of concerns becomes necessary if we want the
>   trees for a given matrix, or for a taxon, or what have you)
>
> To me that seems like bad design. We'll lose any separation of concern
> and might end up with a lot of redundancy between searchers - and a
> lot more code (and bugs) to write. I realize that I'm overloading the
> "recordSchema" token (and should fix that) but some way of saying
> "search THIS domain and project the results into THAT domain" seems
> very, very handy - especially because CQL doesn't have a notion of
> joins.
>
> Rutger
>
> --
> Dr. Rutger A. Vos
> Department of zoology
> University of British Columbia
> http://www.nexml.org
> http://rutgervos.blogspot.com
>

--
Dr. Rutger A. Vos
Department of zoology
University of British Columbia
http://www.nexml.org
http://rutgervos.blogspot.com
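
P.S. To make the "search THIS domain and project the results into THAT domain" idea a bit more concrete, here is a rough Python sketch. It is only an illustration under assumptions, not the TreeBASE code: the in-memory STUDIES list, the placeholder tree and matrix names, and the functions find_url, search_studies and find are all made up for this example. Only the URL layout, the dcterms.identifier predicate and the study.id mapping come from the message above.

```python
from urllib.parse import urlencode

# Base of the PhyloWS service as quoted in the message above.
BASE = "http://8ball.sdsc.edu:6666/treebase-web/phylows"


def find_url(domain, query, fmt="rss1", record_schema=None):
    """Build a <domain>/find URL (hypothetical helper).

    safe="=" keeps the '=' inside the CQL query unescaped, so the result
    looks like the URL in the message; percent-encoding it would also be valid.
    """
    params = {"query": query, "format": fmt}
    if record_schema is not None:
        params["recordSchema"] = record_schema
    return f"{BASE}/{domain}/find?" + urlencode(params, safe="=")


# Hypothetical stand-in for the study table; tree/matrix names are placeholders.
STUDIES = [
    {"id": "S2484", "trees": ["tree-a", "tree-b"], "matrices": ["matrix-a"]},
    {"id": "S0001", "trees": ["tree-c"], "matrices": []},
]

# The study searcher only knows how predicates map onto its own columns.
STUDY_INDEX = {"dcterms.identifier": "id"}

# Projections: how to unpack a study object into another domain.
PROJECTIONS = {
    "study": lambda study: [study["id"]],
    "tree": lambda study: study["trees"],
    "matrix": lambda study: study["matrices"],
}


def search_studies(predicate, value):
    """Search the study domain on one indexed predicate."""
    column = STUDY_INDEX[predicate]
    return [study for study in STUDIES if study[column] == value]


def find(predicate, value, project_to="study"):
    """Search the study domain, then project the hits into another domain."""
    project = PROJECTIONS[project_to]
    return [item for study in search_studies(predicate, value)
            for item in project(study)]


if __name__ == "__main__":
    print(find_url("study", "dcterms.identifier=S2484", record_schema="tree"))
    print(find("dcterms.identifier", "S2484", project_to="tree"))
```

The point of the sketch is that the searcher never needs to know about the tree table: adding another projection is one more entry in PROJECTIONS, not another searcher with its own joins.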