From: <rv...@us...> - 2009-07-02 10:38:55
Revision: 156
          http://treebase.svn.sourceforge.net/treebase/?rev=156&view=rev
Author:   rvos
Date:     2009-07-02 10:38:23 +0000 (Thu, 02 Jul 2009)

Log Message:
-----------
Adding documentation page for URL API

Added Paths:
-----------
    trunk/treebase-web/src/main/webapp/help/urlAPI.jsp

Added: trunk/treebase-web/src/main/webapp/help/urlAPI.jsp
===================================================================
--- trunk/treebase-web/src/main/webapp/help/urlAPI.jsp (rev 0)
+++ trunk/treebase-web/src/main/webapp/help/urlAPI.jsp 2009-07-02 10:38:23 UTC (rev 156)
@@ -0,0 +1,110 @@
+<%@ include file="/common/taglibs.jsp"%>
+
+<title>URL API</title>
+<content tag="heading">URL API</content>
+<body id="submissions"/>
+<p>
+ The TreeBASE2 website provides users with simple ways to navigate the underlying data
+ programmatically. This page describes the stateless web service interface and URL architecture
+ that can be used to search the web site and obtain data in a variety of formats with rich semantics.
+</p>
+<h3>PhyloWS support</h3>
+<p>
+ The site structure described here is designed to be compliant with the emerging
+ <a href="http://evoinfo.nescent.org/PhyloWS">PhyloWS</a> standard. One of the tenets of the
+ standard is that URLs contain a <strong>/phylows/</strong> delimiter below which the standard
+ recommends a <a href="https://www.nescent.org/wg_evoinfo/PhyloWS/REST">simple API</a> to dereference
+ phylogenetic data by their accession numbers. In the examples below, the url fragments come
+ immediately below the <strong>/phylows/</strong> delimiter (everything between the
+ <strong>http://</strong> and <strong>phylows</strong> is considered
+ subject to change, likely to be stabilized using <a href="http://purl.org">purl</a> addresses).
+</p>
+<h3>Site sections</h3>
+<p>The data on the TreeBASE2 website are organized in four subsections:</p>
+<ul>
+ <li><strong>taxon/</strong> <em>operational taxonomic units, taxonomic mappings and outlinks</em></li>
+ <li><strong>matrix/</strong> <em>character state matrices, morphological character definitions</em></li>
+ <li><strong>tree/</strong> <em>contains trees and tree nodes</em></li>
+ <li><strong>study/</strong> <em>full submission records, including citation and analysis records</em></li>
+</ul>
+<p>
+ Within those four sections, every item in the TreeBASE2 database can be de-referenced by appending
+ the item's full identifier to the right section name. For example, <strong>tree/TB2:Tr2227</strong>
+ represents a tree (and returns a simple RDF file to describe the tree). For some classes of objects,
+ these short addresses can be passed a <strong>format</strong>
+ parameter to specify in which data format to represent the object:
+ <a href="/treebase-web/phylows/study/TB2:S1787?format=html">study/TB2:S1787?format=html</a>.
+ Identifiers that match any of the following expressions can be represented as <strong>nexml</strong>,
+ <strong>nexus</strong>, <strong>rdf</strong> or <strong>html</strong> (i.e. in a web page):
+</p>
+<ul>
+ <li><strong>matrix/TB2:M[0-9]+</strong> <em>character state matrix</em></li>
+ <li><strong>tree/TB2:Tr[0-9]+</strong> <em>phylogenetic tree</em></li>
+ <li><strong>study/TB2:S[0-9]+</strong> <em>study record</em></li>
+</ul>
+<h3>NeXML support</h3>
+<p>
+ The <strong>nexml</strong> and the <strong>rdf</strong> download options both use output
+ generated by the java support libraries available from the
+ <a href="http://nexml.org/nexml/java">nexml website</a>. The website uses the nexml annotation
+ feature extensively to transmit all the metadata stored by the database.
+ Nexml annotations
+ are <a href="http://www.w3.org/TR/xhtml-rdfa-primer/">RDFa</a> compliant element structures
+ that use <a href="http://www.w3.org/TR/curie/">CURIE</a> strings to identify metadata properties,
+ and @content attributes to store the property value. For example, this (simplified) annotation:
+ <strong>
+ <meta content="uBio:2538170" property="tb:identifier.ubio"/>
+ </strong>
+ means that the element that encloses it has a special kind of identifier attached to it, namely
+ one that TreeBASE recognizes as originating in <a href="/treebase-web/phylows/taxon/uBio:2538170">uBio</a>.
+</p>
+<p>
+ The salient part is
+ the CURIE string predicate <strong>tb:identifier.ubio</strong>, which is one of a
+ <a href="http://spreadsheets.google.com/pub?key=rL--O7pyhR8FcnnG5-ofAlw">long list</a> of
+ proposed predicates that are written in TreeBASE's NeXML output and can be used as
+ <a href="http://www.loc.gov/standards/sru/specs/cql.html">CQL</a> search predicates. The predicates
+ proposed (and now experimentally transmitted) are intended to be subclasses of predicates
+ from commonly used vocabularies. For example, <strong>tb:identifier.ubio</strong> inherits from
+ <a href="http://dublincore.org/documents/dcmi-terms/#terms-identifier">dcterms:identifier</a> and
+ so any of the latter's semantics apply to the former, which is refined to indicate that the
+ value is a uBio namebank ID.
+</p>
+<h3>Searching</h3>
+<p>
+ The TreeBASE website can be searched using a subset of constructs from the
+ <a href="http://www.loc.gov/standards/sru/specs/cql.html">CQL</a> specification.
+ Specifically,
+ the predicates
+ <a href="http://spreadsheets.google.com/pub?key=rL--O7pyhR8FcnnG5-ofAlw">listed here with
+ an asterisk</a> can be used in statements in the site section they apply to, such that, for example,
+ a taxon can be retrieved by its ncbi ID like so:
+ <div style="background-color:;padding:10px">
+ <strong>taxon/find?query=tb:identifier.ncbi=<em><ncbi taxon id></em></strong>
+ </div>
+ or by its name like so:
+ <div style="background-color:;padding:10px">
+ <strong>taxon/find?query=tb:title.taxon=<em><name></em></strong>
+ </div>
+ or using an exact match
+ (<strong>==</strong>) or a case-insensitive one (<strong>=/ignoreCase</strong>). These statements
+ can be combined with boolean <strong>and</strong>, <strong>or</strong> and <strong>not</strong>.
+ For example:
+ <div style="background-color:;padding:10px">
+ <strong>study/find?query=dcterms.contributor=Huelsenbeck or dcterms.contributor=Ronquist</strong>
+ </div>
+ Finally, searching can be modified to project the results from one section into those of another. The
+ effect is roughly the same as switching between tabs in the search section: if the results are a
+ list of trees and you click on the matrix search tab, the trees are converted to the set of matrices
+ on which the trees are based. This behaviour can be used by specifying the
+ <strong>recordSchema=<section></strong> argument, i.e.:
+ <div style="background-color:;padding:10px">
+ <strong>taxon/find?query=dcterms.title=="Homo sapiens"&recordSchema=tree</strong>
+ </div>
+ returns all the trees that have <em>Homo sapiens</em> in them.
+ By default, all these queries return a web page, but with a <strong>format=rss1</strong> argument
+ the search results are listed in an RDF compatible RSS1.0 file, i.e.:
+ <div style="background-color:;padding:10px">
+ <strong>taxon/find?query=tb:title.taxon=<em><name></em>&format=rss1</strong>
+ </div>
+ The returned results in RSS1.0 use the short urls of the form <strong><section>/<id></strong>, whose
+ returned resource descriptions (like <a href="/treebase-web/phylows/tree/TB2:Tr2227">
+ this</a> one) need to be scanned to discover suitable serialization formats.
\ No newline at end of file
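
The URL patterns documented in the added page can be assembled programmatically. The following is a minimal sketch in Python, not part of the commit. It assumes a hypothetical base URL (the page states that everything before /phylows/ is subject to change), and the function names item_url and find_url are illustrative only. The section names, identifier patterns, format values and the query, recordSchema and format parameters are taken from the page text. The sketch only builds URLs and sends no requests.

```python
import re
from urllib.parse import quote, urlencode

# Hypothetical base URL: the help page says the part before /phylows/ may change.
BASE = "http://example.org/treebase-web/phylows"

# The four site sections named on the help page.
SECTIONS = {"taxon", "matrix", "tree", "study"}

# Identifier patterns that the page lists as accepting a format parameter.
FORMATTED_ITEMS = {
    "matrix": re.compile(r"TB2:M[0-9]+"),
    "tree": re.compile(r"TB2:Tr[0-9]+"),
    "study": re.compile(r"TB2:S[0-9]+"),
}
ITEM_FORMATS = {"nexml", "nexus", "rdf", "html"}


def item_url(section, identifier, fmt=None, base=BASE):
    """Build the short address <section>/<id>, optionally with ?format=..."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section}")
    url = f"{base}/{section}/{quote(identifier, safe=':')}"
    if fmt is None:
        return url
    pattern = FORMATTED_ITEMS.get(section)
    if pattern is None or not pattern.fullmatch(identifier):
        raise ValueError(f"{section}/{identifier} is not listed as accepting a format")
    if fmt not in ITEM_FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    return url + "?" + urlencode({"format": fmt})


def find_url(section, query, record_schema=None, fmt=None, base=BASE):
    """Build a <section>/find URL; the CQL query string is percent-encoded."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section}")
    params = {"query": query}
    if record_schema is not None:
        # Project the results into another section, e.g. taxon results to trees.
        params["recordSchema"] = record_schema
    if fmt is not None:
        # The page mentions rss1 as the alternative to the default web page.
        params["format"] = fmt
    return f"{base}/{section}/find?" + urlencode(params, quote_via=quote)
```

For example, item_url("study", "TB2:S1787", fmt="html") yields the study address used in the page, and find_url("taxon", 'dcterms.title=="Homo sapiens"', record_schema="tree") yields the percent-encoded form of the recordSchema example. Percent-encoding the query is a choice of this sketch; the page shows the queries unencoded.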