[Treebase-guts] SF.net SVN: treebase:[156] trunk/treebase-web/src/main/webapp/help/urlAPI.jsp

Revision: 156
          http://treebase.svn.sourceforge.net/treebase/?rev=156&view=rev
Author:   rvos
Date:     2009-07-02 10:38:23 +0000 (Thu, 02 Jul 2009)

Log Message:
-----------
Adding documentation page for URL API

Added Paths:
-----------
    trunk/treebase-web/src/main/webapp/help/urlAPI.jsp

Added: trunk/treebase-web/src/main/webapp/help/urlAPI.jsp
===================================================================

--- trunk/treebase-web/src/main/webapp/help/urlAPI.jsp	                        (rev 0)
+++ trunk/treebase-web/src/main/webapp/help/urlAPI.jsp	2009-07-02 10:38:23 UTC (rev 156)
@@ -0,0 +1,110 @@
+<%@ include file="/common/taglibs.jsp"%>
+
+<title>URL API</title>
+<content tag="heading">URL API</content>
+<body id="submissions"/>
+<p>
+	The TreeBASE2 website provides users with simple ways to navigate the underlying data
+	programmatically. This page describes the stateless web service interface and URL architecture
+	that can be used to search the web site and obtain data in a variety of formats with rich semantics. 
+</p>
+<h3>PhyloWS support</h3>
+<p>
+	The site structure described here is designed to be compliant with the emerging 
+	<a href="http://evoinfo.nescent.org/PhyloWS">PhyloWS</a> standard. One of the tenets of the
+	standard is that URLs contain a <strong>/phylows/</strong> delimiter below which the standard
+	recommends a <a href="https://www.nescent.org/wg_evoinfo/PhyloWS/REST">simple API</a> to dereference 
+	phylogenetic data by their accession numbers. In the examples below, the URL fragments come 
+	immediately below the <strong>/phylows/</strong> delimiter. Everything between the 
+	<strong>http://</strong> and <strong>phylows</strong> is considered
+	subject to change, and is likely to be stabilized using <a href="http://purl.org">PURL</a> addresses.
+</p>
+<h3>Site sections</h3>
+<p>The data on the TreeBASE2 website are organized in four subsections:</p>
+<ul>
+	<li><strong>taxon/</strong> <em>operational taxonomic units, taxonomic mappings and outlinks</em></li>
+	<li><strong>matrix/</strong> <em>character state matrices, morphological character definitions</em></li>
+	<li><strong>tree/</strong> <em>trees and tree nodes</em></li>
+	<li><strong>study/</strong> <em>full submission records, including citation and analysis records</em></li>
+</ul>
+<p>
+	Within those four sections, every item in the TreeBASE2 database can be dereferenced by appending
+	the item's full identifier to the appropriate section name. For example, <strong>tree/TB2:Tr2227</strong> 
+	represents a tree (and returns a simple RDF file to describe the tree). For some classes of objects, 
+	these short addresses can be passed a <strong>format</strong>
+	parameter to specify in which data format to represent the object: 
+	<a href="/treebase-web/phylows/study/TB2:S1787?format=html">study/TB2:S1787?format=html</a>.
+	Identifiers that match any of the following expressions can be represented as <strong>nexml</strong>, 
+	<strong>nexus</strong>, <strong>rdf</strong> or <strong>html</strong> (i.e. in a web page):
+</p>
+<ul>
+	<li><strong>matrix/TB2:M[0-9]+</strong> <em>character state matrix</em></li>
+	<li><strong>tree/TB2:Tr[0-9]+</strong> <em>phylogenetic tree</em></li>
+	<li><strong>study/TB2:S[0-9]+</strong> <em>study record</em></li>
+</ul>
+<h3>NeXML support</h3>
+<p>
+	The <strong>nexml</strong> and the <strong>rdf</strong> download options both use output 
+	generated by the Java support libraries available from the 
+	<a href="http://nexml.org/nexml/java">nexml website</a>. The website uses the nexml annotation 
+	feature extensively to transmit all the metadata stored by the database. Nexml annotations
+	are <a href="http://www.w3.org/TR/xhtml-rdfa-primer/">RDFa</a> compliant element structures
+	that use <a href="http://www.w3.org/TR/curie/">CURIE</a> strings to identify metadata properties, 
+	and @content attributes to store the property value. For example, this (simplified) annotation:
+	<strong>
+	&lt;meta content="uBio:2538170" property="tb:identifier.ubio"/&gt;
+	</strong>
+	means that the element that encloses it has a special kind of identifier attached to it, namely
+	one that TreeBASE recognizes as originating in <a href="/treebase-web/phylows/taxon/uBio:2538170">uBio</a>. 
+</p>
+<p>
+	The salient part is
+	the CURIE string predicate <strong>tb:identifier.ubio</strong>, which is one of a 
+	<a href="http://spreadsheets.google.com/pub?key=rL--O7pyhR8FcnnG5-ofAlw">long list</a> of
+	proposed predicates that are written in TreeBASE's NeXML output and can be used as 
+	<a href="http://www.loc.gov/standards/sru/specs/cql.html">CQL</a> search predicates. The predicates
+	proposed (and now experimentally transmitted) are intended to be subclasses of predicates
+	from commonly used vocabularies. For example, <strong>tb:identifier.ubio</strong> inherits from
+	<a href="http://dublincore.org/documents/dcmi-terms/#terms-identifier">dcterms:identifier</a> and
+	so any of the latter's semantics apply to the former, which is refined to indicate that the
+	value is a uBio namebank ID.
+</p>
+<h3>Searching</h3>
+<p>
+	The TreeBASE website can be searched using a subset of constructs from the 
+	<a href="http://www.loc.gov/standards/sru/specs/cql.html">CQL</a> specification. Specifically,
+	the predicates 
+	<a href="http://spreadsheets.google.com/pub?key=rL--O7pyhR8FcnnG5-ofAlw">listed here with 
+	an asterisk</a> can be used in statements in the site section they apply to. For example,
+	a taxon can be retrieved by its NCBI ID like so:
+	<div style="background-color:;padding:10px">
+		<strong>taxon/find?query=tb:identifier.ncbi=<em>&lt;ncbi taxon id&gt;</em></strong>
+	</div>
+	or by its name like so:
+	<div style="background-color:;padding:10px">
+		<strong>taxon/find?query=tb:title.taxon=<em>&lt;name&gt;</em></strong>
+	</div>
+	The comparison can also be an exact match 
+	(<strong>==</strong>) or a case-insensitive one (<strong>=/ignoreCase</strong>). These statements
+	can be combined with boolean <strong>and</strong>, <strong>or</strong> and <strong>not</strong>.
+	For example:
+	<div style="background-color:;padding:10px">
+		<strong>study/find?query=dcterms.contributor=Huelsenbeck or dcterms.contributor=Ronquist</strong>
+	</div>
+	Finally, searching can be modified to project the results from one section into those of another. The
+	effect is roughly the same as switching between tabs in the search section: if the results are a
+	list of trees and you click on the matrix search tab, the trees are converted to the set of matrices
+	on which the trees are based. This behaviour can be used by specifying the 
+	<strong>recordSchema=&lt;section&gt;</strong> argument, e.g.:
+	<div style="background-color:;padding:10px">
+		<strong>taxon/find?query=dcterms.title=="Homo sapiens"&amp;recordSchema=tree</strong>
+	</div>	
+	returns all the trees that have <em>Homo sapiens</em> in them.
+	By default, all these queries return a web page, but with a <strong>format=rss1</strong> argument
+	the search results are listed in an RDF-compatible RSS 1.0 file, e.g.:
+	<div style="background-color:;padding:10px">
+		<strong>taxon/find?query=tb:title.taxon=<em>&lt;name&gt;</em>&amp;format=rss1</strong>
+	</div>	
+	The results returned in RSS 1.0 use short URLs of the form <strong>&lt;section&gt;/&lt;id&gt;</strong>, whose
+	returned resource descriptions (like <a href="/treebase-web/phylows/tree/TB2:Tr2227">
+	this</a> one) need to be scanned to discover suitable serialization formats. 
\ No newline at end of file
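
A rough client-side sketch of the URL scheme documented in the page above, in Python, may help when reading the diff. It is not part of the commit. The base URL is a placeholder (the page itself says everything before /phylows/ is subject to change), and the helper names item_url and find_url are hypothetical. Query values are percent-encoded here, whereas the page shows them unencoded.

```python
import re
from urllib.parse import quote, urlencode

# Placeholder host and path: an assumption, not a real TreeBASE address.
BASE = "http://example.org/treebase-web/phylows"

# The four site sections named in the help page.
SECTIONS = {"taxon", "matrix", "tree", "study"}

# Identifier patterns the page lists as supporting a format parameter.
FORMAT_PATTERNS = {
    "matrix": r"TB2:M[0-9]+",
    "tree": r"TB2:Tr[0-9]+",
    "study": r"TB2:S[0-9]+",
}
FORMATS = {"nexml", "nexus", "rdf", "html"}


def item_url(section, identifier, fmt=None, base=BASE):
    """Build <base>/<section>/<identifier>, optionally with ?format=<fmt>."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section}")
    url = f"{base}/{section}/{quote(identifier, safe=':')}"
    if fmt is not None:
        pattern = FORMAT_PATTERNS.get(section)
        if pattern is None or not re.fullmatch(pattern, identifier):
            raise ValueError(f"no format options documented for {section}/{identifier}")
        if fmt not in FORMATS:
            raise ValueError(f"unknown format: {fmt}")
        url += "?" + urlencode({"format": fmt})
    return url


def find_url(section, query, record_schema=None, fmt=None, base=BASE):
    """Build <base>/<section>/find?query=... with optional recordSchema and format."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section}")
    params = {"query": query}
    if record_schema is not None:
        params["recordSchema"] = record_schema
    if fmt is not None:
        params["format"] = fmt  # the page documents rss1 for search results
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{base}/{section}/find?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(item_url("study", "TB2:S1787", "html"))
    print(find_url("taxon", 'dcterms.title=="Homo sapiens"', record_schema="tree"))
```

Whether the server accepts percent-encoded CQL operators is not stated in the page, so the encoding step should be treated as an assumption to verify against a running instance.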

