From: Rutger V. <rut...@gm...> - 2009-05-28 21:00:51
Hi,

Val, MJD and I just had a meeting where we tried to reconstruct the origins and meaning of the different metadata text strings attached to treebase objects: tree.label, tree.title, matrix.title, matrix.description. I theorized the following, and am now looking for confirmation from Bill (this is all in the context of identifying fields that clients may search on through the web interface):

* tree.label => this is the token captured thusly from a nexus tree description: /U?TREE\s+(.\S+?)\s*=/, i.e. a token that is always present in all nexus versions, but it's short and needs to be nexus-safe (w.r.t. quotes, spaces, comments)

* tree.title => entered by users in the TreeBASE1 (and 2) interface, not parsed or serialized to/from nexus. (The TITLE token that occurs in mesquite-nexus is optional and in any case applies to the enclosing block.)

* matrix.title => analogous to tree.title, i.e. entered by users. It's not the TITLE token from characters blocks (again, that's a mesquite-ism; we could use that as a default value during the initial upload, but in general it's user supplied)

* matrix.description => a longer, user-supplied description that wasn't used in TreeBASE1, hence we've populated it with the legacy identifiers.

The reason this question comes up is that we're trying to sketch out a small controlled vocabulary of search keys (the actual key strings to be matched to CDAO, DC and others (perhaps including an ontology of TB2-specific subclasses of CDAO terms)). We probably don't expect clients to know the difference between label, title and description so we might lump them into a more generic "description" key that is matched against all of them.

Rutger

--
Dr. Rutger A. Vos
Department of zoology
University of British Columbia
http://www.nexml.org
http://rutgervos.blogspot.com
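
For illustration only, here is a minimal Python sketch of the two ideas in the message above: capturing tree.label with the quoted pattern, and a generic "description" key that is matched against all four text fields. It is not TreeBASE code. The function names, the DESCRIPTION_FIELDS tuple, the dict-based record and the case-insensitive substring matching are all assumptions made for the example; only the regular expression and the four field names come from the message.

```python
import re

# Pattern copied verbatim from the message; compiled as-is (case-sensitive).
TREE_LABEL = re.compile(r"U?TREE\s+(.\S+?)\s*=")


def tree_label(statement):
    """Return the label token of a nexus tree statement, or None if no match.

    Hypothetical helper: it only applies the quoted pattern and does not
    handle quoting or comments.
    """
    match = TREE_LABEL.search(statement)
    return match.group(1) if match else None


# Assumption: the four text fields lumped under one generic "description" key.
DESCRIPTION_FIELDS = (
    "tree.label",
    "tree.title",
    "matrix.title",
    "matrix.description",
)


def matches_description(record, query):
    """True if query occurs, ignoring case, in any of the lumped fields.

    `record` is a plain dict keyed by field name (an assumption for this
    sketch); missing or None values are treated as empty strings.
    """
    needle = query.lower()
    return any(
        needle in (record.get(field) or "").lower()
        for field in DESCRIPTION_FIELDS
    )
```

A client searching on "description" would then hit a record whichever of the four fields holds the text. One thing the sketch makes visible: as written, the pattern `(.\S+?)` needs at least two characters, so a one-character label such as `TREE a = ...` is not captured.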