Re: [Treebase-devel] [eX-purgate bulk]: labels and titles and descriptions, oh my!


On May 28, 2009, at 5:00 PM, Rutger Vos wrote:

> * tree.label => this is the token captured thusly from a nexus tree
> description: /U?TREE\s+(.\S+?)\s*=/, i.e. a token that is always
> present in all nexus versions, but it's short and needs to be
> nexus-safe (w.r.t. quotes, spaces, comments)

Correct. It is also user-editable during the submission process. It is  
usually short, and typically takes the form of "Fig. 2" or "Appendix  
A" or "PAUP 1".
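
For illustration, here is a minimal Python sketch of that capture, using the pattern exactly as quoted above. The function name tree_label and the sample statements are made up for the example; this is not the actual TreeBASE parser. The pattern is case-sensitive as written, so passing re.IGNORECASE would be an extra assumption.

```python
import re

# Pattern as quoted in the message, moved out of its Perl-style delimiters.
# Case-sensitive as written; add re.IGNORECASE if mixed-case keywords
# need to match (that would be an assumption, not part of the quoted pattern).
TREE_LABEL = re.compile(r"U?TREE\s+(.\S+?)\s*=")


def tree_label(statement):
    """Return the label token of a nexus TREE/UTREE statement, or None."""
    match = TREE_LABEL.search(statement)
    return match.group(1) if match else None


# Hypothetical sample statements, for illustration only.
print(tree_label("TREE Fig._2 = [&R] ((A,B),C);"))
print(tree_label("UTREE PAUP_1 = (A,B,C);"))
```

Because the capture group only admits non-space characters after its first character, a label has to be a single whitespace-free token to be picked up this way, which is one reason it needs to be nexus-safe.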

> * tree.title => entered by users in the TreeBASE1 (and 2) interface,
> not parsed or serialized to/from nexus. (The TITLE token that occurs
> in mesquite-nexus is optional and in any case applies to the enclosing
> block.)

Correct. This is usually longer, and often reflects the title legend  
for the figure in the paper.

> * matrix.title => analogous to tree.title, i.e. entered by users. It's
> not the TITLE token from characters blocks (again, that's a
> mesquite-ism; we could use that as a default value during the initial
> upload, but in general it's user supplied)

Correct, although I think we made it so that when matrices are  
downloaded (= reconstructed), each character block is assigned a  
Mesquite-style TITLE with the contents of matrix.title.
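
A rough sketch of what that reconstruction step could look like, assuming the title is written as a single nexus token (quoted with single quotes, embedded single quotes doubled). The function names nexus_token and characters_block_header are hypothetical and do not come from the TreeBASE code base.

```python
import re


def nexus_token(text):
    """Return text as one nexus token, single-quoting it unless it is purely alphanumeric."""
    if re.fullmatch(r"[A-Za-z0-9]+", text):
        return text
    # Inside a quoted nexus token, a literal single quote is written twice.
    return "'" + text.replace("'", "''") + "'"


def characters_block_header(matrix_title):
    """Opening lines of a CHARACTERS block carrying a Mesquite-style TITLE."""
    return "BEGIN CHARACTERS;\n\tTITLE " + nexus_token(matrix_title) + ";"


# Hypothetical matrix.title value, for illustration only.
print(characters_block_header("Fig. 2 alignment"))
```

Quoting everything that is not purely alphanumeric is deliberately conservative: it also protects underscores, which nexus readers otherwise treat as blanks.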

> * matrix.description => a longer, user-supplied description that
> wasn't used in TreeBASE1, hence we've populated it with the legacy
> identifiers.

Correct. The remaining point of confusion in matrix.x is that there are  
two distinct fields that both resemble "Data Type". One is a user-entered  
data type (e.g. "Morphological" or "Nucleic Acid") and the other is the  
DATATYPE taken from the nexus file (e.g. "Standard" or "DNA"). While they  
look similar, they are not redundant -- e.g. there are many "Nucleic Acid"  
matrices that are still coded as "DATATYPE=STANDARD".
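
To make the distinction concrete, a small sketch that keeps the two values as separate fields. The class and field names (MatrixRecord, data_type, nexus_datatype) are invented for the example and are not the real TreeBASE schema.

```python
from dataclasses import dataclass


@dataclass
class MatrixRecord:
    """Hypothetical record holding both "data type" values side by side."""
    title: str
    data_type: str       # user-entered, e.g. "Morphological" or "Nucleic Acid"
    nexus_datatype: str  # DATATYPE from the nexus file, e.g. "Standard" or "DNA"


def is_nucleic_acid_coded_as_standard(matrix):
    """True for a "Nucleic Acid" matrix whose nexus DATATYPE is STANDARD."""
    return (matrix.data_type == "Nucleic Acid"
            and matrix.nexus_datatype.upper() == "STANDARD")


# Hypothetical example value, for illustration only.
m = MatrixRecord(title="Fig. 2 alignment", data_type="Nucleic Acid",
                 nexus_datatype="Standard")
print(is_nucleic_acid_coded_as_standard(m))
```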

There are also "Tree Type", "Tree Kind" and "Tree Quality" fields -- their  
meanings should be self-evident from browsing the data.

bp