On May 28, 2009, at 5:00 PM, Rutger Vos wrote:
> * tree.label => this is the token captured thusly from a nexus tree
> description: /U?TREE\s+(.\S+?)\s*=/, i.e. a token that is always
> present in all nexus versions, but it's short and needs to be
> nexus-safe (w.r.t. quotes, spaces, comments)
Correct. It is also user-editable during the submission process. It is
usually short, and typically takes the form of "Fig. 2" or "Appendix
A" or "PAUP 1".
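For illustration, the capture quoted above can be tried out in Python. This is only a sketch: the pattern is copied from the message, while the function name `tree_label` and the case-insensitive matching are assumptions, not the actual TreeBASE parser.

```python
import re

# Pattern as quoted in the message; re.IGNORECASE is an assumption,
# since NEXUS commands are generally written in either case.
TREE_LABEL = re.compile(r"U?TREE\s+(.\S+?)\s*=", re.IGNORECASE)


def tree_label(statement):
    """Return the label captured from a TREE/UTREE statement, or None."""
    match = TREE_LABEL.search(statement)
    return match.group(1) if match else None


print(tree_label("TREE PAUP_1 = [&R] ((A,B),C);"))
print(tree_label("UTREE Fig2=(A,B);"))
# A quoted label containing a space is not captured by this pattern,
# which is one reason the label has to be kept nexus-safe.
print(tree_label("TREE 'Fig. 2' = (A,B);"))
```

Because `\S` cannot cross a space, a label such as 'Fig. 2' would need underscores or a more permissive pattern before it could round-trip through this capture.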
> * tree.title => entered by users in the TreeBASE1 (and 2) interface,
> not parsed or serialized to/from nexus. (The TITLE token that occurs
> in mesquite-nexus is optional and in any case applies to the enclosing
> block.)
Correct. This is usually longer, and often reflects the title legend
for the figure in the paper.
> * matrix.title => analogous to tree.title, i.e. entered by users. It's
> not the TITLE token from characters blocks (again, that's a
> mesquite-ism; we could use that as a default value during the initial
> upload, but in general it's user supplied)
Correct, although I think we made it so that when matrices are
downloaded (= reconstructed), each character block is assigned a
Mesquite-style TITLE with the contents of matrix.title.
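A rough sketch of what writing matrix.title into a Mesquite-style TITLE command could look like on download. The helper names `nexus_quote` and `title_command` are hypothetical, and the quoting rule (single-quote anything that is not plain alphanumerics or underscores, doubling embedded quotes) is a simplified reading of NEXUS token rules, not the code TreeBASE actually uses.

```python
import re


def nexus_quote(token):
    """Return token unchanged if it is plain alphanumerics/underscores,
    otherwise single-quote it and double any embedded single quotes."""
    if re.fullmatch(r"[A-Za-z0-9_]+", token):
        return token
    return "'" + token.replace("'", "''") + "'"


def title_command(matrix_title):
    """Build a TITLE command for a character block from matrix.title."""
    return f"TITLE {nexus_quote(matrix_title)};"


print(title_command("Fig. 2 alignment"))
print(title_command("rbcL"))
```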
> * matrix.description => a longer, user-supplied description that
> wasn't used in TreeBASE1, hence we've populated it with the legacy
> identifiers.
Correct. The remaining source of confusion in matrix.x is that two
different fields both resemble "Data Type". One is a user-entered
data type (e.g. "Morphological", or "Nucleic Acid") and the other is
the DATATYPE read from the nexus file (e.g. "Standard" or "DNA").
While they look similar, they are not redundant -- e.g. there are many
"Nucleic Acid" matrices that are still coded as "DATATYPE=STANDARD".
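To make the distinction concrete, here is a minimal sketch that keeps the two values as separate fields. The class `MatrixRecord`, its field names, and the helper are all hypothetical and do not reflect the actual TreeBASE schema.

```python
from dataclasses import dataclass


@dataclass
class MatrixRecord:
    title: str
    user_data_type: str   # entered by the submitter, e.g. "Nucleic Acid"
    nexus_datatype: str   # DATATYPE from the nexus file, e.g. "Standard"


def nucleic_acid_coded_as_standard(matrix):
    """True when the submitter says "Nucleic Acid" but the file says STANDARD."""
    return (matrix.user_data_type.lower() == "nucleic acid"
            and matrix.nexus_datatype.lower() == "standard")


m = MatrixRecord("Fig. 2 alignment", "Nucleic Acid", "Standard")
print(nucleic_acid_coded_as_standard(m))
```

Storing both values lets a search on the user-entered type still find matrices whose files declare a different DATATYPE.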
There are also "Tree Type", "Tree Kind" and "Tree Quality" fields --
these should be self-evident from browsing the data.
bp