From: William P. <wil...@ya...> - 2009-05-28 21:22:38
On May 28, 2009, at 5:00 PM, Rutger Vos wrote:

> * tree.label => this is the token captured thusly from a nexus tree
> description: /U?TREE\s+(.\S+?)\s*=/, i.e. a token that is always
> present in all nexus versions, but it's short and needs to be
> nexus-safe (w.r.t. quotes, spaces, comments)

Correct. It is also user-editable during the submission process. It is usually short, and typically takes the form of "Fig. 2" or "Appendix A" or "PAUP 1".

> * tree.title => entered by users in the TreeBASE1 (and 2) interface,
> not parsed or serialized to/from nexus. (The TITLE token that occurs
> in mesquite-nexus is optional and in any case applies to the enclosing
> block.)

Correct. This is usually longer, and often reflects the title legend for the figure in the paper.

> * matrix.title => analogous to tree.title, i.e. entered by users. It's
> not the TITLE token from characters blocks (again, that's a
> mesquite-ism; we could use that as a default value during the initial
> upload, but in general it's user supplied)

Correct, although I think we made it so that when matrices are downloaded (= reconstructed), each character block is assigned a Mesquite-style TITLE with the contents of matrix.title.

> * matrix.description => a longer, user-supplied description that
> wasn't used in TreeBASE1, hence we've populated it with the legacy
> identifiers.

Correct.

The remaining confusing thing in matrix.x is that there are two different things similar to "Data Type". One is a user-entered data type (e.g. "Morphological", or "Nucleic Acid") and the other is a nexus-entered DATATYPE (e.g. "Standard" or "DNA"). While they look similar, they are not redundant -- e.g. there are many "Nucleic Acid" matrices that are still coded as "DATATYPE=STANDARD".

There's also a "Tree Type", "Tree Kind" and "Tree Quality" -- should be self-evident from browsing the data.

bp
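
For reference, the tree.label capture quoted above can be tried out with a minimal Python sketch. This is only an illustration of that one regular expression, not TreeBASE's actual parser: the function name `extract_tree_label`, the sample tree statements, and the case-insensitive flag are all assumptions made for the example.

```python
import re

# Pattern as quoted in the thread. re.IGNORECASE is an assumption added
# here, on the grounds that NEXUS keywords are not case-sensitive.
TREE_LABEL = re.compile(r"U?TREE\s+(.\S+?)\s*=", re.IGNORECASE)


def extract_tree_label(statement):
    """Return the label token of a NEXUS TREE/UTREE statement, or None.

    Hypothetical helper for illustration only.
    """
    match = TREE_LABEL.search(statement)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Made-up sample statements, not taken from TreeBASE data.
    for stmt in (
        "TREE Fig._2 = [&R] ((A,B),C);",
        "UTREE PAUP_1 = (A,B,C);",
        "TREE 'Fig. 2' = (A,B);",
    ):
        print(repr(stmt), "->", repr(extract_tree_label(stmt)))
```

Because the capture group only accepts non-whitespace characters after its first character, a quoted label containing a space (the third sample) does not match at all, which is one way to see why the label has to be nexus-safe with respect to quotes and spaces.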