Re: [Treebase-devel] Are "matrixtypes" ignored?

On Mar 4, 2010, at 12:50 PM, Rutger Vos wrote:

> I don't think I would have expected T or N matrices: I've certainly
> never seen a continuous matrix or a distance matrix. But obviously
> very many (probably the majority) should be Q matrices.

Yeah, we don't have code for dealing with T. We do have code for dealing with N, but none of the TB1 data has N. New studies, however, should be able to submit N-type matrices.

We have plenty of nucleotide and amino acid matrices, but I think they are all treated as S. I think the real distinction is this: in S, each scoring is stored in its own matrix-element record, while in Q, rows are concatenated into long strings and stored in text fields. We reserved Q as a fallback in case our software could not perform well enough to store large DNA matrices as S type (storing a long string as text is obviously more efficient). The downside of Q is that you're limited to 26 + 10 = 36 character states (unless we invented a special type of column delimiter), so our first effort was to try to get all discrete data into S. S is more cleanly normalized, but takes up a lot more memory.
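
To make the S versus Q distinction concrete, here is a rough sketch in Python. It is not the TreeBASE code: the function names, the tuple layout, and the exact symbol alphabet (digits plus upper-case letters) are assumptions made only for illustration.

```python
import string

# Assumed Q alphabet: 10 digits + 26 letters = 36 single-character states.
Q_SYMBOLS = string.digits + string.ascii_uppercase


def to_s_records(matrix):
    """S-style (hypothetical): one record per cell, as (row label, column, state)."""
    return [
        (label, col, state)
        for label, row in matrix.items()
        for col, state in enumerate(row)
    ]


def to_q_rows(matrix):
    """Q-style (hypothetical): each row concatenated into a single string."""
    rows = {}
    for label, row in matrix.items():
        for state in row:
            # Without a column delimiter, every state must be exactly one symbol.
            if len(state) != 1 or state not in Q_SYMBOLS:
                raise ValueError(f"state {state!r} does not fit in one Q symbol")
        rows[label] = "".join(row)
    return rows


matrix = {
    "taxon_a": ["A", "C", "G", "T"],
    "taxon_b": ["A", "C", "G", "A"],
}
s_records = to_s_records(matrix)  # one tuple per cell
q_rows = to_q_rows(matrix)        # one string per row
```

With this layout the S form grows by one record per cell, while the Q form stays at one string per row; the price is that a state such as "10" cannot be stored in Q without some delimiter.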

bp