From: William P. <wil...@ya...> - 2010-03-04 19:04:44
On Mar 4, 2010, at 12:50 PM, Rutger Vos wrote:

> I don't think I would have expected T or N matrices: I've certainly
> never seen a continuous matrix or a distance matrix. But obviously
> very many (probably the majority) should be Q matrices.

Yeah, we don't have code for dealing with T. We do have code for dealing with N, but none of the TB1 data has N. New studies, however, should be able to submit N-type matrices.

We have plenty of nucleotide or amino acid matrices, but I think they are all treated as S. This is because I think the real distinction is that S is where each scoring is stored in its own matrix-element record, while Q is where rows are concatenated into long strings and stored in text fields. We reserved Q as the fallback in case our software could not perform well enough to store large DNA matrices as S type (storing a long string as text is obviously more efficient). The downside of Q is that you're limited to 26 + 10 character states (unless we invented a special type of column delimiter), so our first effort was to try to get all discrete data into S. S is more cleanly normalized, but takes up a lot more memory.

bp
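
For illustration only, here is a rough sketch of the S versus Q distinction described above. It is not the actual storage code: the function names, the in-memory record layout, and the choice of digits followed by uppercase letters as the 36 symbols are all assumptions made for the example. The only detail taken from the message is that Q allows 26 + 10 states per cell when no column delimiter is used.

```python
import string

# Assumed symbol set for Q-type rows: 10 digits + 26 letters = 36 states.
# The ordering (digits first, then A-Z) is a guess made for this sketch.
Q_SYMBOLS = string.digits + string.ascii_uppercase


def to_s_records(matrix):
    """S-style: one record per scoring, as (row label, column index, state)."""
    return [
        (row_label, col, state)
        for row_label, states in matrix.items()
        for col, state in enumerate(states)
    ]


def to_q_rows(matrix):
    """Q-style: each row concatenated into one string, one symbol per cell."""
    rows = {}
    for row_label, states in matrix.items():
        for state in states:
            # Without a column delimiter, a cell must fit in a single symbol.
            if not 0 <= state < len(Q_SYMBOLS):
                raise ValueError(f"state {state} does not fit in one symbol")
        rows[row_label] = "".join(Q_SYMBOLS[s] for s in states)
    return rows


# Hypothetical two-row matrix with four characters per row.
matrix = {"taxon_a": [0, 1, 10, 35], "taxon_b": [2, 0, 1, 1]}
s_records = to_s_records(matrix)  # one tuple per cell
q_rows = to_q_rows(matrix)        # one string per row
```

The S form yields one record per cell, so the record count grows with rows times columns, while the Q form keeps one string per row but rejects any state beyond the 36 single-symbol codes.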