From: Rutger V. <rut...@gm...> - 2011-02-15 03:22:53
Hi guys,

Brian has an interesting use case:

---------- Forwarded message ----------
From: Brian T. Foley <bt...@la...>
Date: Thu, Feb 10, 2011 at 11:44 PM
Subject: Re: How do you align >100,000 sequences?
To: Rutger Vos <rut...@gm...>
Cc: bt...@la...

OK. I guess I can see some uses of such a thing, even if it is not
true phylogenetics, or the best method. Tree building is done in other
fields besides biology, so there may be tools in use by computer
scientists or librarians or something, that could work better or
faster than the tools I am used to in biology.

I was quite interested after poking around in TreeBase a bit
yesterday. I still don't find it easy to find my way to the data sets
I'd like, but the more I try the easier it is getting.

Keep me, and the HIV Databases, in mind if you have questions about
large sets. Viruses leave no fossils and they all look alike for the
most part, so all we have is phylogeny, and we do a lot of it.

It looks like TreeBase is more about storing data produced by others,
rather than building new trees or helping researchers put together a
new data set. HIV Database is rather the opposite, but I've long
thought it would be very nice if we provided trees and NEXUS files.
Maybe it would make sense to store them in TreeBase with a link to the
TreeBase entry from our database.

Brian

> Dear Brian,
>
> I actually sent my query on behalf of someone else, so I can't vouch
> for how or why he did things the way he did them. I know that he has
> Smith-Waterman distances between all pairs of proteins in the set,
> but that he doesn't actually have one multiple sequence alignment for
> the whole set. My understanding is that the proteins are very, very
> divergent in some cases, so I doubt trying to align them would make
> any sense at all (and, perhaps, neither would using the SW distances
> as a metric on which to base a tree, but that's his business).
>
> Best wishes,
>
> Rutger
>

--
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading
RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com