From: Jason Vertrees <jason.vertrees@sc...> - 2011-07-16 05:25:33

Hi Spyros,

> Is there any statistical
> methodology that would be appropriate and take this into account?

It's been a while since I've had to consider such distributions. Many folks like the following text for sequence-related problems:

Biological sequence analysis: probabilistic models of proteins and nucleic acids
By Richard Durbin
Cambridge University Press (1998), Paperback, 356 pages
ISBN 0521629713

Give it a shot & good luck!

Cheers,

-- Jason

On Tue, Jul 12, 2011 at 7:38 PM, Spyros Charonis <s.charonis@...> wrote:
> Dear PyMOLers,
>
> I should say that this isn't a PyMOL-specific question, for that I apologize
> but I imagine several users on this mailing list will have experience
> in sequence/structure analysis, that is the reason for posting it. I am
> using a database that stores motifs sequences describing several proteins.
> I use PyMOL to visualize how well these motifs coincide with functional
> residues (ligand-binding, protein coupling, oligomerization, etc)
>
> What types of statistical distributions would be appropriate for analyzing
> data from such investigations?
>
> I can think of a simplistic case where the binomial distribution could be
> used to calculate the probability that a certain residue will fall within a
> motif by chance alone where
>
> Prob = average motif length / average overall sequence length.
>
> This assumes however that position of ligand-binding residues are
> independent of one another, whereas in fact it is likely
> that functional residues form in clusters? Is there any statistical
> methodology that would be appropriate and take this into account?
>
> Again, sorry for the slightly off-topic question and many thanks.
>
> Spyros

--
Jason Vertrees, PhD
PyMOL Product Manager
Schrodinger, LLC

(e) Jason.Vertrees@...
(o) +1 (603) 3747120
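
For reference, the simplistic binomial case described in the quoted message can be sketched as below. This is only an illustration under the same independence assumption the message questions, so it does not account for clustering of functional residues. The function name binomial_tail and all of the numbers (motif length, sequence length, residue counts) are hypothetical and are not taken from any real dataset.

```python
from math import comb


def binomial_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical values, for illustration only.
avg_motif_length = 15
avg_sequence_length = 300

# Chance that a single residue falls inside a motif, as in the message:
# Prob = average motif length / average overall sequence length.
p = avg_motif_length / avg_sequence_length

n_functional = 10  # hypothetical number of annotated functional residues
k_in_motif = 3     # hypothetical number of those observed inside a motif

# Probability of seeing at least k_in_motif hits by chance alone,
# assuming residues are placed independently of one another.
p_value = binomial_tail(n_functional, k_in_motif, p)
print(f"p = {p:.3f}, P(X >= {k_in_motif}) = {p_value:.4f}")
```

A small tail probability would suggest more overlap than this null model expects, but because the model treats residues as independent, clustered functional sites would make it look more significant than it really is.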