I should say up front that this isn't a PyMOL-specific question, and I apologize for that, but I imagine several users on this mailing list have experience in sequence/structure analysis, which is why I am posting it here. I am using a database that stores motif sequences describing several proteins.

I use PyMOL to visualize how well these motifs coincide with functional residues (ligand-binding, protein coupling, oligomerization, etc.).

What types of statistical distributions would be appropriate for analyzing data from such investigations?

I can think of a simplistic case in which the binomial distribution could be used to calculate the probability that a given residue falls within a motif by chance alone, where

Prob = average motif length / average overall sequence length.
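
To make the simplistic case concrete, here is a minimal sketch in Python. The numbers are made up purely for illustration, and the function name binomial_tail is my own, not taken from any package. It treats each functional residue as an independent trial with success probability Prob, and asks how likely it is that at least k of n functional residues land inside a motif by chance.

```python
from math import comb

def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical values, for illustration only (not real data).
avg_motif_length = 15
avg_sequence_length = 300
p = avg_motif_length / avg_sequence_length  # chance a single residue lies in a motif

n_functional = 10  # assumed number of annotated functional residues
k_in_motif = 4     # assumed number of those found inside the motif

# Probability of seeing k_in_motif or more hits if residues were independent.
p_value = binomial_tail(k_in_motif, n_functional, p)
print(f"Prob = {p:.3f}, P(X >= {k_in_motif}) = {p_value:.4g}")
```

This is only as good as the independence assumption behind it, so the tail probability will look more significant than it should if functional residues tend to sit next to each other.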

This assumes, however, that the positions of ligand-binding residues are independent of one another, whereas in fact it is likely that functional residues occur in clusters. Is there any statistical methodology that would be appropriate and would take this into account?

Again, sorry for the slightly off-topic question and many thanks.

Spyros