[PyMOL] Secondary structure for C-alpha only and coarse-grained models

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Hey :)

I thought there was a post recently about showing secondary structure in
C-alpha only models. I can't find it anymore, but in any case it's
something popping up now and again. For reconstruction of high-resolution
models out of coarse-grained ones, I came up with a very simple algorithm
for rebuilding the backbone based on C-alpha atoms only. I realized that
that could work for determining secondary structure, and made a function
out of it. It's not perfect, as sometimes a residue gets a flipped
backbone, but overall it seems to work pretty nicely. The following can be
put in a script (bbpredict.py):

from pymol import cmd
import math

def oprod(a,b):  return
a[1]*b[2]-a[2]*b[1],a[2]*b[0]-a[0]*b[2],a[0]*b[1]-a[1]*b[0]
def vsub(a,b):   return [i-j for i,j in zip(a,b)]
def cest(u,v,w): return [i+0.37*j+0.45*k for i,j,k in zip(u,v,w)]
def oest(u,v,w): return [i+0.37*j+1.75*k for i,j,k in zip(u,v,w)]
def hest(u,v,w): return [i-0.37*j-1.45*k for i,j,k in zip(u,v,w)]
def nest(u,v,w): return [i-0.37*j-0.45*k for i,j,k in zip(u,v,w)]
def norm(a):     return math.sqrt(sum([i**2 for i in a]))
def vdiv(a,b):   return [i/b for i in a]

def bbpredict(selection,name="bbpred"):
    m=cmd.get_model(selection+' and n. ca')
    x=m.get_coord_list()
    d=[(vsub(j,i),vsub(k,i)) for i,j,k in zip(x,x[1:],x[2:])]
    o=[oprod(*i) for i in d]
    no=[vdiv(i,norm(i)) for i in o]

    CA = [("ATOM  %5d  CA  %3s %1s%4d    "+("%8.3f%8.3f%8.3f"%tuple(i)))
for i in x]
    N  = [("ATOM  %5d  N   %3s %1s%4d
 "+("%8.3f%8.3f%8.3f"%tuple(nest(i,j[0],k)))) for i,j,k in zip(x[1:],d,no)]
    H  = [("ATOM  %5d  H   %3s %1s%4d
 "+("%8.3f%8.3f%8.3f"%tuple(hest(i,j[0],k)))) for i,j,k in zip(x[1:],d,no)]
    C  = [("ATOM  %5d  C   %3s %1s%4d
 "+("%8.3f%8.3f%8.3f"%tuple(cest(i,j[0],k)))) for i,j,k in zip(x,d,no)]
    O  = [("ATOM  %5d  O   %3s %1s%4d
 "+("%8.3f%8.3f%8.3f"%tuple(oest(i,j[0],k)))) for i,j,k in zip(x,d,no)]

    j     = m.atom[0]
    bb    = [CA[0]%(3,j.resn,j.chain,int(j.resi))]
    bb.append(C[0]%(4,j.resn,j.chain,int(j.resi)))
    bb.append(O[0]%(5,j.resn,j.chain,int(j.resi)))
    stuff = zip(m.atom[1:],zip(N,H,CA[1:],C[1:],O[1:]))
    bb.extend([ i%(num,j.resn,j.chain,int(j.resi)) for j,k in stuff for
i,num in zip(k,range(5)) ])

    cmd.read_pdbstr("\n".join(bb),name)

#### END

Then it can be used e.g. as follows (yes, a full atomistic structure for
demonstration):

run bbpredict.py
fetch 6lzm
bbpredict('6lzm','pred')
dss pred

I'll see if I can get the flips out and tidy the code up a bit. Then I'll
add this to the wiki. I hope it'll be of some use to anyone. The routine is
explained in a manuscript we just submitted.

Cheers,

Tsjerk

-- 
Tsjerk A. Wassenaar, Ph.D.

[PyMOL] Secondary structure for C-alpha only and coarse-grained models

PyMOL is an OpenGL based molecular visualization system

[PyMOL] Secondary structure for C-alpha only and coarse-grained models