From: andrea s. <and...@gm...> - 2005-10-25 08:08:26
Hi all,

is it normal for a script in PyMOL to run very, very slowly? In my case I am just selecting a residue of a protein and calculating its distance from the ligand atoms. I am doing this in a loop over 100 molecules. It took ca. 6 hours! My memory is 50% free (1 GB total). I am running the script from the shell, without the GUI.

Thanks in advance.

Regards,
andrea
From: andrea s. <and...@gm...> - 2005-10-25 09:17:33
Jules Jacobsen wrote:
> Hi Andrea,
>
> It depends on how you've written it.... It does sound more than a
> little slow if you are just selecting a single residue. I had a script
> which ended up looping through several sets of atoms which took ages
> for larger models. I finally got fed-up with this and re-wrote it
> properly. Now it's practically instantaneous. If you post the script
> I'm sure someone will be able to offer you a pointer as to where you
> might be going wrong.
>
> Jules

Hi Jules,

here it is:

-------------------------------------------------
from pymol import cmd
import string, sys, os

if ("dist.out"):
    os.system("rm -f dist.out")
if ("full.out"):
    os.system("rm -f full.out")

nam = open("file.nam",'r')
KeepStruc = []
readnam = nam.readlines()
deep = int(15)
for j in range(0,deep):
    readnam[j] = string.strip(readnam[j])
    KeepStruc.append(readnam[j])

for i in KeepStruc:
    cmd.load(i,i)
    print i
    cmd.select("lig","segi B")
    cmd.select("pro","all and not segi B")
    amb = open("amb",'r')
    out = open("dist.out","a")
    full = open("full.out","a")
    out.write("%4s\n"%(i))
    full.write("%4s\n"%(i))
    count = 0
    for lines in amb.readlines():
        lines = string.strip(lines)
        tmp = string.split(lines,",")
        resname = tmp[0]
        atomtypes = tmp[1]
        tmp1 =string.split(atomtypes,":")
        atomlig = tmp1[0]
        atompro = tmp1[1]
        p = cmd.select("p","resi "+resname+" and name "+atompro+" and "+i)
        if (atomlig == "*"):
            ligand = cmd.select("ligand","segi B and "+i)
            atoms_lig = cmd.get_model('ligand')
            for atomsL in atoms_lig.atom:
                t = cmd.select("t","name "+atomsL.name)
                di = cmd.distance("di","t","p")
                full.write("%4s%8s%8s%8.3f%12s\n"%(resname,atompro,atomsL.name,di,"DISTANCE"))
                if (di < 4):
                    count = count + 1
                    # print resname,atompro,count
            if (count == 0):
                out.write("%4s%8s%8s%12s\n"%(resname,atompro,"*",">VIOLATED"))
            count = 0
        else:
            l = cmd.select("l","name "+atomlig+" and segi B and "+i)
            d = cmd.distance("dist_"+str(i),"p","l")
            full.write("%4s%8s%8s%8.3f%12s\n"%(resname,atompro,atomlig,d,"DISTANCE"))
            if (d > 5):
                out.write("%4s%8s%8s%8.3f%10s\n"%(resname,atompro,atomlig,d,"VIOLATED"))
    out.close()
--------------------------------------------------

At the beginning I thought that the problem was memory, because when I run it, it starts fast and then, after 4 structures, it gets slower and slower with every molecule. But my free memory always stays at 50% and the CPU at 100%. I tried to empty the list in the script at the end of each iteration, but it didn't work.

thanks
andrea
From: andrea s. <and...@gm...> - 2005-10-25 09:42:16
Hi again,

reading my post I found the bottleneck... shame on me :)

Here it is:

.......
        if (atomlig == "*"):
            ligand = cmd.select("ligand","segi B and "+i)
            atoms_lig = cmd.get_model('ligand')
            for atomsL in atoms_lig.atom:
                t = cmd.select("t","name "+atomsL.name)
                di = cmd.distance("di","t","p")
.........

I am running the same loop twice. Removing the "for atomsL in atoms_lig.atom" loop makes it go faster. The problem is that now I am getting the "di" value as an average over all atoms of the ligand. How can I get the individual values, in order to find the minimum distance between the ligand and the selected residue?

Thanks in advance

andrea
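One way to get the individual values is to measure in plain Python from the atom coordinates rather than creating a selection and a distance object per ligand atom. The sketch below is only an illustration under stated assumptions: the function name closest_atom and the sample coordinates are hypothetical and do not come from the script above.

```python
import math

def closest_atom(ligand_atoms, target_coord):
    """Return (name, distance) of the ligand atom nearest to target_coord.

    ligand_atoms: iterable of (atom_name, (x, y, z)) pairs.
    target_coord: (x, y, z) of the selected protein atom.
    """
    best_name, best_dist = None, float("inf")
    for name, coord in ligand_atoms:
        # Plain Euclidean distance between the two points.
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(coord, target_coord)))
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name, best_dist

# Hypothetical coordinates standing in for the atoms of a real ligand.
ligand = [("C1", (0.0, 0.0, 0.0)),
          ("O2", (3.0, 4.0, 0.0)),
          ("N3", (1.0, 0.0, 0.0))]
name, dist = closest_atom(ligand, (1.0, 1.0, 0.0))
print(name, round(dist, 3))
```

Inside PyMOL the pairs could be built from the model the script already fetches, for example [(a.name, a.coord) for a in cmd.get_model("ligand").atom], with the target taken from cmd.get_model("p").atom[0].coord. Each per-atom distance is then available for the 4 Angstrom check, and the minimum comes out of the same pass.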
From: Gilleain T. <gil...@gm...> - 2005-10-25 11:04:19
Hi,

I don't see how you can get rid of this loop actually.

The thing that occurs to me is that you are loading structures, but not deleting them!

So, at the other end of the "for i in KeepStruc" loop, you should have a "cmd.delete(i)".

Other than that, you don't need to be opening the files "out" or "full" inside the loop, or re-reading the data from "amb".

Hth.

gilleain torrance

On 25 Oct 2005, at 10:45, andrea spitaleri wrote:
> Hi again, reading my post I found the bottleneck....shame on me :)
>
> here it is:
> .......
> if (atomlig == "*"):
> ligand = cmd.select("ligand","segi B and "+i)
> atoms_lig = cmd.get_model('ligand')
> for atomsL in atoms_lig.atom:
> t = cmd.select("t","name "+atomsL.name)
> di = cmd.distance("di","t","p")
> .........
> I am running twice the same loop. removing "for atomsL in
> atoms_lig.atom" loop it goes faster. The problem is that now I am
> getting the "di" value as average over all atoms of the ligand. How
> I can get the different values in order to find the minimum
> distance between the ligand and the selected residue ??
>
> Thanks in advance
>
> andrea
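The restructuring suggested in this reply (read "amb" once, open the output once, delete each structure when done with it) could look roughly like the sketch below. It is a hedged outline, not the script's actual code: parse_constraints and process_structures are hypothetical names, the 'resi,lig_atom:pro_atom' line format is inferred from how the posted script splits each line, and the PyMOL calls are passed in as callables so the outline runs without PyMOL.

```python
def parse_constraints(lines):
    """Parse lines of the form 'resi,lig_atom:pro_atom' into tuples."""
    constraints = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        resi, atom_types = line.split(",")
        lig_atom, pro_atom = atom_types.split(":")
        constraints.append((resi, lig_atom, pro_atom))
    return constraints

def process_structures(names, constraints, load, measure, delete, write):
    """Load, measure, and always delete each structure, one at a time."""
    for name in names:
        load(name)  # in PyMOL this could be: cmd.load(name, name)
        try:
            for constraint in constraints:
                write(name, constraint, measure(name, constraint))
        finally:
            # Free the object even if a measurement fails,
            # e.g. cmd.delete(name) in PyMOL.
            delete(name)

# Stand-in backend (hypothetical) so the control flow can be exercised.
loaded, results = [], []
process_structures(
    ["a.pdb", "b.pdb"],
    parse_constraints(["45,*:CA", "77,O1:NZ"]),  # parsed once, up front
    load=loaded.append,
    measure=lambda name, constraint: 0.0,
    delete=loaded.remove,
    write=lambda name, constraint, d: results.append((name, constraint, d)),
)
print(len(results), loaded)
```

The try/finally is a design choice rather than something the thread asks for: it keeps at most one structure loaded at a time even when a measurement raises, so the amount of data each selection has to search does not grow from one iteration to the next.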
From: andrea s. <and...@gm...> - 2005-10-25 12:49:47
Gilleain Torrance wrote:
> Hi,
>
> I don't see how you can get rid of this loop actually.
>
> The thing that occurs to me is that you are loading structures, but
> not deleting them!
>
> So, at the other end of the "for i in KeepStruc" loop, you should
> have a "cmd.delete(i)".
>
> Other than that, you don't need to be opening the files "out" or
> "full" inside the loop, or re-reading the data from "amb".
>
> Hth.
>
> gilleain torrance

cmd.delete() did the job very well (now it takes a few minutes rather than 7 hours). The other suggestions did help, but not by much.

thanks a lot
andrea