Re: [Rdkit-discuss] Align SDF to user-supplied template coordinates (2D)
From: Greg L. <gre...@gm...> - 2010-08-16 12:11:01
Hi James,

On Mon, Aug 16, 2010 at 1:09 PM, James Davidson <J.D...@ve...> wrote:
>
> I am currently struggling with something that I expect is very easy to solve
> (I have just got back from holiday, so I think my brain isn't quite in the
> zone!)

yes, I know that feeling well.

> I am trying to read in an SDF and align each molecule to a template scaffold
> provided in molfile format. I want to supply a tool that allows a user to
> sketch in a template and view their SDF entries in 2D all aligned (where
> there is a match) to the supplied template.
>
> I have essentially followed this entry in the Chemistry Toolkit Rosetta -
> http://ctr.wikia.com/wiki/Align_the_depiction_using_a_fixed_substructure,
> which in essence is pretty-much the same as the info in the RDKit
> documentation.
>
> However, when I am using pre-supplied 2D coordinates for the template,
> rather than generating them from the first substructure match (as in the CTR
> example), I find that the alignment proceeds as required, but there is a
> mis-match between the scaling of the bond-lengths in the aligned
> substructure compared with the rest of the molecule...

Ah yes, the depictions that you get look rather silly, no?

> Is there a way to 'scale' the molecules according to the template mol (or
> alternatively scale the template according to the RDKit default)? Or is it
> that I am tackling this in the wrong way?

You're doing it correctly; no worries there.

The problem is that most pieces of chemical drawing software generate 2D
coordinates for molecules such that a C-C single bond is 1.0A long. The RDKit,
on the other hand, sets the C-C single bond to be 1.5A long. The consequence
is a depiction with a core that's smaller than it should be.
It's straightforward to solve the problem:

#-----------------------
from rdkit import Chem
from rdkit import Geometry
from rdkit.Chem import AllChem
import numpy

core = Chem.MolFromMolFile('core.mol')

# first scale the core so that a single bond is 1.5A:
center = AllChem.ComputeCentroid(core.GetConformer())
tf = numpy.identity(4, float)  # numpy.float was removed from recent NumPy releases
tf[0][3] -= center[0]
tf[1][3] -= center[1]
tf[0][0] = tf[1][1] = tf[2][2] = 1.5
AllChem.TransformMol(core, tf)

m = Chem.MolFromSmiles('c1cccc2c1nc(CC)cc2C(=O)O')

# collect the scaled core coordinates as 2D points
coords = [core.GetConformer().GetAtomPosition(x) for x in range(core.GetNumAtoms())]
coords2D = [Geometry.Point2D(pt.x, pt.y) for pt in coords]

# map each matched atom of m onto the corresponding core coordinate
cd = {}
match = m.GetSubstructMatch(core)
for i, coord in enumerate(coords2D):
    cd[match[i]] = coord
AllChem.Compute2DCoords(m, coordMap=cd)
#-----------------------

It should be possible to specify the scale used in the RDKit depictions so
that these contortions are not necessary. I will put a feature request in for
this and get it in the next version.

-greg
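
The fixed factor of 1.5 only gives 1.5A bonds when the template was drawn with 1.0A bonds. For templates from other sources, one option is to measure the template's mean bond length and derive the factor from it. What follows is a minimal sketch of that arithmetic in plain Python, with no RDKit calls; the function names and the square "template" are made up for illustration, and the 1.5 target is simply the value quoted in the message above.

```python
import math

RDKIT_BOND_LENGTH = 1.5  # target C-C length quoted in the message (assumption)


def mean_bond_length(coords, bonds):
    """Average distance between bonded atoms; coords are (x, y) tuples."""
    lengths = [math.dist(coords[i], coords[j]) for i, j in bonds]
    return sum(lengths) / len(lengths)


def scale_about_centroid(coords, factor):
    """Scale 2D points about their centroid, so the centroid does not move."""
    cx = sum(x for x, _ in coords) / len(coords)
    cy = sum(y for _, y in coords) / len(coords)
    return [(cx + factor * (x - cx), cy + factor * (y - cy)) for x, y in coords]


def rescale_template(coords, bonds, target=RDKIT_BOND_LENGTH):
    """Return (factor, new_coords) such that the mean bond length equals target."""
    factor = target / mean_bond_length(coords, bonds)
    return factor, scale_about_centroid(coords, factor)


# Hypothetical template: a four-membered "ring" drawn with 1.0-long bonds.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
square_bonds = [(0, 1), (1, 2), (2, 3), (3, 0)]
factor, scaled = rescale_template(square, square_bonds)
```

In the RDKit version, the same factor could presumably replace the hard-coded 1.5 on the diagonal of the transform, with the bond list taken from the core molecule's bonds.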