Re: [Rdkit-discuss] Distance geometry / Graph theory Functions
Open-Source Cheminformatics and Machine Learning
Brought to you by:
glandrum
From: Greg L. <gre...@gm...> - 2008-04-14 05:00:53
|
Markus, Here's a somewhat longer and more detailed answer to your question about directionality of feature points. First a couple of remarks about features themselves (mainly repeating section 7 of the "Getting Started with the RDKit in Python" document: The class for pharmacophore features is MolChemicalFeature (found in the module Chem.rdMolChemicalFeatures). These features are normally built using a MolChemicalFeatureFactory (same module). A feature factory uses a series of SMARTS-based definitions to define chemical features. There's a sample file in $RDBASE/Data/BaseFeatures.fdef and the format of the files is defined in $RDBASE/Docs/Programs/RDPharm3D/FDefFile.htm Here's some sample code for finding chemical features: >>> import Chem >>> from Chem import ChemicalFeatures >>> import RDConfig >>> import os >>> fdefName = os.path.join(RDConfig.RDDataDir,'Base.fdef') >>> factory = ChemicalFeatures.BuildFeatureFactory(fdefName) >>> m = Chem.MolFromMolFile('test.mol') >>> feats = factory.GetFeaturesForMol(m) >>> feats[0].GetFamily() 'Donor' >>> feats[0].GetType() 'SingleAtomDonor' >>> feats[0].GetPos() <Geometry.rdGeometry.Point3D object at 0x00B33190> You see that the positions of the features are defined by Point3D objects. These are what we can use to calculate feature directions (a.k.a. Projected Pharmacophore Points). > I wonder how to do that. I read the .cpp files (not very funny for a > scripting-newbie :-) ) and found a lot of functions for related problems. Yes, I can imagine that wasn't much fun at all. You're better off reading the .h files or looking through the api documentation for the c++, but that's only marginally easier. > I guess that there are already some functions, that calculate the > cartesian coordinates for a point, that is defined by angles and > distances towards some other points. Indeed! The code to handle geometry is part of the Point3D object: [6]>>> help(Geometry.Point3D) Help on class Point3D in module Geometry.rdGeometry: class Point3D(Boost.Python.instance) | A class to represent a three-dimensional point | The x, y, and z coordinates can be read and written using either attributes | (i.e. pt.x = 4) or indexing (i.e. pt[0] = 4). | | Method resolution order: | Point3D | Boost.Python.instance | __builtin__.object | | Methods defined here: | | AngleTo(...) | AngleTo( (Point3D)arg1, (Point3D)arg2) -> float : | determines the angle between a vector to this point (between 0 and PI) | | C++ signature : | double AngleTo(RDGeom::Point3D {lvalue},RDGeom::Point3D) | | CrossProduct(...) | CrossProduct( (Point3D)arg1, (Point3D)arg2) -> Point3D : | Get the cross product between two points | | C++ signature : | RDGeom::Point3D CrossProduct(RDGeom::Point3D {lvalue},RDGeom::Point3D) | | DirectionVector(...) | DirectionVector( (Point3D)arg1, (Point3D)arg2) -> Point3D : | return a normalized direction vector from this point to another | | C++ signature : | RDGeom::Point3D DirectionVector(RDGeom::Point3D {lvalue},RDGeom::Point3D) | | Distance(...) | Distance( (Point3D)arg1, (Point3D)arg2) -> float : | Distance from this point to another point | | C++ signature : | double Distance(RDGeom::Point3D,RDGeom::Point3D) | | DotProduct(...) | DotProduct( (Point3D)arg1, (Point3D)arg2) -> float : | Dot product with another point | | C++ signature : | double DotProduct(RDGeom::Point3D {lvalue},RDGeom::Point3D) | | Length(...) | Length( (Point3D)arg1) -> float : | Length of the vector | | C++ signature : | double Length(RDGeom::Point3D {lvalue}) | | LengthSq(...) | LengthSq( (Point3D)arg1) -> float : | Square of the length | | C++ signature : | double LengthSq(RDGeom::Point3D {lvalue}) | | Normalize(...) | Normalize( (Point3D)arg1) -> None : | Normalize the vector (using L2 norm) | | C++ signature : | void Normalize(RDGeom::Point3D {lvalue}) | | SignedAngleTo(...) | SignedAngleTo( (Point3D)arg1, (Point3D)arg2) -> float : | determines the signed angle between a vector to this point (between 0 and 2*PI) | | C++ signature : | double SignedAngleTo(RDGeom::Point3D {lvalue},RDGeom::Point3D) You also need to know how to get the vector between two Point3Ds (also a Point3D): [9]>>> p1 = Geometry.Point3D(1.0,2.0,4.0) [10]>>> p2 = Geometry.Point3D(-1.0,0,4.0) [11]>>> p2-p1 Out[11] <Geometry.rdGeometry.Point3D object at 0x844af7c> [12]>>> p3=p2-p1 [13]>>> list(p3) Out[13] [-2.0, -2.0, 0.0] > As to my understanding I could just for example take a C=O fragment, > feed the cartesians of the two atoms, define the angles between the C=O > bond and the free lone pairs of the oxygen plus some tetrahedral as well > as the optimal distance of a H-Bond and thus get the coordinates of the > projected point. Sounds simple, but how to implement it using the RDKit > python Library? As I mentioned on Friday: there's already code in the RDKit for calculating feature directions that can help you get started here. It's in $RDBASE/Python/Chem/Features/FeatDirUtilsRD.py. One thing to know about this file is that it's a quick port of some work I did for feature directions using another cheminformatics system, so it's not the prettiest code. Apologies in advance for that. If you have PyMol installed on your machine (highly recommended, it's a great visualizer) and start it using the -R argument (this causes PyMol to run an xml-rpc server so that it can be easily controlled from other software), you can use $RDBASE/Python/Chem.Features/ShowFeats.py to display the features and their directions on a molecule: % python $RDBASE/Python/Chem/Features/ShowFeats.py --fdef=$RDBASE/Data/BaseFeatures.fdef test.mol I hope this is enough to get you started. Feel free to post with questions. -greg |