[Rdkit-devel] "Iterative" stereochemistry
From: Greg L. <gre...@gm...> - 2008-08-25 13:21:05
Dear all,

One of the long-standing gaps/bugs in the RDKit's handling of stereochemistry has been what I call "dependent stereochemistry": atoms or bonds that are stereogenic because some of their neighbors are stereogenic.

A very simple, and well-known, example is the molecule defined by the SMILES:

    C[C@H]1[C@@H](F)CCC[C@H]1F

Carbon 1 (numbering from zero) here is a chiral center (absolute stereochemistry S, or s, depending on which notation you use) because two of its neighbors are chiral centers with different chirality (one is R, the other S).

Another example, this time with double bonds:

    Cl\C=C(/C=C/F)/C=C\F

The second and third double bonds are E and Z, respectively. The first bond is Z, but only because of the stereochemistry of the other two bonds.

You can further elaborate this to double bonds that are stereogenic because of the chirality of attached atoms:

    C\C=C([C@@](C)(F)Br)/[C@@](Br)(F)C

or atoms that are chiral because of the stereochemistry of attached bonds:

    C[C@](/C=C/C)(F)/C=C\C

I'm fairly sure this can be elaborated more or less arbitrarily. It's enough to make a cheminformatician's heart go pitter-pat.

Handling these cases in the RDKit required a restructuring of the stereochemistry perception code (which made that "pitter-pat" feel more like palpitations) and some changes to the client-visible interface. Specifically, the former division between perceiving atom chirality and perceiving double bond stereochemistry no longer makes sense.

Since this is a pretty deep and complex change, I created a separate branch for the work:

    http://rdkit.svn.sourceforge.net/viewvc/rdkit/branches/IterativeChirality_20Aug2008/

I believe the implementation that's currently checked in there is correct. I added test cases for each of the scenarios I could think of, and those all pass. I still need to do a bit of optimization work, but that should not affect the results.
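To make the "dependent" idea concrete, here is a toy model in plain Python of why perception has to run in more than one pass for the ring example above. It is only a sketch and is not the code in the branch: the function name, the hand-written substituent descriptions, and the data layout are all invented for illustration, and the labels are supplied by hand rather than computed. In particular, which of the two ring centers is R and which is S is an assumption here; all that matters for the model is that they differ.

```python
def perceive_iteratively(candidates, written_labels):
    """Toy fixed-point search for stereogenic centers (hypothetical helper).

    candidates: {center: [(constitution, depends_on), ...]} where
        depends_on names another candidate whose label, once known,
        distinguishes the substituent (or None if there is no dependence).
    written_labels: {center: label} supplied by hand, not computed.
    Returns {center: (label, pass_number)} for every center found stereogenic.
    """
    resolved = {}
    pass_number = 0
    while True:
        pass_number += 1
        newly_resolved = {}
        for center, substituents in candidates.items():
            if center in resolved:
                continue
            keys = []
            for constitution, depends_on in substituents:
                # Only labels found in EARLIER passes can refine a substituent.
                label = resolved[depends_on][0] if depends_on in resolved else None
                keys.append((constitution, label))
            # A center counts as stereogenic once all substituents are distinct.
            if len(set(keys)) == len(keys):
                newly_resolved[center] = (written_labels[center], pass_number)
        if not newly_resolved:
            return resolved  # nothing changed in this pass, so we are done
        resolved.update(newly_resolved)


# Hand-written description of C[C@H]1[C@@H](F)CCC[C@H]1F (atoms numbered from zero).
candidates = {
    # The two ring arms of C1 are constitutionally identical.
    "C1": [("CH3", None), ("H", None),
           ("CHF ring arm", "C2"), ("CHF ring arm", "C6")],
    "C2": [("F", None), ("H", None),
           ("CH(CH3) ring arm", None), ("CH2 ring arm", None)],
    "C6": [("F", None), ("H", None),
           ("CH(CH3) ring arm", None), ("CH2 ring arm", None)],
}
# Assumed labels: the C2/C6 assignment is illustrative; only "different" matters.
written_labels = {"C1": "s", "C2": "S", "C6": "R"}

result = perceive_iteratively(candidates, written_labels)
for center in sorted(result):
    label, found_in_pass = result[center]
    print(center, label, "found in pass", found_in_pass)
```

In this model C2 and C6 are found in the first pass and C1 only in the second, once its two ring arms can be told apart by the labels of C2 and C6. If C2 and C6 are given the same label, the model never marks C1 as stereogenic.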
Before merging this into the core, I'd like to ask anyone who has time and interest to try it out and let me know if you find problems.

When testing, please keep in mind that not all cheminformatics systems handle these cases correctly. I have checked Marvin, openbabel (v2.2.0), and ChemDraw, and only ChemDraw gets all of the test cases right.

Best regards,
-greg