Dear all,
One of the long-standing gaps/bugs in the RDKit's handling of
stereochemistry has been what I call "dependent stereochemistry":
atoms or bonds that are stereogenic only because some of their
neighbors are stereogenic.
A very simple, and well-known, example is the molecule defined by the SMILES:
C[C@H]1[C@@H](F)CCC[C@H]1F
Carbon 1 (numbering from zero) here is a chiral center (absolute
stereochemistry S, or s, depending on which notation you use) because
its two neighbors are chiral centers with different chirality (one is
R, the other S).
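To make the idea concrete, here is a toy sketch in plain Python. It is
not the RDKit implementation; the function name, the substituent labels,
and the choice of which ring branch is R and which is S are all
assumptions made for illustration. It only shows that carbon 1 looks
non-stereogenic when its neighbors are compared by constitution alone,
and becomes stereogenic once the neighbors' own stereo labels are
included in the comparison.

```python
def is_stereogenic(substituents, use_stereo):
    """Return True if all substituents of a center are distinguishable.

    substituents: list of (constitution, stereo_label) tuples, one per
    neighbor. The labels are hypothetical, hand-assigned values.
    use_stereo: if False, compare by constitution only (first pass);
    if True, also compare the neighbors' stereo labels (second pass).
    """
    if use_stereo:
        keys = [(const, label) for const, label in substituents]
    else:
        keys = [(const,) for const, _label in substituents]
    # The center is stereogenic only if no two substituents tie.
    return len(set(keys)) == len(keys)


# Carbon 1 of C[C@H]1[C@@H](F)CCC[C@H]1F: an H, a methyl, and two ring
# branches that are constitutionally identical but of opposite chirality.
# Which branch is R and which is S is assigned arbitrarily here.
carbon_1 = [
    ("H", None),
    ("CH3", None),
    ("CHF-ring", "R"),
    ("CHF-ring", "S"),
]

first_pass = is_stereogenic(carbon_1, use_stereo=False)
second_pass = is_stereogenic(carbon_1, use_stereo=True)
print("constitution only:", first_pass)
print("with neighbor stereo:", second_pass)
```

The second pass can only be run after the neighbors have been labeled,
which is why the perception has to be iterative.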
Another example, this time with double bonds:
Cl\C=C(/C=C/F)/C=C\F
The second and third double bonds are E and Z, respectively. The first
bond is Z, but only because of the stereochemistry of the other two
bonds.
You can further elaborate this to double bonds that are stereogenic
because of the chirality of attached atoms:
C\C=C([C@@](C)(F)Br)/[C@@](Br)(F)C
or atoms that are chiral because of the stereochemistry of attached bonds:
C[C@](/C=C/C)(F)/C=C\C
I'm fairly sure this can be elaborated more or less arbitrarily. It's
enough to make a cheminformatician's heart go pitter-pat.
Handling these cases in the RDKit required a restructuring of the
stereochemistry perception code (which made that "pitter-pat" feel
more like palpitations) and some changes to the client-visible
interface. Specifically, the former division between perceiving atom
chirality and double bond stereochemistry no longer makes sense,
since each can depend on the other.
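For anyone who wants to try the examples above, here is a minimal
sketch of perceiving atom and bond stereochemistry in one call. It
assumes the Python API of current RDKit releases
(Chem.AssignStereochemistry, the "_CIPCode" atom property, and
Bond.GetStereo); the interface on the branch may differ. The helper
name perceive_stereo is hypothetical, and it returns None if the RDKit
is not importable.

```python
EXAMPLES = [
    "C[C@H]1[C@@H](F)CCC[C@H]1F",
    r"Cl\C=C(/C=C/F)/C=C\F",
    r"C\C=C([C@@](C)(F)Br)/[C@@](Br)(F)C",
    r"C[C@](/C=C/C)(F)/C=C\C",
]


def perceive_stereo(smiles):
    """Return perceived atom and bond stereo labels for a SMILES string.

    Returns None when the RDKit cannot be imported.
    """
    try:
        from rdkit import Chem
    except ImportError:
        return None
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %s" % smiles)
    # One call handles atoms and bonds together; force=True recomputes
    # labels even if they were already assigned during parsing.
    Chem.AssignStereochemistry(mol, cleanIt=True, force=True)
    atoms = {
        atom.GetIdx(): atom.GetProp("_CIPCode")
        for atom in mol.GetAtoms()
        if atom.HasProp("_CIPCode")
    }
    bonds = {
        bond.GetIdx(): str(bond.GetStereo())
        for bond in mol.GetBonds()
        if bond.GetStereo() != Chem.BondStereo.STEREONONE
    }
    return {"atoms": atoms, "bonds": bonds}


if __name__ == "__main__":
    for smi in EXAMPLES:
        print(smi, perceive_stereo(smi))
```

The labels printed depend on the RDKit version in use, so compare them
against the assignments described above rather than assuming they match.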
Since this is a pretty deep and complex change, I created a separate
branch for the work:
http://rdkit.svn.sourceforge.net/viewvc/rdkit/branches/IterativeChirality_20Aug2008/
I believe the implementation that's currently checked in there is
correct. I added test cases for each of the scenarios I could think of
and those all pass. I still need to do a bit of optimization work, but
that should not affect the results.
Before merging this into the core, I'd like to ask anyone who has time
and interest to try it out and let me know if you find problems. When
testing, please keep in mind that not all cheminformatics systems
handle these cases correctly. I have checked Marvin, Open Babel
(v2.2.0), and ChemDraw, and only ChemDraw gets all of the test cases
right.
Best regards,
-greg