[Rdkit-discuss] Isomeric smiles and explicit hydrogens
From: Noel O'B. <bao...@gm...> - 2008-04-14 10:50:25
I've been trying to get my head around what happens when I read and write isomeric SMILES. As a user, I hope that the same molecule will always have the same isomeric SMILES. However, look at the following examples using cinfony, which read a SMILES string and write an isomeric SMILES string.

I'm trying to specify the chirality of the carbon in chlorobromomethane, but RDKit is not picking up on the chirality:

>>> rdk.readstring("smi", "[C](Cl)Br").write("iso")
'ClCBr'
(No chirality, as expected)

>>> rdk.readstring("smi", "[C@@H](Cl)Br").write("iso")
'Cl[CH]Br'
>>> rdk.readstring("smi", "[C@](Cl)Br").write("iso")
'ClCBr'
>>> rdk.readstring("smi", "Cl[C@]Br").write("iso")
'ClCBr'
>>> rdk.readstring("smi", "Cl[C@@H]Br").write("iso")
'Cl[CH]Br'
(Expected chirality, but didn't get it)

Let's try 1-chloro-1-bromoethane:

>>> rdk.readstring("smi", "Cl[C@@](Br)C").write("iso")
'CC(Cl)Br'
(Expected chirality, but didn't get it)

>>> rdk.readstring("smi", "Cl[C@@H](Br)C").write("iso")
'C[C@@H](Cl)Br'
(Expected chirality, and got it)

Is the problem with me or with RDKit?

On a related note, I have found that RDKit, when reading SDF files, turns all of the hydrogens into implicit hydrogens. However, when reading SMILES strings, it retains any explicit hydrogens specified in C@@H expressions. This doesn't seem to be consistent, and it requires the user to remove hydrogens if he/she wants to create a canonical SMILES string.

Apologies in advance if my understanding of SMILES is shaky.

Regards,
Noel
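
For anyone who wants to repeat the examples above without cinfony, here is a minimal sketch that calls RDKit directly. It assumes a current RDKit install in which `Chem.MolFromSmiles` and `Chem.MolToSmiles(..., isomericSmiles=True)` are available. The helper name `iso_smiles` is made up for illustration. The strings printed by a current release may differ from the 2008 output quoted above, so no expected output is shown.

```python
from rdkit import Chem


def iso_smiles(smiles):
    """Parse a SMILES string and write it back as isomeric SMILES.

    Hypothetical helper for illustration, not part of RDKit or cinfony.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"RDKit could not parse {smiles!r}")
    return Chem.MolToSmiles(mol, isomericSmiles=True)


# The inputs tried in the message above.
INPUTS = [
    "[C](Cl)Br",
    "[C@@H](Cl)Br",
    "[C@](Cl)Br",
    "Cl[C@]Br",
    "Cl[C@@H]Br",
    "Cl[C@@](Br)C",
    "Cl[C@@H](Br)C",
]

for smi in INPUTS:
    try:
        out = iso_smiles(smi)
    except ValueError as err:
        print(f"{smi:>15} -> {err}")
        continue
    # Writing the output back through the same path shows whether the
    # isomeric SMILES is stable under a second read/write cycle.
    stable = iso_smiles(out) == out
    print(f"{smi:>15} -> {out:<15} round-trip stable: {stable}")

# In "Cl[C@@H](Br)C" the H inside the brackets is a hydrogen count on the
# carbon, not a separate atom in the molecular graph.
mol = Chem.MolFromSmiles("Cl[C@@H](Br)C")
print("graph atoms:", mol.GetNumAtoms())
print("H count on each atom:", [a.GetTotalNumHs() for a in mol.GetAtoms()])
```

Comparing `iso_smiles("Cl[C@@H](Br)C")` with `iso_smiles("C[C@@H](Cl)Br")` is a quick way to check the hope stated at the top of the message: two spellings of the same enantiomer should come back as the same string, while `"Cl[C@H](Br)C"` should not.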