Re: [Rdkit-discuss] newbie help cleaning up sterochemistry in SMILES string
From: Greg L. <gre...@gm...> - 2010-09-17 03:37:17
Dear Hari,

On Thu, Sep 16, 2010 at 8:44 PM, hari jayaram <ha...@gm...> wrote:
> I am working with several ligands from a database stored in SMILES
> format. I am using the SMILES strings to get three-dimensional
> coordinates (PDB format file) using a third-party program called
> libcheck.
>
> For some of these molecules the stereochemistry in the SMILES string
> was entered incorrectly in the database, so that the SMILES input to
> libcheck returns a mangled coordinate file with rings clashing with
> each other. Inputting the SMILES string without the stereochemistry
> makes libcheck behave correctly.
>
> Is there a way to use rdkit to clean up the stereochemistry in the
> SMILES string?

To be certain I understand: you would like to remove the stereochemistry from the SMILES string?

One way to do this is to read in the SMILES and then generate a new SMILES with isomericSmiles=False, which leaves out the stereochemistry information:

    [1]>>> from rdkit import Chem
    [2]>>> m = Chem.MolFromSmiles('Cl[C@H](F)Br')
    [3]>>> Chem.MolToSmiles(m, isomericSmiles=False)
    Out[3] 'FC(Cl)Br'

A potential problem with this is that it changes the atom ordering.

However, the simplest way to remove stereochemistry information from SMILES doesn't use the RDKit at all: you just remove the "@" characters from the string:

    [4]>>> smi = 'Cl[C@H](F)Br'
    [5]>>> smi.replace('@','')
    Out[5] 'Cl[CH](F)Br'

Hope this helps,
-greg
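For a whole database of ligands, the string-based approach can be wrapped in a small helper. The following is only a sketch, not part of the original reply: the function name `strip_stereo_markers` is hypothetical, and it rests on the assumption that the input uses only the common "@" / "@@" chirality tags. It also removes the "/" and "\" bond-direction symbols that encode double-bond stereochemistry, which the reply above does not mention. Because it never reparses the molecule, the atom ordering of the input string is preserved.

```python
import re

# Characters that carry stereochemistry in a SMILES string:
#   "@"        tetrahedral chirality tags ("@" and "@@")
#   "/" , "\"  directional bonds around a double bond
# Assumption: extended chirality classes such as "@TH1" or "@SP2" are not
# present; deleting only the "@" from those would leave an invalid atom.
_STEREO_MARKERS = re.compile(r"[@/\\]")


def strip_stereo_markers(smiles: str) -> str:
    """Return the SMILES string with stereo markers deleted (hypothetical helper)."""
    return _STEREO_MARKERS.sub("", smiles)


if __name__ == "__main__":
    for smi in ("Cl[C@H](F)Br", "N[C@@H](C)C(=O)O", "F/C=C/F"):
        print(smi, "->", strip_stereo_markers(smi))
```

It is worth passing the stripped strings through `Chem.MolFromSmiles` afterwards, since that call returns `None` for a SMILES it cannot parse and so catches any entry the simple substitution has damaged.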