Re: [Rdkit-discuss] sanitization removes Hs - is this expected?
From: Michal K. <mic...@gm...> - 2014-02-25 13:42:43
Thanks Greg, this is exactly what I wanted to know.
Would you consider adding an optional removeHs argument to MolFromSmiles(), as in mol/mol2/sdf parsers?

Best wishes,
Michal

On 25 February 2014 04:23, Greg Landrum <gre...@gm...> wrote:
> Hi Michal,
>
> On Mon, Feb 24, 2014 at 4:48 PM, Michal Krompiec <mic...@gm...>
> wrote:
>>
>> Hello, I have just noticed this:
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]"))
>> 'c1ccsc1'
>> >>>
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False))
>> '[H]c1sc([H])c([H])c1[H]'
>> >>>
>> >>> Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)))
>> 'c1ccsc1'
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]"))
>> 'c1ccsc1'
>> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]",sanitize=False))
>> '[H]c1cscc1[H]'
>>
>> Is it the expected behaviour? Why does sanitization remove hydrogens?
>>
>> Is it controlled by any of the SanitizeFlags?
>
>
> It is the expected behavior. When sanitization is turned on, the SMILES
> parser actually calls "RemoveHs"; this removes the hydrogens from the graph
> and then sanitizes the molecule.
>
> If you do not want the Hs removed, you can tell MolFromSmiles to skip the
> sanitization (which also skips the RemoveHs) and then sanitize yourself::
>
> In [3]: m=Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)
>
> In [4]: Chem.SanitizeMol(m)
> Out[4]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> In [5]: print Chem.MolToSmiles(m)
> [H]c1sc([H])c([H])c1[H]
>
> I hope this helps,
> -greg
>
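
For anyone reading this thread later, the workaround Greg describes can be written as a short script. This is only a sketch: it assumes a Python 3 environment with RDKit installed (the session quoted above used Python 2 print syntax), and the variable names are mine, not part of RDKit. It compares atom counts rather than SMILES strings, since the exact SMILES output may differ between RDKit versions.

```python
from rdkit import Chem

# Thiophene written with explicit hydrogens, as in the original question.
smiles = "[H]c1c([H])sc([H])c1[H]"

# Default parse: per the explanation above, the SMILES parser calls RemoveHs
# and sanitizes, so only the heavy atoms remain in the graph.
default_mol = Chem.MolFromSmiles(smiles)

# Skip sanitization (which also skips RemoveHs), then sanitize manually.
# SanitizeMol itself leaves the explicit H atoms in place.
kept_mol = Chem.MolFromSmiles(smiles, sanitize=False)
Chem.SanitizeMol(kept_mol)

print(default_mol.GetNumAtoms())   # heavy atoms only: 4 C + 1 S
print(kept_mol.GetNumAtoms())      # heavy atoms plus the 4 explicit Hs
print(Chem.MolToSmiles(kept_mol))  # hydrogens appear as [H] in the output
```

As far as I know, more recent RDKit releases also provide a SmilesParserParams object with a removeHs field that can be passed to MolFromSmiles, which covers the request made in the reply above; check the documentation for your installed version before relying on it.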