Re: [Rdkit-discuss] sanitization removes Hs - is this expected?
From: Greg L. <gre...@gm...> - 2014-02-25 04:24:19
Hi Michal,

On Mon, Feb 24, 2014 at 4:48 PM, Michal Krompiec <mic...@gm...> wrote:

> Hello,
> I have just noticed this:
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]"))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False))
> '[H]c1sc([H])c([H])c1[H]'
> >>> Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]"))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles("[H]c1cscc1[H]",sanitize=False))
> '[H]c1cscc1[H]'
>
> Is it the expected behaviour? Why does sanitization remove hydrogens? Is it
> controlled by any of the SanitizeFlags?

It is the expected behavior. When sanitization is turned on, the SMILES parser
actually calls "RemoveHs"; this removes the hydrogens from the graph and then
sanitizes the molecule.

If you do not want the Hs removed, you can tell MolFromSmiles to skip the
sanitization (which also skips the RemoveHs) and then sanitize yourself::

    In [3]: m=Chem.MolFromSmiles("[H]c1c([H])sc([H])c1[H]",sanitize=False)

    In [4]: Chem.SanitizeMol(m)
    Out[4]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

    In [5]: print Chem.MolToSmiles(m)
    [H]c1sc([H])c([H])c1[H]

I hope this helps,
-greg
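
For readers on a current Python 3 install, the same workaround can be written as a small script. This is only a sketch of the approach described in the reply above: the helper name `parse_keeping_hs` is made up for illustration and is not part of RDKit, and the script assumes the `rdkit` package is importable.

```python
from rdkit import Chem


def parse_keeping_hs(smiles):
    """Hypothetical helper (not an RDKit function): parse a SMILES string
    with sanitization turned off, which also skips the parser's RemoveHs
    step, and then sanitize the molecule manually."""
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    Chem.SanitizeMol(mol)  # sanitizes in place; explicit H atoms stay in the graph
    return mol


thiophene = "[H]c1c([H])sc([H])c1[H]"

# Default parsing: sanitization is on, so the explicit Hs are removed.
default_mol = Chem.MolFromSmiles(thiophene)

# Manual route: the Hs written in the SMILES remain as atoms.
kept_mol = parse_keeping_hs(thiophene)

print(default_mol.GetNumAtoms())   # 5 (heavy atoms only)
print(kept_mol.GetNumAtoms())      # 9 (5 heavy atoms + 4 explicit Hs)
print(Chem.MolToSmiles(kept_mol))
```

Comparing the atom counts is a quick way to check whether the hydrogens survived parsing without depending on the exact form of the output SMILES.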