Re: [Rdkit-discuss] nitro-compounds from smarts
From: Greg L. <gre...@gm...> - 2017-05-04 06:01:20
Hi Rafal,

The MolFromSmiles() code is intended to create a "normal" molecule, and as part of doing that some sanitization work is carried out (http://www.rdkit.org/docs/RDKit_Book.html#molecular-sanitization). Part of that includes standardizing nitro groups (as well as a few other functional groups).

MolFromSmarts() is intended to create a query molecule. The assumption is that you build a query that expresses what you're looking for, so no sanitization is done. The simplest example of this is that Chem.MolFromSmarts('c1ccccc1') and Chem.MolFromSmarts('C1=CC=CC=C1') generate queries that correspond to completely different molecules even though Chem.MolFromSmiles('c1ccccc1') and Chem.MolFromSmiles('C1=CC=CC=C1') produce the same molecule.

So, the short answer to your question is: "yes". You will need to rewrite your SMARTS queries so that nitro groups are expressed consistently. If you want those queries to match what the RDKit's molecule processing code produces, the nitro groups themselves should be written '[N+](=O)[O-]'.

Best,
-greg

On Wed, May 3, 2017 at 10:06 PM, Rafal Roszak <rmr...@gm...> wrote:
> hi all,
>
> Is it possible to read smarts with nitro group in form 'N(=O)(=O)'?
>
> Here is the context of my question:
> The example below shows that there is some problem with nitro group and
> MolFromSmarts:
>
> >>> m1=Chem.MolFromSmiles('c1ccccc1[N+](=O)[O-]')
> >>> m2=Chem.MolFromSmiles('c1ccccc1N(=O)(=O)')
> >>> m11=Chem.MolFromSmarts('c1ccccc1[N+](=O)[O-]')
> >>> m22=Chem.MolFromSmarts('c1ccccc1N(=O)(=O)')
> >>> Chem.MolToSmiles(m1)
> 'O=[N+]([O-])c1ccccc1'
> >>> Chem.MolToSmiles(m2)
> 'O=[N+]([O-])c1ccccc1'
> >>> Chem.MolToSmiles(m22)
> 'O=N(=O)c1ccccc1'
> >>> Chem.MolToSmiles(m11)
> 'O=N(O)c1ccccc1'
>
> The function MolFromSmiles accepts both forms ('[N+](=O)[O-]' and
> 'N(=O)(=O)') and returns the same canonical smiles. This makes sense to me.
> The behaviour of MolFromSmarts is strange: nitro in the form '[N+](=O)[O-]'
> is correctly detected but returns a different canonical smiles (why?).
> However, the 'N(=O)(=O)' form is interpreted incorrectly.
> Is this a bug or a feature?
> I have a lot of smarts which can have the nitro group written either
> as 'N(=O)(=O)' or '[N+](=O)[O-]'. What should I do to correctly read
> them all? Replace all 'N(=O)(=O)' strings (and all possible variations)
> with the ionic form?
>
> Best,
>
> Rafał
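
The rewrite suggested in the reply could be sketched as a plain string substitution applied to each SMARTS before it is parsed. This is only a minimal sketch: the helper name normalize_nitro and the list of spellings are illustrative assumptions, not part of RDKit, and it only covers a few literal ways of writing a pentavalent nitro group. Recursive SMARTS, atom maps, or other orderings of the atoms would need extra handling.

```python
# Hypothetical helper for illustration; neither the name nor the table of
# spellings comes from RDKit. Each pair maps one literal pentavalent nitro
# spelling to the charge-separated form.
_NITRO_REWRITES = (
    # Both oxygens written as branches: keep both as branches so that
    # whatever follows the group stays attached to the nitrogen.
    ("N(=O)(=O)", "[N+](=O)([O-])"),
    # Second oxygen written at the end of the chain.
    ("N(=O)=O", "[N+](=O)[O-]"),
    # Nitro group written oxygen-first.
    ("O=N(=O)", "O=[N+]([O-])"),
)


def normalize_nitro(smarts: str) -> str:
    """Rewrite a few literal nitro spellings to the charged form.

    Purely textual: the pattern is not parsed, so anything outside the
    spellings listed in _NITRO_REWRITES is returned unchanged.
    """
    for old, new in _NITRO_REWRITES:
        smarts = smarts.replace(old, new)
    return smarts
```

For example, normalize_nitro('c1ccccc1N(=O)(=O)') gives 'c1ccccc1[N+](=O)([O-])', and a pattern already written with '[N+](=O)[O-]' passes through untouched. The normalized string would then be the one handed to Chem.MolFromSmarts().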