Re: [Rdkit-discuss] Nitrogen Valence
From: Yuran W. <wan...@gm...> - 2017-05-11 17:46:39
Hi Peter,

Thank you for your reply. I did not quite understand what you mean by 'But this makes no sense'. Also, the SMILES you tested are in zwitterionic form. According to http://www.rdkit.org/docs/RDKit_Book.html#molecular-sanitization, the zwitterionic form seems suitable for N=O and N#N, but not for N=N. But I may just have a very limited knowledge of RDKit.

This is what it looks like in ChemDraw:
[image: Inline image 1]

Thanks,
Yuran

On Thu, May 11, 2017 at 1:33 PM, Peter S. Shenkin <sh...@gm...> wrote:

> The problematic part is just the beginning of your would-be SMILES:
> N=N(C)(C)C. The rest is correctly parsed. But this makes no sense. Perhaps
> you mean one of the substructures illustrated in the attached (which at
> least satisfy normal valence rules). If not, perhaps you could attach a
> structural diagram of what you do mean.
>
> -P.
>
> On Thu, May 11, 2017 at 11:02 AM, Yuran Wang <wan...@gm...> wrote:
>
>> Dear Greg,
>> Thank you very much for the suggestions. It works for me!
>> Here is the SMILES of one molecule that I am looking at:
>> N=N(C)(C)CC(CN1N=CN=C1)(O)C2=C(C=C(C=C2)F)F
>> Any better alternative will be appreciated.
>>
>> Thanks,
>> Yuran
>>
>> On Thu, May 11, 2017 at 10:49 AM, Greg Landrum <gre...@gm...> wrote:
>>
>>> On Thu, May 11, 2017 at 4:24 PM, Yuran Wang <wan...@gm...> wrote:
>>>
>>>> I have a question regarding the available valence of Nitrogen. It seems
>>>> only 3 is available in the default setting (atomic_data.cpp). Why is it
>>>> kept to only 3, and not extended to include 4 and 5? If I change it locally
>>>> to include 4 and 5, will it cause any problems?
>>>
>>> Aside from generating molecules that don't make any chemical sense?
>>> Probably not, but the lack of chemical sense may cause some unexpected
>>> behavior.
>>>
>>>> I am aware that I could turn off the sanitization to get a mol object,
>>>> however, it cannot be further processed to get fingerprints, which is what
>>>> I need.
>>>
>>> Well, you could turn off the sanitization on molecule construction and
>>> then manually sanitize with the valence check turned off. Here's a simple
>>> example of that:
>>>
>>> In [11]: m = Chem.MolFromSmiles('CN(C)(C)(C)C',sanitize=False)
>>>
>>> In [12]: m.UpdatePropertyCache(strict=False)
>>>
>>> In [13]: Chem.SanitizeMol(m,Chem.SANITIZE_SYMMRINGS|Chem.SANITIZE_SETCONJUGATION|Chem.SANITIZE_SETHYBRIDIZATION)
>>> Out[13]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>>>
>>> In [14]: rdMolDescriptors.GetMorganFingerprint(m,2)
>>> Out[14]: <rdkit.DataStructs.cDataStructs.UIntSparseIntVect at 0x10b0ab350>
>>>
>>> But, again, the RDKit's valence rules tend to reflect real chemistry.
>>> What are you trying to represent that you need 5 coordinate neutral
>>> nitrogen atoms? There may be a better way.
>>>
>>> -greg
>>
>> --
>> Best,
>> Yuran Wang
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdk...@li...
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

--
Best,
Yuran Wang
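
For reference, the workaround Greg describes in the quoted message can be sketched as a standalone script. This is a minimal sketch, not a recommended way to handle such structures: the helper name `mol_without_valence_check` is hypothetical and not part of RDKit, and the sanitization flags are the same three used in the quoted session. It assumes an RDKit build in which `rdMolDescriptors.GetMorganFingerprint` is still available, as it was in the quoted session.

```python
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors


def mol_without_valence_check(smiles):
    """Hypothetical helper: parse a SMILES and run only the sanitization
    steps used in the thread, skipping the valence check."""
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    # Compute implicit valences and other cached properties without
    # raising an error on atoms that break the default valence rules.
    mol.UpdatePropertyCache(strict=False)
    # Perceive rings, conjugation and hybridization only.
    Chem.SanitizeMol(
        mol,
        Chem.SANITIZE_SYMMRINGS
        | Chem.SANITIZE_SETCONJUGATION
        | Chem.SANITIZE_SETHYBRIDIZATION,
    )
    return mol


smiles = "CN(C)(C)(C)C"  # the five-coordinate neutral nitrogen from the thread

# With default sanitization the valence check is expected to reject this
# structure, so the parser should return None instead of a molecule.
default_mol = Chem.MolFromSmiles(smiles)
print("default parse succeeded:", default_mol is not None)

# With the valence check skipped, a molecule and a fingerprint are obtained.
mol = mol_without_valence_check(smiles)
fp = rdMolDescriptors.GetMorganFingerprint(mol, 2)
print("atoms:", mol.GetNumAtoms())
print("non-zero fingerprint elements:", len(fp.GetNonzeroElements()))
```

As Greg and Peter both point out, a fingerprint computed this way describes a structure that violates normal valence rules, so redrawing the molecule in a valence-satisfying form is likely the better long-term fix.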