Re: [Rdkit-discuss] Capturing offending atom in error message
From: Greg L. <gre...@gm...> - 2019-10-07 04:53:09
This doesn't immediately help, but it's worth mentioning that the upcoming
2019.09 release has functionality that should help here:
In [18]: m = Chem.MolFromSmiles('CN(C)(C)C',sanitize=False)
In [19]: problems = Chem.DetectChemistryProblems(m)
[06:47:43] Explicit valence for atom # 1 N, 4, is greater than permitted
In [20]: len(problems)
Out[20]: 1
In [21]: problems[0].GetType()
Out[21]: 'AtomValenceException'
In [22]: problems[0].GetAtomIdx()
Out[22]: 1
In [23]: problems[0].Message()
Out[23]: 'Explicit valence for atom # 1 N, 4, is greater than permitted'
In [24]: m2 = Chem.MolFromSmiles('c1cncc1',sanitize=False)
In [25]: problems = Chem.DetectChemistryProblems(m2)
[06:48:19] Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4
In [26]: len(problems)
Out[26]: 1
In [27]: problems[0].GetType()
Out[27]: 'KekulizeException'
In [28]: problems[0].GetAtomIndices()
Out[28]: (0, 1, 2, 3, 4)
In [29]: problems[0].Message()
Out[29]: "Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4\n"
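The session above could be wrapped into a small helper along these lines. This is only a sketch: the function name find_problem_atoms and the tuple layout it returns are made up for illustration, and it assumes an RDKit release that has Chem.DetectChemistryProblems (2019.09 or later).

```python
from rdkit import Chem


def find_problem_atoms(mol):
    """Collect (problem_type, atom_indices, message) for each detected problem.

    Hypothetical helper, not part of RDKit itself.
    """
    found = []
    for problem in Chem.DetectChemistryProblems(mol):
        if hasattr(problem, "GetAtomIndices"):
            # e.g. KekulizeException reports several atoms at once
            indices = tuple(problem.GetAtomIndices())
        elif hasattr(problem, "GetAtomIdx"):
            # e.g. AtomValenceException reports a single atom
            indices = (problem.GetAtomIdx(),)
        else:
            # a problem type that does not point at specific atoms
            indices = ()
        found.append((problem.GetType(), indices, problem.Message()))
    return found


# Same two molecules as in the session above
m = Chem.MolFromSmiles('CN(C)(C)C', sanitize=False)
m2 = Chem.MolFromSmiles('c1cncc1', sanitize=False)
valence_problems = find_problem_atoms(m)
kekulize_problems = find_problem_atoms(m2)
```

Returning the indices as data means the caller never has to parse the error message text.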
For your case, since you have Hs and bonds, I would suggest directly
setting the charge on any 4-valent neutral nitrogen to +1.
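A minimal sketch of that suggestion follows. The function name charge_tetravalent_nitrogens is made up for illustration. It sums the bond orders directly instead of asking RDKit for the valence, on the assumption that the molecule has not been sanitized yet and that all hydrogens are present as explicit atoms (the SMILES in the usage lines is only a stand-in with the same N connectivity).

```python
from rdkit import Chem


def charge_tetravalent_nitrogens(mol):
    """Set formal charge +1 on every neutral N whose bond orders sum to 4.

    Hypothetical helper; modifies mol in place and returns the indices
    of the atoms it changed.
    """
    fixed = []
    for atom in mol.GetAtoms():
        if atom.GetAtomicNum() != 7 or atom.GetFormalCharge() != 0:
            continue
        # Sum of bond orders from the connectivity; no sanitization needed
        bond_order_sum = sum(b.GetBondTypeAsDouble() for b in atom.GetBonds())
        if bond_order_sum == 4:
            atom.SetFormalCharge(1)
            fixed.append(atom.GetIdx())
    return fixed


# Usage: fix the charges first, then sanitize
m = Chem.MolFromSmiles('CN(C)(C)C', sanitize=False)
fixed = charge_tetravalent_nitrogens(m)
Chem.SanitizeMol(m)
total_charge = Chem.GetFormalCharge(m)
```

The total formal charge can then be compared with the known charge of the molecule. A nitro group written with two N=O double bonds gives a bond order sum of 5, so this sketch leaves it untouched.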
One other thing to check is how the representation you are using handles
nitro groups.
-greg
On Fri, Oct 4, 2019 at 6:54 PM Chaya Stern <cha...@ch...>
wrote:
> Hello all,
>
> I am trying to create a molecule from a geometry (a numpy array, n_atoms x
> 3), symbols (a list of atom symbols) and a connectivity map (a list of lists
> where each list is [atom_1_idx, atom_2_idx, bond_type]). The data also
> includes all hydrogens. The following code works most of the time:
>
> from rdkit import Chem
> from rdkit.Geometry.rdGeometry import Point3D
>
> _BO_DISPATCH_TABLE = {1: Chem.BondType.SINGLE, 2: Chem.BondType.DOUBLE, 3: Chem.BondType.TRIPLE}
>
> conformer = Chem.Conformer(len(symbols))
>
> molecule = Chem.Mol()
> em = Chem.RWMol(molecule)
> for i, s in enumerate(symbols):
>     atom = em.AddAtom(Chem.Atom(cmiles.utils._symbols[s]))
>     atom_position = Point3D(geometry[i][0], geometry[i][1], geometry[i][2])
>     conformer.SetAtomPosition(atom, atom_position)
>
> # Add connectivity
> for bond in connectivity:
>     bond_type = _BO_DISPATCH_TABLE[bond[-1]]
>     em.AddBond(bond[0], bond[1], bond_type)
>
> molecule = em.GetMol()
> Chem.SanitizeMol(molecule)
>
> However, the data that I have does not include explicit formal charges,
> so if a molecule has a tetravalent nitrogen I get the following error:
>
> ValueError: Sanitization error: Explicit valence for atom # 0 N, 4, is greater than permitted
>
>
> Given that I have all the hydrogens and the total charge of the molecule,
> I can add the charge to the problematic nitrogen and check that the total
> charge is still the same. But I am not sure how to capture the offending
> atom. I can get the information by parsing the error message (which is the
> hack I use now), but I was wondering if there is a better way to do it.
>
>
> Thank you,
>
> Chaya
>