Re: [Rdkit-discuss] Explicit valence error when reading sdf files
From: S.L. C. <sl...@ya...> - 2014-07-12 07:35:39
Hello Wendy,
It does look like your molecule is unhealthy. Atom no. 7 is a neutral nitrogen with two double bonds, which gives it a valence of 4. You could probably silence RDKit in some way and force it to produce some answer, but I think you should really fix the molecule.
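If you would rather have the script skip or report the bad files than stop on them, something along these lines might help. It is only a sketch, not tested against your files. The second half of your error ("Pre-condition Violation ... Failed Expression: mol") most likely means that MolFromMolBlock returned None after the valence problem and that None was then passed to ComputeGasteigerCharges. The function name charges_from_mol_block is made up for this example; the RDKit calls are the usual ones, with sanitization done as a separate step so that the message can be caught.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def charges_from_mol_block(mol_block):
    """Return (charges, problem); exactly one of the two is None.

    Hypothetical helper for illustration, not part of RDKit.
    """
    # Parse without sanitizing so that a chemistry problem shows up as a
    # Python exception we can report, rather than as a bare None.
    mol = Chem.MolFromMolBlock(mol_block, sanitize=False, removeHs=False)
    if mol is None:
        return None, "mol block could not be parsed"
    try:
        Chem.SanitizeMol(mol)
    except Exception as err:
        # Broad on purpose: the concrete exception class raised by
        # sanitization has differed between RDKit releases.
        return None, str(err)
    AllChem.ComputeGasteigerCharges(mol)
    charges = [atom.GetDoubleProp("_GasteigerCharge") for atom in mol.GetAtoms()]
    return charges, None
```

For each file you would call charges_from_mol_block(open(path).read()) and write down the files where problem is not None, so that they can be repaired later.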
It seems that you just need to change the bond between N7 and C4 to a single bond and the bond between C4 and C2 to a double bond.
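To double-check the diagnosis without RDKit, one can simply add up the bond orders around each atom in the bond block of the posted file. The sketch below does that in plain Python, before and after the proposed edit. The bond list and element list are copied from the file; the MAX_VALENCE table is a simplification assumed for this example (RDKit keeps its own, more complete table), and the function names are made up.

```python
# Bond block of the posted 10104489.sdf: (atom_a, atom_b, bond_order), 1-based.
BONDS = [
    (2, 1, 1), (2, 3, 1), (3, 5, 1), (4, 2, 1), (6, 3, 2), (6, 9, 1),
    (7, 4, 2), (7, 10, 2), (8, 4, 1), (8, 11, 1), (10, 6, 1), (12, 8, 1),
    (13, 8, 1), (14, 10, 1),
]
# Element of each atom in file order (atom 1 first).
ELEMENTS = ["H", "C", "C", "C", "H", "C", "N", "C", "H", "C", "F", "F", "H", "Br"]
# Usual maximum valence of the neutral elements in this file (assumed table).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "F": 1, "Br": 1}


def bond_order_sums(bonds):
    """Sum the bond orders around every atom (1-based atom numbers)."""
    totals = {}
    for a, b, order in bonds:
        totals[a] = totals.get(a, 0) + order
        totals[b] = totals.get(b, 0) + order
    return totals


def over_valent(bonds):
    """Return {atom_number: bond order sum} for atoms above MAX_VALENCE."""
    return {
        atom: total
        for atom, total in sorted(bond_order_sums(bonds).items())
        if total > MAX_VALENCE[ELEMENTS[atom - 1]]
    }


def apply_fix(bonds):
    """Make the N7-C4 bond single and the C4-C2 bond double."""
    new_order = {(7, 4): 1, (4, 2): 2}
    return [(a, b, new_order.get((a, b), order)) for a, b, order in bonds]


print(over_valent(BONDS))             # {7: 4}
print(over_valent(apply_fix(BONDS)))  # {}
```

Only N7 is flagged in the original bond block, and nothing is flagged after the edit. Note that the file also has an "M RAD" line that marks atom 2 as a radical; once C4=C2 is a double bond that carbon already has four bonds, so that line probably needs to be removed as well.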
Ling
________________________________
From: Wendy Carande <wca...@gm...>
To: rdk...@li...
Sent: Friday, July 11, 2014 2:41 PM
Subject: [Rdkit-discuss] Explicit valence error when reading sdf files
Hello,
I looked through the archives and found similar problems, but couldn't find an exact solution for my case. Apologies if I missed the solution somewhere.
I'm reading through a list of SDF files and calculating descriptors for each compound. For some (but not all) of the files, I get an error (see below) about explicit valence. I know that this error has to do with 'sanitizing' the molecule, but I'm not sure exactly how to proceed.
I've included a clip from my Python script, the error, and one of the SDF files that is causing the problem.
Any ideas on how to fix this?
--Wendy
---------------------------------------
Excerpt from script
---------------------------------------
stringWithMolData = file('10104489.sdf','r').read()
mol = Chem.MolFromMolBlock(stringWithMolData)
AllChem.ComputeGasteigerCharges(mol)
---------------------------------------
Error message
---------------------------------------
Explicit valence for atom # 4 N, 4, is greater than permitted
Pre-condition Violation
bad molecule
Violation occurred on line 81 in file /tmp/rdkit-HNQR/rdkit-Release_2014_03_1/Code/GraphMol/PartialCharges/GasteigerCharges.cpp
Failed Expression: mol
---------------------------------------
10104489.sdf file contents
---------------------------------------
10104489
TRC 05231419153D
PM6 optimization, min free energy conformation
14 14 0 0 0 0 0 0 0 0999 V2000
-0.4307 2.0889 0.2792 H 0 0 0 0 0 0 0 0 0 0 0 0
0.0407 1.1071 0.2148 C 0 0 0 0 0 0 0 0 0 0 0 0
1.4008 0.9484 0.5227 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.6973 -0.0195 -0.1759 C 0 0 0 0 0 0 0 0 0 0 0 0
1.9941 1.8122 0.8291 H 0 0 0 0 0 0 0 0 0 0 0 0
1.9923 -0.3134 0.4365 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.1378 -1.2635 -0.2668 N 0 0 0 0 0 0 0 0 0 0 0 0
-2.1710 0.0301 -0.5321 C 0 0 0 0 0 0 0 0 0 0 0 0
3.0439 -0.4753 0.6673 H 0 0 0 0 0 0 0 0 0 0 0 0
1.1631 -1.3724 0.0355 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.8766 0.5689 0.4954 F 0 0 0 0 0 0 0 0 0 0 0 0
-2.3775 0.9405 -1.5182 F 0 0 0 0 0 0 0 0 0 0 0 0
-2.6216 -0.9493 -0.8245 H 0 0 0 0 0 0 0 0 0 0 0 0
1.6684 -3.1599 -0.1690 Br 0 0 0 0 0 0 0 0 0 0 0 0
2 1 1 0 0 0 0
2 3 1 0 0 0 0
3 5 1 0 0 0 0
4 2 1 0 0 0 0
6 3 2 0 0 0 0
6 9 1 0 0 0 0
7 4 2 0 0 0 0
7 10 2 0 0 0 0
8 4 1 0 0 0 0
8 11 1 0 0 0 0
10 6 1 0 0 0 0
12 8 1 0 0 0 0
13 8 1 0 0 0 0
14 10 1 0 0 0 0
M RAD 1 2 2
M END
> <Symmetry>
Cs(1)
> <Energy>
-76.2319210753677
> <FreeEnergy>
-105.218958525368
> <Freq>
36.3827 1.8490
118.9528 0.6797
121.3287 1.7880
146.7422 3.2258
230.2547 4.6726
300.5702 5.7138
328.8019 1.4348
361.6117 0.5034
402.5823 0.1183
552.4995 20.1778
573.7578 0.7088
621.4207 13.4753
682.9353 27.8339
701.6618 5.4059
844.3396 76.4557
881.5112 51.8745
935.0009 0.0366
986.5135 4.6250
1020.1213 12.4436
1073.2578 27.3170
1132.0055 17.4835
1149.0508 5.7188
1174.1069 3.8183
1193.8903 14.3170
1225.8361 3.1755
1250.0146 94.6662
1258.0689 25.6122
1333.4544 115.3666
1444.9060 96.2140
1474.6392 0.2878
1604.5610 58.4422
1630.9742 34.4613
2636.6222 77.6239
2737.5860 21.2480
2745.9489 233.0648
2756.3587 214.6328
> <gAAFreq>
1046.34084166667
> <gAlpha>
72.95
> <gCOSMO_DPSA1>
-16.3678412270001
> <gCOSMO_DPSA2>
-8.0618952540101
> <gCOSMO_NCD>
-0.00499623410644148
> <gCOSMO_NEG>
-0.492544809359508
> <gCOSMO_PCD>
0.00599090900680561
> <gCOSMO_PNSA1>
98.5832126490001
> <gCOSMO_PNSA2>
-48.5566496802496
> <gCOSMO_POS>
0.492544809149928
> <gCOSMO_PPSA1>
82.215371422
> <gCOSMO_PPSA2>
40.4947544262395
> <gCOSMO_SA>
180.798584071
> <gCOSMO_SKW>
0.546436666002523
> <gCOSMO_VAR>
0.00692393043054713
> <gCOSMO_Vol>
167.141981607954
> <gCvib>
12.6954699037459
> <gDPSA1>
-65.71527629439
> <gDPSA2>
-76.6311751243748
> <gDipole>
2.8934
> <gEN>
-5.447286492133e+02
> <gEnergy>
-76.2319210753677
> <gGAFreq>
745.714161110022
> <gHLgap>
0.33918
> <gHOMO>
-0.37718
> <gIRAFreq>
1836.33033030781
> <gKE>
-3.721651577705e+01
> <gLUMO>
-0.03800
> <gNCD>
-0.0105562568746138
> <gNEG>
-1.166109
> <gNN>
3.105243637196e+02
> <gPCD>
0.0260577959728786
> <gPNSA1>
110.4661447567
> <gPNSA2>
-128.815565596091
> <gPOS>
1.166109
> <gPPSA1>
44.75086846231
> <gPPSA2>
52.1843904717159
> <gSumIR>
1276.4724
> <gVol>
177.358428355517
> <gvdWSA>
155.217013219
> <gvdWV>
120.158277876
$$$$
------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
Rdk...@li...
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss