Re: [Rdkit-discuss] can't kekulize molecule

Hi Shuai,

The RDKit Mol2 parser is really only validated for the atom types generated
by Corina, so I'm not surprised that the output from Open Babel is not
understood. This is documented:
http://rdkit.org/docs/api/rdkit.Chem.rdmolfiles-module.html#MolFromMol2File

An aside: if you have an SDF file, you can read it directly into the
RDKit. It seems like you shouldn't need the Open Babel translation step at
all.
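
A minimal sketch of reading the SDF directly, assuming the original file is
called example.sdf (a hypothetical name; the thread never names it). The
helper name read_sdf_without_hs is made up for illustration. It uses
Chem.SDMolSupplier with the default sanitization and then strips the
hydrogens with Chem.RemoveHs:

```python
from rdkit import Chem


def read_sdf_without_hs(path):
    """Read every parsable molecule from an SDF file and remove its hydrogens.

    `path` is whatever SDF file you started from; "example.sdf" in the usage
    note is an assumed name, not one taken from the thread.
    """
    # removeHs=False keeps explicit hydrogens while reading, so that the
    # removal happens in one visible step afterwards.
    supplier = Chem.SDMolSupplier(path, sanitize=True, removeHs=False)
    mols = []
    for mol in supplier:
        if mol is None:
            # The supplier yields None for records it cannot parse or sanitize.
            continue
        mols.append(Chem.RemoveHs(mol))
    return mols
```

Usage would look like mols = read_sdf_without_hs("example.sdf"). Whether this
particular molecule sanitizes cleanly from the SDF has not been checked here.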

-greg

On Wed, Aug 16, 2017 at 12:13 AM, David Liu <sdh...@gm...> wrote:

> Dear all,
>
> I am having trouble kekulizing a molecule with RDKit. Here is an example.
>
> The example.mol2 file looks like this:
>
> @<TRIPOS>MOLECULE
> example
> 46 49 0 0 0
> SMALL
> GASTEIGER
>
> @<TRIPOS>ATOM
> 1 C -4.5556 -0.2844 1.1718 C.3 1 LIG1 -0.0109
> 2 C -6.0291 -0.7271 1.2334 C.3 1 LIG1 0.0493
> 3 C -6.4413 -0.5958 -1.0493 C.3 1 LIG1 0.0493
> 4 C -5.1977 0.3130 -1.1927 C.3 1 LIG1 -0.0109
> 5 C 5.5992 -2.5640 -0.8780 C.ar 1 LIG1 -0.0253
> 6 O -6.3822 -1.4588 0.0764 O.3 1 LIG1 -0.3796
> 7 C 2.8943 1.6722 0.9911 C.ar 1 LIG1 0.2664
> 8 C 5.1745 -2.0407 0.3480 C.ar 1 LIG1 0.1371
> 9 C -1.6179 0.4017 0.1577 C.ar 1 LIG1 0.2173
> 10 C -4.0573 -0.1702 -0.2838 C.3 1 LIG1 0.0275
> 11 C 0.8767 -0.2307 1.1489 C.ar 1 LIG1 0.0370
> 12 C 2.1438 -0.5325 1.6439 C.ar 1 LIG1 -0.0306
> 13 C 6.1958 -1.7294 -1.8279 C.ar 1 LIG1 -0.0590
> 14 C 6.3717 -0.3702 -1.5525 C.ar 1 LIG1 -0.0605
> 15 C 5.9487 0.1564 -0.3282 C.ar 1 LIG1 -0.0452
> 16 C 0.6358 1.0320 0.5744 C.ar 1 LIG1 0.1483
> 17 C -0.1716 -1.1537 1.2042 C.ar 1 LIG1 0.0418
> 18 C 3.1618 0.4153 1.5592 C.ar 1 LIG1 0.0780
> 19 C 5.3424 -0.6749 0.6231 C.ar 1 LIG1 0.0480
> 20 C 1.3530 3.2786 -0.1013 C.3 1 LIG1 0.0167
> 21 F 4.6032 -2.8623 1.2640 F 1 LIG1 -0.2043
> 22 S 4.7969 0.0115 2.1898 S.3 1 LIG1 -0.0812
> 23 N -1.3906 -0.8211 0.7091 N.ar 1 LIG1 -0.2222
> 24 O 3.8206 2.5277 0.9363 O.2 1 LIG1 -0.2664
> 25 N 1.6412 1.9659 0.5033 N.ar 1 LIG1 -0.2949
> 26 N -0.6088 1.3106 0.0937 N.ar 1 LIG1 -0.1964
> 27 N -2.9091 0.7394 -0.3655 N.pl3 1 LIG1 -0.3104
> 28 H -3.9262 -1.0225 1.7144 H 1 LIG1 0.0305
> 29 H -4.4544 0.6942 1.6907 H 1 LIG1 0.0305
> 30 H -6.1785 -1.3738 2.1237 H 1 LIG1 0.0560
> 31 H -6.6965 0.1565 1.3647 H 1 LIG1 0.0560
> 32 H -7.3658 0.0220 -1.0063 H 1 LIG1 0.0560
> 33 H -6.5227 -1.2302 -1.9574 H 1 LIG1 0.0560
> 34 H -4.8575 0.3261 -2.2513 H 1 LIG1 0.0305
> 35 H -5.4753 1.3532 -0.9112 H 1 LIG1 0.0305
> 36 H 5.4676 -3.6168 -1.0922 H 1 LIG1 0.0646
> 37 H -3.7461 -1.1771 -0.6436 H 1 LIG1 0.0500
> 38 H 2.3428 -1.4998 2.0895 H 1 LIG1 0.0638
> 39 H 6.5237 -2.1362 -2.7758 H 1 LIG1 0.0618
> 40 H 6.8363 0.2748 -2.2870 H 1 LIG1 0.0618
> 41 H 6.0904 1.2094 -0.1219 H 1 LIG1 0.0630
> 42 H -0.0243 -2.1352 1.6372 H 1 LIG1 0.0838
> 43 H 2.2342 3.9528 -0.1073 H 1 LIG1 0.0457
> 44 H 0.5450 3.7853 0.4685 H 1 LIG1 0.0457
> 45 H 1.0258 3.1432 -1.1544 H 1 LIG1 0.0457
> 46 H -3.0166 1.6655 -0.8392 H 1 LIG1 0.1492
> @<TRIPOS>BOND
> 1 1 2 1
> 2 1 10 1
> 3 2 6 1
> 4 3 4 1
> 5 3 6 1
> 6 4 10 1
> 7 5 8 ar
> 8 5 13 ar
> 9 7 18 ar
> 10 7 24 2
> 11 7 25 ar
> 12 8 19 ar
> 13 8 21 1
> 14 9 23 ar
> 15 9 26 ar
> 16 9 27 1
> 17 10 27 1
> 18 11 12 ar
> 19 11 16 ar
> 20 11 17 ar
> 21 12 18 ar
> 22 13 14 ar
> 23 14 15 ar
> 24 15 19 ar
> 25 16 25 ar
> 26 16 26 ar
> 27 17 23 ar
> 28 18 22 1
> 29 19 22 1
> 30 20 25 1
> 31 1 28 1
> 32 1 29 1
> 33 2 30 1
> 34 2 31 1
> 35 3 32 1
> 36 3 33 1
> 37 4 34 1
> 38 4 35 1
> 39 5 36 1
> 40 10 37 1
> 41 12 38 1
> 42 13 39 1
> 43 14 40 1
> 44 15 41 1
> 45 17 42 1
> 46 20 43 1
> 47 20 44 1
> 48 20 45 1
> 49 27 46 1
>
> And the example.py code looks like this:
>
> from rdkit.Chem import AllChem
> from rdkit import Chem
>
> rdkit_mol = Chem.MolFromMol2File("example.mol2", sanitize=False, removeHs=False)
> mol = AllChem.RemoveHs(rdkit_mol)
>
> Running example.py produces the following error:
>
> ValueError: Sanitization error: Can't kekulize mol. Unkekulized atoms: 8 10 11 15 16 17 22 24 25
>
> It seems RDKit cannot interpret the molecule when it tries to remove the
> hydrogens, probably because of the format of the mol2 file I used here. I
> used Open Babel to convert an SDF file into this mol2 file. Is there a plan
> to support mol2 files like this, or do I need to process the mol2 file
> further? I would appreciate any advice!
>
>
> Thanks,
>
> Shuai
>
