Re: [Rdkit-discuss] can't kekulize molecule

On 08/16/2017 03:36 PM, Greg Landrum wrote:
> Hi Shuai,
> 
> The RDKit Mol2 parser is really only validated for the atom types 
> generated by Corina. I'm not surprised that the output from Open Babel 
> would not be understood. This is documented:
> http://rdkit.org/docs/api/rdkit.Chem.rdmolfiles-module.html#MolFromMol2File

It would be really nice if Open Babel's MOL2 output could be read directly
by RDKit.

I often find myself running
$ obabel in.mol2 -O out.sdf
just to get a file that RDKit can read.
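
That workaround can be scripted. The following is only a minimal sketch,
assuming obabel is on the PATH and RDKit is installed. The helper names
obabel_command and load_via_sdf are made up for illustration and are not
part of either toolkit.

```python
import subprocess
from pathlib import Path


def obabel_command(mol2_path, sdf_path):
    """Build the same command line as above: obabel in.mol2 -O out.sdf"""
    return ["obabel", str(mol2_path), "-O", str(sdf_path)]


def load_via_sdf(mol2_path):
    """Convert a mol2 file to SDF with Open Babel, then read it with RDKit.

    Hypothetical helper: the SDF is written next to the input file.
    """
    # Imported here so obabel_command() can be used without RDKit installed.
    from rdkit import Chem

    sdf_path = Path(mol2_path).with_suffix(".sdf")
    subprocess.run(obabel_command(mol2_path, sdf_path), check=True)

    # removeHs=False keeps explicit hydrogens; entries RDKit cannot parse
    # come back as None and are skipped.
    supplier = Chem.SDMolSupplier(str(sdf_path), removeHs=False)
    return [mol for mol in supplier if mol is not None]
```

Calling load_via_sdf("in.mol2") would return a list of RDKit molecules.
Whether a given molecule then sanitizes cleanly still depends on what
Open Babel wrote to the SDF.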

> An aside: If you have an SDF file you can read that directly into the 
> RDKit. It seems like you shouldn't need the openbabel translation step 
> at all.
> 
> -greg
> 
> 
> On Wed, Aug 16, 2017 at 12:13 AM, David Liu <sdh...@gm...> wrote:
> 
>     Dear all,
> 
>     I am having trouble kekulizing a molecule with RDKit. Here is an example.
> 
>     The example.mol2 file looks like this:
> 
>     @<TRIPOS>MOLECULE
>     example
>     46 49 0 0 0
>     SMALL
>     GASTEIGER
> 
>     @<TRIPOS>ATOM
>     1 C -4.5556 -0.2844 1.1718 C.3 1 LIG1 -0.0109
>     2 C -6.0291 -0.7271 1.2334 C.3 1 LIG1 0.0493
>     3 C -6.4413 -0.5958 -1.0493 C.3 1 LIG1 0.0493
>     4 C -5.1977 0.3130 -1.1927 C.3 1 LIG1 -0.0109
>     5 C 5.5992 -2.5640 -0.8780 C.ar 1 LIG1 -0.0253
>     6 O -6.3822 -1.4588 0.0764 O.3 1 LIG1 -0.3796
>     7 C 2.8943 1.6722 0.9911 C.ar 1 LIG1 0.2664
>     8 C 5.1745 -2.0407 0.3480 C.ar 1 LIG1 0.1371
>     9 C -1.6179 0.4017 0.1577 C.ar 1 LIG1 0.2173
>     10 C -4.0573 -0.1702 -0.2838 C.3 1 LIG1 0.0275
>     11 C 0.8767 -0.2307 1.1489 C.ar 1 LIG1 0.0370
>     12 C 2.1438 -0.5325 1.6439 C.ar 1 LIG1 -0.0306
>     13 C 6.1958 -1.7294 -1.8279 C.ar 1 LIG1 -0.0590
>     14 C 6.3717 -0.3702 -1.5525 C.ar 1 LIG1 -0.0605
>     15 C 5.9487 0.1564 -0.3282 C.ar 1 LIG1 -0.0452
>     16 C 0.6358 1.0320 0.5744 C.ar 1 LIG1 0.1483
>     17 C -0.1716 -1.1537 1.2042 C.ar 1 LIG1 0.0418
>     18 C 3.1618 0.4153 1.5592 C.ar 1 LIG1 0.0780
>     19 C 5.3424 -0.6749 0.6231 C.ar 1 LIG1 0.0480
>     20 C 1.3530 3.2786 -0.1013 C.3 1 LIG1 0.0167
>     21 F 4.6032 -2.8623 1.2640 F 1 LIG1 -0.2043
>     22 S 4.7969 0.0115 2.1898 S.3 1 LIG1 -0.0812
>     23 N -1.3906 -0.8211 0.7091 N.ar 1 LIG1 -0.2222
>     24 O 3.8206 2.5277 0.9363 O.2 1 LIG1 -0.2664
>     25 N 1.6412 1.9659 0.5033 N.ar 1 LIG1 -0.2949
>     26 N -0.6088 1.3106 0.0937 N.ar 1 LIG1 -0.1964
>     27 N -2.9091 0.7394 -0.3655 N.pl3 1 LIG1 -0.3104
>     28 H -3.9262 -1.0225 1.7144 H 1 LIG1 0.0305
>     29 H -4.4544 0.6942 1.6907 H 1 LIG1 0.0305
>     30 H -6.1785 -1.3738 2.1237 H 1 LIG1 0.0560
>     31 H -6.6965 0.1565 1.3647 H 1 LIG1 0.0560
>     32 H -7.3658 0.0220 -1.0063 H 1 LIG1 0.0560
>     33 H -6.5227 -1.2302 -1.9574 H 1 LIG1 0.0560
>     34 H -4.8575 0.3261 -2.2513 H 1 LIG1 0.0305
>     35 H -5.4753 1.3532 -0.9112 H 1 LIG1 0.0305
>     36 H 5.4676 -3.6168 -1.0922 H 1 LIG1 0.0646
>     37 H -3.7461 -1.1771 -0.6436 H 1 LIG1 0.0500
>     38 H 2.3428 -1.4998 2.0895 H 1 LIG1 0.0638
>     39 H 6.5237 -2.1362 -2.7758 H 1 LIG1 0.0618
>     40 H 6.8363 0.2748 -2.2870 H 1 LIG1 0.0618
>     41 H 6.0904 1.2094 -0.1219 H 1 LIG1 0.0630
>     42 H -0.0243 -2.1352 1.6372 H 1 LIG1 0.0838
>     43 H 2.2342 3.9528 -0.1073 H 1 LIG1 0.0457
>     44 H 0.5450 3.7853 0.4685 H 1 LIG1 0.0457
>     45 H 1.0258 3.1432 -1.1544 H 1 LIG1 0.0457
>     46 H -3.0166 1.6655 -0.8392 H 1 LIG1 0.1492
>     @<TRIPOS>BOND
>     1 1 2 1
>     2 1 10 1
>     3 2 6 1
>     4 3 4 1
>     5 3 6 1
>     6 4 10 1
>     7 5 8 ar
>     8 5 13 ar
>     9 7 18 ar
>     10 7 24 2
>     11 7 25 ar
>     12 8 19 ar
>     13 8 21 1
>     14 9 23 ar
>     15 9 26 ar
>     16 9 27 1
>     17 10 27 1
>     18 11 12 ar
>     19 11 16 ar
>     20 11 17 ar
>     21 12 18 ar
>     22 13 14 ar
>     23 14 15 ar
>     24 15 19 ar
>     25 16 25 ar
>     26 16 26 ar
>     27 17 23 ar
>     28 18 22 1
>     29 19 22 1
>     30 20 25 1
>     31 1 28 1
>     32 1 29 1
>     33 2 30 1
>     34 2 31 1
>     35 3 32 1
>     36 3 33 1
>     37 4 34 1
>     38 4 35 1
>     39 5 36 1
>     40 10 37 1
>     41 12 38 1
>     42 13 39 1
>     43 14 40 1
>     44 15 41 1
>     45 17 42 1
>     46 20 43 1
>     47 20 44 1
>     48 20 45 1
>     49 27 46 1
> 
>     And the example.py code looks like this:
> 
>     from rdkit.Chem import AllChem
>     from rdkit import Chem
> 
>     rdkit_mol = Chem.MolFromMol2File("example.mol2", sanitize=False,
>     removeHs=False)
>     mol = AllChem.RemoveHs(rdkit_mol)
> 
>     Running example.py raises the following error:
> 
>     ValueError: Sanitization error: Can't kekulize mol. Unkekulized
>     atoms: 8 10 11 15 16 17 22 24 25
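
To see which atoms the error refers to, the indices can be looked up in
the mol2 ATOM records. This is a rough sketch in plain Python, not an
RDKit facility. It assumes that RDKit atom indices are 0-based and follow
the order of the mol2 file, so index i corresponds to mol2 atom id i + 1.
The function names mol2_atoms and describe_unkekulized are hypothetical.

```python
import re


def mol2_atoms(mol2_text):
    """Return {mol2_atom_id: (atom_name, sybyl_type)} from the ATOM records."""
    atoms = {}
    in_atoms = False
    for line in mol2_text.splitlines():
        line = line.strip()
        if line.startswith("@"):
            # Accepts both "@<TRIPOS>ATOM" and a bare "@ATOM" header.
            in_atoms = line.endswith("ATOM")
            continue
        if in_atoms and line:
            fields = line.split()
            # Columns: id, name, x, y, z, type, ...
            atoms[int(fields[0])] = (fields[1], fields[5])
    return atoms


def describe_unkekulized(error_message, atoms):
    """Map indices listed after 'atoms:' to (rdkit_idx, mol2_id, name, type).

    Assumption: RDKit index i is mol2 atom id i + 1.
    """
    listed = error_message.split("atoms:", 1)[1]
    indices = [int(tok) for tok in re.findall(r"\d+", listed)]
    return [(i, i + 1) + atoms[i + 1] for i in indices]
```

For example, after reading example.mol2 into a string and passing the
error text, each returned tuple gives the Sybyl type of a flagged atom.
Under the stated assumption, index 8 would be mol2 atom 9, which the file
lists as C.ar.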
> 
>     It seems RDKit cannot interpret the molecule when it tries to remove
>     the hydrogens, probably because of the format of the mol2 file I
>     used here. I used Open Babel to convert an SDF file to this mol2
>     file. Is there a plan to support mol2 files like this, or do I need
>     to process the mol2 file further? I would appreciate any
>     advice!
> 
> 
>     Thanks,
> 
>     Shuai
> 
> 
>     ------------------------------------------------------------------------------
>     Check out the vibrant tech community on one of the world's most
>     engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>     _______________________________________________
>     Rdkit-discuss mailing list
>     Rdk...@li...
>     <mailto:Rdk...@li...>
>     https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>     <https://lists.sourceforge.net/lists/listinfo/rdkit-discuss>
> 
