Re: [Rdkit-discuss] can't kekulize molecule
From: Greg L. <gre...@gm...> - 2017-08-16 06:37:00

Hi Shuai,

The RDKit Mol2 parser is really only validated for the atom types generated
by corina. I'm not surprised that the output from open babel would not be
understood. This is documented:
http://rdkit.org/docs/api/rdkit.Chem.rdmolfiles-module.html#MolFromMol2File

An aside: If you have an SDF file you can read that directly into the RDKit.
It seems like you shouldn't need the openbabel translation step at all.

-greg

On Wed, Aug 16, 2017 at 12:13 AM, David Liu <sdh...@gm...> wrote:

> Dear all,
>
> I am having trouble kekulizing a molecule using rdkit; below is an example.
>
> The example.mol2 file looks like this:
>
> @<TRIPOS>MOLECULE
> example
> 46 49 0 0 0
> SMALL
> GASTEIGER
>
> @<TRIPOS>ATOM
> 1 C -4.5556 -0.2844 1.1718 C.3 1 LIG1 -0.0109
> 2 C -6.0291 -0.7271 1.2334 C.3 1 LIG1 0.0493
> 3 C -6.4413 -0.5958 -1.0493 C.3 1 LIG1 0.0493
> 4 C -5.1977 0.3130 -1.1927 C.3 1 LIG1 -0.0109
> 5 C 5.5992 -2.5640 -0.8780 C.ar 1 LIG1 -0.0253
> 6 O -6.3822 -1.4588 0.0764 O.3 1 LIG1 -0.3796
> 7 C 2.8943 1.6722 0.9911 C.ar 1 LIG1 0.2664
> 8 C 5.1745 -2.0407 0.3480 C.ar 1 LIG1 0.1371
> 9 C -1.6179 0.4017 0.1577 C.ar 1 LIG1 0.2173
> 10 C -4.0573 -0.1702 -0.2838 C.3 1 LIG1 0.0275
> 11 C 0.8767 -0.2307 1.1489 C.ar 1 LIG1 0.0370
> 12 C 2.1438 -0.5325 1.6439 C.ar 1 LIG1 -0.0306
> 13 C 6.1958 -1.7294 -1.8279 C.ar 1 LIG1 -0.0590
> 14 C 6.3717 -0.3702 -1.5525 C.ar 1 LIG1 -0.0605
> 15 C 5.9487 0.1564 -0.3282 C.ar 1 LIG1 -0.0452
> 16 C 0.6358 1.0320 0.5744 C.ar 1 LIG1 0.1483
> 17 C -0.1716 -1.1537 1.2042 C.ar 1 LIG1 0.0418
> 18 C 3.1618 0.4153 1.5592 C.ar 1 LIG1 0.0780
> 19 C 5.3424 -0.6749 0.6231 C.ar 1 LIG1 0.0480
> 20 C 1.3530 3.2786 -0.1013 C.3 1 LIG1 0.0167
> 21 F 4.6032 -2.8623 1.2640 F 1 LIG1 -0.2043
> 22 S 4.7969 0.0115 2.1898 S.3 1 LIG1 -0.0812
> 23 N -1.3906 -0.8211 0.7091 N.ar 1 LIG1 -0.2222
> 24 O 3.8206 2.5277 0.9363 O.2 1 LIG1 -0.2664
> 25 N 1.6412 1.9659 0.5033 N.ar 1 LIG1 -0.2949
> 26 N -0.6088 1.3106 0.0937 N.ar 1 LIG1 -0.1964
> 27 N -2.9091 0.7394 -0.3655 N.pl3 1 LIG1 -0.3104
> 28 H -3.9262 -1.0225 1.7144 H 1 LIG1 0.0305
> 29 H -4.4544 0.6942 1.6907 H 1 LIG1 0.0305
> 30 H -6.1785 -1.3738 2.1237 H 1 LIG1 0.0560
> 31 H -6.6965 0.1565 1.3647 H 1 LIG1 0.0560
> 32 H -7.3658 0.0220 -1.0063 H 1 LIG1 0.0560
> 33 H -6.5227 -1.2302 -1.9574 H 1 LIG1 0.0560
> 34 H -4.8575 0.3261 -2.2513 H 1 LIG1 0.0305
> 35 H -5.4753 1.3532 -0.9112 H 1 LIG1 0.0305
> 36 H 5.4676 -3.6168 -1.0922 H 1 LIG1 0.0646
> 37 H -3.7461 -1.1771 -0.6436 H 1 LIG1 0.0500
> 38 H 2.3428 -1.4998 2.0895 H 1 LIG1 0.0638
> 39 H 6.5237 -2.1362 -2.7758 H 1 LIG1 0.0618
> 40 H 6.8363 0.2748 -2.2870 H 1 LIG1 0.0618
> 41 H 6.0904 1.2094 -0.1219 H 1 LIG1 0.0630
> 42 H -0.0243 -2.1352 1.6372 H 1 LIG1 0.0838
> 43 H 2.2342 3.9528 -0.1073 H 1 LIG1 0.0457
> 44 H 0.5450 3.7853 0.4685 H 1 LIG1 0.0457
> 45 H 1.0258 3.1432 -1.1544 H 1 LIG1 0.0457
> 46 H -3.0166 1.6655 -0.8392 H 1 LIG1 0.1492
> @<TRIPOS>BOND
> 1 1 2 1
> 2 1 10 1
> 3 2 6 1
> 4 3 4 1
> 5 3 6 1
> 6 4 10 1
> 7 5 8 ar
> 8 5 13 ar
> 9 7 18 ar
> 10 7 24 2
> 11 7 25 ar
> 12 8 19 ar
> 13 8 21 1
> 14 9 23 ar
> 15 9 26 ar
> 16 9 27 1
> 17 10 27 1
> 18 11 12 ar
> 19 11 16 ar
> 20 11 17 ar
> 21 12 18 ar
> 22 13 14 ar
> 23 14 15 ar
> 24 15 19 ar
> 25 16 25 ar
> 26 16 26 ar
> 27 17 23 ar
> 28 18 22 1
> 29 19 22 1
> 30 20 25 1
> 31 1 28 1
> 32 1 29 1
> 33 2 30 1
> 34 2 31 1
> 35 3 32 1
> 36 3 33 1
> 37 4 34 1
> 38 4 35 1
> 39 5 36 1
> 40 10 37 1
> 41 12 38 1
> 42 13 39 1
> 43 14 40 1
> 44 15 41 1
> 45 17 42 1
> 46 20 43 1
> 47 20 44 1
> 48 20 45 1
> 49 27 46 1
>
> And the example.py code looks like this:
>
> from rdkit.Chem import AllChem
> from rdkit import Chem
>
> rdkit_mol = Chem.MolFromMol2File("example.mol2", sanitize=False,
>                                  removeHs=False)
> mol = AllChem.RemoveHs(rdkit_mol)
>
> Running example.py returns the following error:
>
> ValueError: Sanitization error: Can't kekulize mol. Unkekulized atoms: 8
> 10 11 15 16 17 22 24 25
>
> It seems rdkit cannot understand the molecule when it tries to remove the
> hydrogens, probably because of the format of the mol2 file I used here? I
> used openbabel to convert an sdf file to this mol2 file. So I wonder if
> there is a plan to parse mol2 files like this, or whether I need to further
> cook the mol2 file. I would appreciate any advice!
>
> Thanks,
>
> Shuai
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdk...@li...
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
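
A note for readers of the archive: the indices in the "Unkekulized atoms" message can be mapped back to the mol2 atom records with a few lines of plain Python, which makes it easier to see which atoms the parser is complaining about. This is only a sketch. It assumes that the indices RDKit reports are 0-based and follow the order of the atom records in the file. The function names (`parse_mol2_atoms`, `describe_unkekulized`) are made up for illustration and are not part of RDKit. The data are the 27 heavy-atom records copied from the quoted file.

```python
# Heavy-atom records (atoms 1-27) copied from the quoted example.mol2.
MOL2_ATOMS = """\
@<TRIPOS>ATOM
1 C -4.5556 -0.2844 1.1718 C.3 1 LIG1 -0.0109
2 C -6.0291 -0.7271 1.2334 C.3 1 LIG1 0.0493
3 C -6.4413 -0.5958 -1.0493 C.3 1 LIG1 0.0493
4 C -5.1977 0.3130 -1.1927 C.3 1 LIG1 -0.0109
5 C 5.5992 -2.5640 -0.8780 C.ar 1 LIG1 -0.0253
6 O -6.3822 -1.4588 0.0764 O.3 1 LIG1 -0.3796
7 C 2.8943 1.6722 0.9911 C.ar 1 LIG1 0.2664
8 C 5.1745 -2.0407 0.3480 C.ar 1 LIG1 0.1371
9 C -1.6179 0.4017 0.1577 C.ar 1 LIG1 0.2173
10 C -4.0573 -0.1702 -0.2838 C.3 1 LIG1 0.0275
11 C 0.8767 -0.2307 1.1489 C.ar 1 LIG1 0.0370
12 C 2.1438 -0.5325 1.6439 C.ar 1 LIG1 -0.0306
13 C 6.1958 -1.7294 -1.8279 C.ar 1 LIG1 -0.0590
14 C 6.3717 -0.3702 -1.5525 C.ar 1 LIG1 -0.0605
15 C 5.9487 0.1564 -0.3282 C.ar 1 LIG1 -0.0452
16 C 0.6358 1.0320 0.5744 C.ar 1 LIG1 0.1483
17 C -0.1716 -1.1537 1.2042 C.ar 1 LIG1 0.0418
18 C 3.1618 0.4153 1.5592 C.ar 1 LIG1 0.0780
19 C 5.3424 -0.6749 0.6231 C.ar 1 LIG1 0.0480
20 C 1.3530 3.2786 -0.1013 C.3 1 LIG1 0.0167
21 F 4.6032 -2.8623 1.2640 F 1 LIG1 -0.2043
22 S 4.7969 0.0115 2.1898 S.3 1 LIG1 -0.0812
23 N -1.3906 -0.8211 0.7091 N.ar 1 LIG1 -0.2222
24 O 3.8206 2.5277 0.9363 O.2 1 LIG1 -0.2664
25 N 1.6412 1.9659 0.5033 N.ar 1 LIG1 -0.2949
26 N -0.6088 1.3106 0.0937 N.ar 1 LIG1 -0.1964
27 N -2.9091 0.7394 -0.3655 N.pl3 1 LIG1 -0.3104
@<TRIPOS>BOND
"""

# Indices copied from the error message in the thread.
UNKEKULIZED = [8, 10, 11, 15, 16, 17, 22, 24, 25]


def parse_mol2_atoms(text):
    """Return (atom_id, name, sybyl_type) tuples from the ATOM section, in file order."""
    atoms = []
    in_atoms = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@"):
            # Accept "@<TRIPOS>ATOM" as well as the "@ATOM" form the archive displays.
            in_atoms = stripped.replace("<TRIPOS>", "") == "@ATOM"
            continue
        if in_atoms and stripped:
            fields = stripped.split()
            # Columns: id, name, x, y, z, atom type, residue id, residue name, charge
            atoms.append((int(fields[0]), fields[1], fields[5]))
    return atoms


def describe_unkekulized(text, zero_based_indices):
    """Look up each index, ASSUMING 0-based indices in file order (not confirmed in the thread)."""
    atoms = parse_mol2_atoms(text)
    return [atoms[i] for i in zero_based_indices]


for atom_id, name, sybyl_type in describe_unkekulized(MOL2_ATOMS, UNKEKULIZED):
    print(atom_id, name, sybyl_type)
```

Under that indexing assumption, the reported atoms are mol2 atoms 9, 11, 12, 16, 17, 18, 23, 25 and 26, all of which carry the C.ar or N.ar types written by Open Babel.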