Re: [Rdkit-discuss] MolFromPDBBlock and heterocycles

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Hi Steven,

The PDB reader in the RDKit doesn’t determine any bond orders - everything
is read as a single bond.
In order to set the bond orders, you need to call the
AssignBondOrdersFromTemplate() function using a reference molecule
generated from SMILES (or SDF).

Here is some example code from the docs:

>>> from rdkit.Chem import AllChem
>>> template = AllChem.MolFromSmiles("CN1C(=NC(C1=O)(c2ccccc2)c3ccccc3)N")
>>> mol = AllChem.MolFromPDBFile(os.path.join(RDConfig.RDCodeDir, 'Chem',
'test_data', '4DJU_lig.pdb'))
>>> len([1 for b in template.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
8
>>> len([1 for b in mol.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
22

Now assign the bond orders based on the template molecule
>>> newMol = AllChem.AssignBondOrdersFromTemplate(template, mol)
>>> len([1 for b in newMol.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
8

Note that the template molecule should have no explicit hydrogens
else the algorithm will fail.

Hope this helps.

Best,
Sereina

2016-09-07 17:16 GMT+02:00 Steven Combs <ste...@gm...>:

> Hello!
>
> I have a pdb block that I am working with, which is attached to this
> email. The ligand has aromatic ring structures in it; however, when it is
> read into RDKit and converted into a smiles string, the aromatic rings are
> converted into aliphatic rings. Any thoughts?
>
> Here is the python code:
>
>     def extract_data( filename):
>         extracted_info = ""
>         with open(filename) as f:
>             for line in f.readlines():
>                 if "HETATM" in line:
>                     extracted_info += ( line)
>         return extracted_info
>
>     for index, filename in enumerate(solution_pdb_filenames):
>         row = extract_data( filename)
>         m = Chem.MolFromPDBBlock(row, sanitize=True, removeHs=False )
>         Chem.SetHybridization(m)
>         Chem.SetAromaticity(m)
>         Chem.SanitizeMol(m, sanitizeOps=Chem.rdmolops.SanitizeFlags.SANITIZE_ALL)
> #not needed since sanitizing during read in, but trying to figure out if it
> actually worked
>         print ("Parsing file " + str(index) + " of " +
> str(len(solution_pdb_filenames)))
>         print (Chem.MolToSmiles(m, kekuleSmiles=True, allHsExplicit=True))
>
> The output smile string is:
>
> [H][O][CH]1[NH][CH]([C]([H])([H])[CH]([OH])[OH])[CH]([C]([H]
> )([H])[C]([H])([H])[H])[CH]([CH]([OH])[CH]2[CH]([H])[CH]([
> H])[CH]([H])[CH]([N]([H])[H])[CH]2[H])[CH]1[N]([C]([H])([H])
> [H])[C]([H])([H])[H]
>
> Steven Combs
>
>
>
> ------------------------------------------------------------
> ------------------
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdk...@li...
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>

Re: [Rdkit-discuss] MolFromPDBBlock and heterocycles

Open-Source Cheminformatics and Machine Learning

Re: [Rdkit-discuss] MolFromPDBBlock and heterocycles