Re: [Rdkit-discuss] Question on chirality
From: Jan H. J. <ja...@bi...> - 2019-09-13 09:49:30
Hi Navid,

I am not familiar with the paper you mention, but I believe that the problem is caused by non-isomeric input SMILES.

Below is an alanine read in from a molfile, with coordinates. It has a chiral center with "S" configuration. When you output it as non-isomeric SMILES and read it back in, the chiral information is lost because the molecule no longer has a conformation:

>>> from rdkit import Chem
>>> mol = Chem.MolFromMolBlock("""
...   BIOCHEMF09131911262D
...
...   7  6  0  0  1  0  0  0  0  0999 V2000
...     0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
...     0.7145    0.4125    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
...     1.4290    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
...     1.4209   -0.8208    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
...     0.7084    1.2417    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
...    -1.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
...     2.4290    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
...   2  3  1  0  0  0  0
...   1  2  1  0  0  0  0
...   3  4  2  0  0  0  0
...   2  5  1  1  0  0  0
...   1  6  1  0  0  0  0
...   3  7  1  0  0  0  0
... M  END
... """)
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
[(1, 'S')]
>>> Chem.MolToSmiles(mol)
'CC(N)C(=O)O'
>>> mol = Chem.MolFromSmiles("CC(N)C(=O)O")
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
[]
>>>

You can generate a conformation that produces chiral information by 3D embedding the molecule. Note that the non-isomeric SMILES does not determine the handedness of the embedded center, so the label you get this way is essentially arbitrary.

>>> from rdkit.Chem import AllChem
>>> AllChem.EmbedMolecule(mol)
0
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
[(1, 'S')]
>>>

Another way would be if you can get isomeric SMILES as input. Then the chiral information is right there.

>>> Chem.MolToSmiles(mol, isomericSmiles=True)
'C[C@H](N)C(=O)O'
>>> mol = Chem.MolFromSmiles("C[C@H](N)C(=O)O")
>>> Chem.FindMolChiralCenters(mol)
[(1, 'S')]
>>>

Cheers
-- Jan Holst Jensen

On 2019-09-12 04:44, Navid Shervani-Tabar wrote:
> Hello,
>
> In the paper: "Graph Networks as a Universal Machine Learning
> Framework for Molecules and Crystals", authors introduce chirality as
> an atom feature input to analyze QM9 dataset. I was trying to recreate
> this atom feature as following
>
> > Chirality: (categorical) R, S, or not a Chiral center (one-hot encoded).
>
> The code I used is:
>
> from chainer_chemistry import datasets
> from chainer_chemistry.dataset.preprocessors.ggnn_preprocessor import GGNNPreprocessor
> from rdkit import Chem
> import numpy as np
>
> dataset, dataset_smiles = datasets.get_qm9(GGNNPreprocessor(), return_smiles=True)
>
> for i in range(len(dataset_smiles)):
>     mol = Chem.MolFromSmiles(dataset_smiles[i])
>     Chem.AssignAtomChiralTagsFromStructure(mol)
>     chiral_cc = Chem.FindMolChiralCenters(mol)
>
>     if not len(chiral_cc) == 0:
>         print(chiral_cc)
>
> The output shows no Chiral centers for this dataset. When I use
> `includeUnassigned=True`, code gives a list of tuples, but instead of
> "R/S", I get "?". I was wondering if there is a mistake in my
> implementation. If this is expected, any thoughts on how chirality was
> assigned in the above paper? Thanks.
>
> Sincerely,
> Navid
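
For the one-hot feature described in the quoted question (R, S, or not a chiral center), a minimal sketch of the encoding step might look like the following. It is plain Python and takes the list of (atom index, label) tuples in the shape that Chem.FindMolChiralCenters prints in this thread. The function name one_hot_chirality, the category order, and the choice to treat "?" as "not a chiral center" are assumptions made for illustration; this is not necessarily how the paper's authors did it.

```python
# Category order follows the quoted description: R, S, or not a chiral center.
CATEGORIES = ("R", "S", "none")


def one_hot_chirality(num_atoms, chiral_centers):
    """Return one row per atom, one-hot encoded over CATEGORIES.

    chiral_centers is a list of (atom_index, label) tuples, e.g. [(1, 'S')].
    Assumption: any label other than 'R' or 'S' (such as '?') is treated
    as 'none'.
    """
    labels = dict(chiral_centers)
    rows = []
    for idx in range(num_atoms):
        label = labels.get(idx, "none")
        if label not in ("R", "S"):
            label = "none"
        rows.append([1 if label == cat else 0 for cat in CATEGORIES])
    return rows


# Alanine from the thread: 6 heavy atoms, atom 1 is the 'S' center.
features = one_hot_chirality(6, [(1, "S")])
# features[1] is [0, 1, 0]; every other row is [0, 0, 1].
```

With an RDKit molecule in hand, the two arguments would come from mol.GetNumAtoms() and Chem.FindMolChiralCenters(mol). The labels are only meaningful if the molecule was built from input that carries stereo information, such as isomeric SMILES.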