Re: [Rdkit-discuss] Question on chirality
From: Jan H. J. <ja...@bi...> - 2019-09-13 09:49:30
Hi Navid,
I am not familiar with the paper you mention, but I believe that the
problem is caused by non-isomeric input SMILES.
Below is an alanine read in from a molfile, with 2D coordinates and a
wedge bond. It has a chiral center with "S" configuration. When you
output it as non-isomeric SMILES and read it back in, the chiral
information is lost: the SMILES carries no stereo markers, and the new
molecule has no conformation to derive them from:
>>> from rdkit import Chem
>>> mol = Chem.MolFromMolBlock("""
...   BIOCHEMF09131911262D
...
...   7  6  0  0  1  0  0  0  0  0999 V2000
...     0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
...     0.7145    0.4125    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
...     1.4290    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
...     1.4209   -0.8208    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
...     0.7084    1.2417    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
...    -1.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
...     2.4290    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
...   2  3  1  0  0  0  0
...   1  2  1  0  0  0  0
...   3  4  2  0  0  0  0
...   2  5  1  1  0  0  0
...   1  6  1  0  0  0  0
...   3  7  1  0  0  0  0
... M  END
... """)
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
[(1, 'S')]
>>> Chem.MolToSmiles(mol, isomericSmiles=False)
'CC(N)C(=O)O'
>>> mol = Chem.MolFromSmiles("CC(N)C(=O)O")
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
[]
>>>
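You can generate a conformation that produces chiral information by 3D
embedding the molecule. Keep in mind that the embedder has to pick a
configuration for a center whose stereo is unspecified, so the label you
get describes the generated conformer and is not recovered from the
original data.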
>>> from rdkit.Chem import AllChem
>>> AllChem.EmbedMolecule(mol)
0
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
[(1, 'S')]
>>>
Another way is to get isomeric SMILES as input. Then the chiral
information is right there.
>>> Chem.MolToSmiles(mol, isomericSmiles=True)
'C[C@H](N)C(=O)O'
>>> mol = Chem.MolFromSmiles("C[C@H](N)C(=O)O")
>>> Chem.FindMolChiralCenters(mol)
[(1, 'S')]
>>>
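If it helps, here is a rough sketch of turning the FindMolChiralCenters
output into the one-hot feature from your question. The function name
chirality_one_hot, the category order (R, S, none), and the choice to
treat unassigned ('?') centers as "not a chiral center" are my own
assumptions, not something taken from the paper:

```python
from typing import List, Sequence, Tuple

# Category order is an assumption of this sketch: R, S, not a chiral center.
CHIRALITY_CLASSES = ("R", "S", "none")


def chirality_one_hot(
    num_atoms: int, centers: Sequence[Tuple[int, str]]
) -> List[List[int]]:
    """Return one [R, S, none] one-hot row per atom.

    `centers` has the shape returned by Chem.FindMolChiralCenters,
    e.g. [(1, 'S')]. Unassigned centers ('?') are mapped to "none",
    which is an assumption, not the paper's documented behaviour.
    """
    labels = ["none"] * num_atoms
    for atom_idx, cip in centers:
        if cip in ("R", "S"):
            labels[atom_idx] = cip
    return [
        [int(label == cls) for cls in CHIRALITY_CLASSES]
        for label in labels
    ]


# Alanine without explicit hydrogens: 6 heavy atoms, atom 1 is 'S'.
features = chirality_one_hot(6, [(1, "S")])
print(features[1])  # [0, 1, 0]
print(features[0])  # [0, 0, 1]
```

With RDKit you would call it as
chirality_one_hot(mol.GetNumAtoms(), Chem.FindMolChiralCenters(mol)).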
Cheers
-- Jan Holst Jensen
On 2019-09-12 04:44, Navid Shervani-Tabar wrote:
> Hello,
>
> In the paper "Graph Networks as a Universal Machine Learning
> Framework for Molecules and Crystals", the authors introduce chirality
> as an atom feature input to analyze the QM9 dataset. I was trying to
> recreate this atom feature as follows:
>
> > Chirality: (categorical) R, S, or not a Chiral center (one-hot encoded).
>
> The code I used is:
>
> from chainer_chemistry import datasets
> from chainer_chemistry.dataset.preprocessors.ggnn_preprocessor import GGNNPreprocessor
> from rdkit import Chem
> import numpy as np
>
> dataset, dataset_smiles = datasets.get_qm9(GGNNPreprocessor(),
>                                            return_smiles=True)
>
> for i in range(len(dataset_smiles)):
>     mol = Chem.MolFromSmiles(dataset_smiles[i])
>     Chem.AssignAtomChiralTagsFromStructure(mol)
>     chiral_cc = Chem.FindMolChiralCenters(mol)
>
>     if not len(chiral_cc) == 0:
>         print(chiral_cc)
>
> The output shows no chiral centers for this dataset. When I use
> `includeUnassigned=True`, the code gives a list of tuples, but instead
> of "R/S", I get "?". I was wondering if there is a mistake in my
> implementation. If this is expected, any thoughts on how chirality was
> assigned in the above paper? Thanks.
>
> Sincerely,
> Navid