Hi RDKit Community,
Is there a way to preserve undefined (a.k.a. unspecified) stereochemistry when calling MolFromSmiles?
I'm working with a bunch of molecules, some with stereochemistry defined, some without.
If stereochemistry is undefined in the SMILES, I would like it to stay that way when converted to a Mol, but this doesn't seem to be the case:
> from rdkit import Chem
> mol = Chem.MolFromSmiles('CC(C)(C1=CC(=C(C(=C1)Br)O)Br)C(=CC(C(=O)O)Br)CC(=O)O')
> mol
[image: RDKit's depiction of mol]
One would expect that C=C either to be crossed, as in PubChem's depiction:
https://pubchem.ncbi.nlm.nih.gov/compound/139598257#section=2D-Structure
or that single bond to be squiggly, as in CDK's depiction:
[https://www.simolecule.com/cdkdepict/depict/bow/svg?smi=CC(C)(C1%3DCC(%3DC(C(%3DC1)Br)O)Br)C(%3DCC(C(%3DO)O)Br)CC(%3DO)O&w=80&h=50&abbr=on&hdisp=bridgehead&showtitle=false&zoom=1.6&annotate=none]
But it's not just a matter of depiction: internally, mol seems to be equivalent to its stereo-specific sibling (the E, or Entgegen, form):
CC(C)(C1=CC(=C(C(=C1)Br)O)Br)/C(=C/C(C(=O)O)Br)/CC(=O)O
I've tried sanitize=False, but it doesn't seem to have any effect. I would prefer not to have to call SetStereo(Chem.BondStereo.STEREOANY) manually for every molecule with undefined stereochemistry (I'm not sure how I would even go about that...).
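For reference, the manual route I'd like to avoid might look roughly like the sketch below. It assumes an RDKit version that provides Chem.FindPotentialStereo, and mark_unspecified_double_bonds is just a hypothetical helper name I made up, so please treat it as a guess rather than the recommended way:

```python
from rdkit import Chem


def mark_unspecified_double_bonds(mol):
    """Flag potential-but-unspecified stereo double bonds as STEREOANY.

    Hypothetical helper: modifies mol in place and returns the indices
    of the bonds it changed.
    """
    marked = []
    for info in Chem.FindPotentialStereo(mol):
        # Keep only double bonds that could be E/Z but carry no label.
        if (info.type == Chem.StereoType.Bond_Double
                and info.specified == Chem.StereoSpecified.Unspecified):
            bond = mol.GetBondWithIdx(info.centeredOn)
            bond.SetStereo(Chem.BondStereo.STEREOANY)
            marked.append(bond.GetIdx())
    return marked


mol = Chem.MolFromSmiles('CC(C)(C1=CC(=C(C(=C1)Br)O)Br)C(=CC(C(=O)O)Br)CC(=O)O')
print(mark_unspecified_double_bonds(mol))
```

Even if this works, as far as I understand plain SMILES has no syntax for an explicitly unknown double bond, so I assume the flag would not survive a round trip through MolToSmiles.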
Possibly related to:
https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/C00BE94F-6F6F-466A-83D4-3045C9006026%40gmail.com/#msg34929570
https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/CAHOi4k3revAu-9qhFt0MpUpr0aADQ9d8bV2XT6FurTEKimCQng%40mail.gmail.com/#msg36365128
o = Chem.MolFromSmiles('C/C=C/C')
https://www.rdkit.org/docs/source/rdkit.Chem.EnumerateStereoisomers.html
https://github.com/openforcefield/openforcefield/issues/146
Any help would be much appreciated.
Thanks,
Adelene
Doctoral Researcher
Environmental Cheminformatics
UNIVERSITÉ DU LUXEMBOURG
LUXEMBOURG CENTRE FOR SYSTEMS BIOMEDICINE
6, avenue du Swing, L-4367 Belvaux
T +356 46 66 44 67 18
GitHub: adelenelai