Re: [Rdkit-discuss] rejoining pairs of fragments after fragmenting a molecule
From: Andrew D. <da...@da...> - 2021-04-02 15:03:51
Hi Ling,

> On Apr 2, 2021, at 16:23, Ling Chan <lin...@gm...> wrote:
>
> Thank you Francois, I took a look at your code and borrowed parts of it to rejoin two molecules. It seems like my problem is solved. I eventually arrived at something like example 4 in
> https://www.programcreek.com/python/example/123334/rdkit.Chem.CombineMols
> (which I discovered a bit late).
>
> Still, I am not sure if the code is safe. In particular, I wonder if the following conditions are always valid.
> • Chem.CombineMols simply concatenates the atomic indices from the input molecules.
> • The Chem.EditableMol constructor preserves atom ordering from the input.
> • RemoveAtom in EditableMol results in all indices above the deleted to decrease by one, i.e. atom ordering is preserved.

I've found that it's very hard to work with molecular graphs and preserve stereochemistry.

Consider F/C=C/Cl, breaking on the first bond, and the code I pointed you to. FragmentOnBonds() using '9' as the labels gives:

  [9*]/C=C/Cl.[9*]F

My "smiles_weld" code converts that to:

  C\%99=C/Cl.F%99

which can be re-canonicalized to the original:

  F/C=C/Cl

Or take F[C@H](Cl)Br, again breaking on the first bond. FragmentOnBonds() gives:

  [9*]F.[9*][C@H](Cl)Br

smiles_weld converts that to:

  F%99.[C@@H]%99(Cl)Br

which is re-canonicalized as:

  F[C@H](Cl)Br

Handling this correctly in the molecule API requires paying careful attention to the bond direction and to the bond attachment order around the atom, both of which change with RemoveAtom() calls.

I didn't see stereochemistry support in Francois's "bind_molecules()", nor in the connect_mols() at https://github.com/molecularsets/moses/blob/master/moses/baselines/combinatorial.py (one of the examples from the programcreek.com link you gave).

If you don't need to support or preserve stereochemistry, then of course there's no problem.

Cheers,

Andrew
da...@da...
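The two points in this message can be spot-checked with a short script. What follows is only a minimal sketch, assuming RDKit is installed. The helper names (canonical, stereo_round_trips, index_bookkeeping) are made up for illustration and are not part of RDKit or of smiles_weld. The first check confirms that the welded SMILES quoted above canonicalize to the same string as the originals. The second looks at atom ordering after CombineMols and RemoveAtom on one small example, which shows the behavior of the installed RDKit version but is not a guarantee for every input.

```python
from rdkit import Chem


def canonical(smiles):
    """Return RDKit's canonical isomeric SMILES, or raise if parsing fails."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse {smiles!r}")
    return Chem.MolToSmiles(mol)


# (original SMILES, welded SMILES) pairs taken from the message above.
WELDED_EXAMPLES = [
    ("F/C=C/Cl", r"C\%99=C/Cl.F%99"),
    ("F[C@H](Cl)Br", "F%99.[C@@H]%99(Cl)Br"),
]


def stereo_round_trips():
    """True if each welded SMILES canonicalizes identically to its original."""
    return all(canonical(orig) == canonical(weld)
               for orig, weld in WELDED_EXAMPLES)


def symbols(mol):
    """Element symbols in atom-index order."""
    return [atom.GetSymbol() for atom in mol.GetAtoms()]


def index_bookkeeping():
    """Combine two small molecules, delete atom 1, report both orderings."""
    left = Chem.MolFromSmiles("CNO")    # atoms 0..2: C, N, O
    right = Chem.MolFromSmiles("SF")    # atoms 0..1: S, F
    combined = Chem.CombineMols(left, right)
    editable = Chem.EditableMol(combined)
    editable.RemoveAtom(1)              # delete the N atom
    # GetMol() does not sanitize, so the now-disconnected atoms are fine here.
    return symbols(combined), symbols(editable.GetMol())


if __name__ == "__main__":
    print("stereo preserved by welded SMILES:", stereo_round_trips())
    before, after = index_bookkeeping()
    print("after CombineMols:", before)
    print("after RemoveAtom(1):", after)
```

Using distinct elements for every atom makes any reordering visible directly in the printed symbol lists. Note that the index check says nothing about stereochemistry: bond directions and the neighbor order around a chiral atom still need separate handling when atoms are removed through the molecule API.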