Re: [Rdkit-discuss] RDKit cannot sanitize metal atom like platinum
From: Greg L. <gre...@gm...> - 2019-07-27 04:57:27
Hi Hongbin,

The big problem here is that SMILES was not designed to support organometallics: it doesn't have a dative bond type. The RDKit uses a SMILES extension to handle these bonds (https://www.rdkit.org/docs/RDKit_Book.html#dative-bonds), but since most toolkits don't support that extension and you don't find it in any public databases, this doesn't help you.

In the case of molecules like oxaliplatin, it's pretty easy to recognize the single bonds that should be dative and "fix" them. Here's an example showing how to do that:
https://gist.github.com/greglandrum/6cd7aadcdedb1ebcafa9537e8a47e3a4

I hope this helps,
-greg

On Fri, Jul 26, 2019 at 5:50 PM Hongbin Yang <yan...@16...> wrote:
> Dear all,
>
> I encountered a problem when reading the molecule "Oxaliplatin".
>
> In DrugBank, the SMILES of Oxaliplatin is
> `[H][N]1([H])[C@@H]2CCCC[C@H]2[N]([H])([H])[Pt]11OC(=O)C(=O)O1`.
> If you use Chem.MolFromSmiles without disabling sanitisation, it will return
> an error: "Explicit valence for atom # 0 N, 4, is greater than permitted".
> In this molecule, the nitrogens have four bonds, one of which is a
> coordination bond. It seems that SMILES cannot represent this bond type.
> My question is how to read it. Sanitisation is necessary to, e.g., calculate
> fingerprints, so skipping sanitisation is not a good idea.
> [image: image.png]
>
> One alternative is to use ionisation. For example, the SMILES of Oxaliplatin
> in PubChem is `C1CC[C@H]([C@@H](C1)[NH-])[NH-].C(=O)(C(=O)[O-])[O-].[Pt+4]`.
> It converts the covalent bonds into ionic bonds, which I think is not good
> enough, but it's OK.
>
> If this is the only solution, my question is how to convert the "incorrect"
> SMILES in DrugBank into the "correct" one in PubChem within RDKit.
>
> PS: I thought this would be a common issue, but I could not find anything
> similar in the GitHub issues or the mailing list history. (Maybe it is
> because we are always discarding organometallic compounds?)
>
> Best regards,
>
> Hongbin Yang 杨弘宾, Ph.D.
> Research: Toxicophore and Chemoinformatics
> Pharmaceutical Science, School of Pharmacy
> East China University of Science and Technology
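
For readers who cannot open the linked gist, here is a minimal sketch of the idea described in the reply: parse the DrugBank SMILES without sanitization, find single bonds between a metal and an N or O that has "too many" bonds, and replace them with dative bonds before sanitizing. This is not the code from the gist. The helper names (`is_transition_metal`, `bond_order_sum`, `set_dative_bonds`), the atomic-number ranges used to recognize metals, and the restriction to N and O donors are all assumptions made for illustration.

```python
from rdkit import Chem


def is_transition_metal(atom):
    # Rough heuristic (an assumption): Ti-Cu, Zr-Ag, Hf-Au by atomic number.
    n = atom.GetAtomicNum()
    return 22 <= n <= 29 or 40 <= n <= 47 or 72 <= n <= 79


def bond_order_sum(atom):
    # Sum of bond orders plus bracket-specified Hs. Computed by hand so that
    # it also works on an unsanitized molecule with no valence cache.
    return (sum(b.GetBondTypeAsDouble() for b in atom.GetBonds())
            + atom.GetNumExplicitHs())


def set_dative_bonds(mol, from_atoms=(7, 8)):
    """Return a copy of mol in which single bonds from over-valent N/O atoms
    to a metal have been replaced by dative bonds (donor -> metal)."""
    pt = Chem.GetPeriodicTable()
    rwmol = Chem.RWMol(mol)

    # Collect the (donor, metal) pairs first, then edit, so that the bond
    # list is not modified while it is being iterated over.
    to_fix = []
    for metal in rwmol.GetAtoms():
        if not is_transition_metal(metal):
            continue
        for nbr in metal.GetNeighbors():
            bond = rwmol.GetBondBetweenAtoms(nbr.GetIdx(), metal.GetIdx())
            if (nbr.GetAtomicNum() in from_atoms
                    and bond.GetBondType() == Chem.BondType.SINGLE
                    and bond_order_sum(nbr)
                    > pt.GetDefaultValence(nbr.GetAtomicNum())):
                to_fix.append((nbr.GetIdx(), metal.GetIdx()))

    for donor_idx, metal_idx in to_fix:
        rwmol.RemoveBond(donor_idx, metal_idx)
        # A dative bond points from its begin atom (donor) to its end atom.
        rwmol.AddBond(donor_idx, metal_idx, Chem.BondType.DATIVE)
    return rwmol.GetMol()


# DrugBank SMILES for oxaliplatin, quoted from the original question.
smi = "[H][N]1([H])[C@@H]2CCCC[C@H]2[N]([H])([H])[Pt]11OC(=O)C(=O)O1"
mol = Chem.MolFromSmiles(smi, sanitize=False)  # skip the valence check
fixed = set_dative_bonds(mol)
Chem.SanitizeMol(fixed)  # should now pass: each N has three ordinary bonds
print(Chem.MolToSmiles(fixed))
```

In this sketch only the two N-Pt bonds are converted. The O-Pt bonds stay single because each oxygen has just two bonds, which matches its default valence. The printed SMILES uses the RDKit dative-bond extension (`->` / `<-`), so other toolkits may not be able to read it.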