Re: [Rdkit-discuss] MCS on fused ring systems
Open-Source Cheminformatics and Machine Learning
Brought to you by:
glandrum
|
From: Greg L. <gre...@gm...> - 2016-06-21 13:20:30
|
You are absolutely correct. There's a typo in that function. I'll fix it. -greg On Mon, Jun 20, 2016 at 5:43 PM, Luca Fenu - Network < fen...@ne...> wrote: > Thanks Greg, > > > > that works great. I’ve reconverted the isotopic smarts to smiles with the > function suggested in ( > http://rdkit.org/docs/Cookbook.html#using-custom-mcs-atom-types): > > > > *def* getMCSSmiles(mol,labelledMol,mcs): > > mcsp = Chem.MolFromSmarts(mcs.smartsString) > > match = labelledMol.GetSubstructMatch(mcsp) > > *return* Chem.MolFragmentToSmiles(ms[0],atomsToUse=match, > > isomericSmiles=*True*, > > canonical=*False*) > > > > However, I think there may be a typo, and it should be: > > > > *def* getMCSSmiles(mol,labelledMol,mcs): > > mcsp = Chem.MolFromSmarts(mcs.smartsString) > > match = labelledMol.GetSubstructMatch(mcsp) > > *return* Chem.MolFragmentToSmiles(*mol*,atomsToUse=match, > > isomericSmiles=*True*, > > canonical=*False*) > > > > I’ll see if I can use the custom atom types to get rid of more unwanted > matches, such as those you highlighted in my previous email. > > > > Thanks again, > > > > *Luca Fenu – Computational Chemist* > > Computational Chemist, CADD Group, Chemistry > Eli Lilly and Company Limited > Erl Wood Manor, Sunninghill Road, Windlesham, Surrey, GU20 6PH > 012.764.83354 (office) | 07858.701.646 (mobile) > fen...@ne... <sur...@ne...> | > www.lilly.co.uk > > > > [image: cid:image005.jpg@01D0842C.DF674960] > > > > *From:* Greg Landrum [mailto:gre...@gm...] > *Sent:* 18 June 2016 05:16 > *To:* Luca Fenu - Network > *Cc:* rdk...@li... > *Subject:* Re: [Rdkit-discuss] MCS on fused ring systems > > > > Hi Luca, > > > > On Fri, Jun 17, 2016 at 5:28 PM, Luca Fenu - Network < > fen...@ne...> wrote: > > > > I’m back again with what seems to me like puzzling behaviour from the > rdfMCS.findMCS function when dealing with fused ring systems: > > > > > mol3RA = Chem.MolFromSmiles('c1ccc2cc3ccccc3cc2c1') > > > mol3RB = Chem.MolFromSmiles('c1ccc2c(c1)ccc1ccccc21') > > > rdFMCS.FindMCS([mol3RA, mol3RB], matchValences=True, > > matchChiralTag=True, > > ringMatchesRingOnly=True, > > completeRingsOnly=True # False will return the same result > > ).smartsString > > '[#6]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1' > > Or, in pictures: > > The MCS shown above is the same whether or not I turn on the > ‘completeRingsOnly flag. > > So, the atoms are all there, but the MCS ring can’t be closed because the > attachment point doesn’t match. > > Since we’re requesting completeRingsOnly, wouldn’t make sense for the MCS > be reduced to just a naphthalene? > > What’s the rationale behind the current implementation? Is there a way to > force it to drop the open circle? > > > > That's a bug in the current implementation. I'll take a look at it and see > if I can track it down. > > > > > > And another, related, question: is there a way to make the MCS computation > even more restrictive, > > only returning a ring as part of the MCS *if the whole fused system is > matched*, not just a portion (albeit a complete ring by itself?). In the > example described above, there would be no MCS. > > > > I’d be fine with some post-processing to the same aim. Another example: > > > > > > That's an interesting question. I think you can get at least part of what > you're looking for using the "atom labeling" trick from here: > http://rdkit.org/docs/Cookbook.html#using-custom-mcs-atom-types > > Here's a pass at the first step of that: > > > > In [70]: def label(a): > > return 100*a.GetOwningMol().GetRingInfo().NumAtomRings(a.GetIdx()) + > a.GetAtomicNum() > > ....: > > > > In [71]: bz = Chem.MolFromSmiles('c1ccccc1') > > > > In [72]: napth = Chem.MolFromSmiles('c1ccc2ccccc2c1') > > > > In [73]: rdFMCS.FindMCS([bz,napth]).smartsString > > Out[73]: '[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1' > > > > In [74]: nms = [Chem.Mol(x) for x in (bz,napth)] > > > > In [75]: for nm in nms: > > ....: for at in nm.GetAtoms(): > > ....: at.SetIsotope(label(at)) > > ....: > > > > In [76]: mcs = > rdFMCS.FindMCS(nms,atomCompare=rdFMCS.AtomCompare.CompareIsotopes) > > > > In [77]: mcs.smartsString > > Out[77]: '[106*](:[106*]:[106*]):[106*]' > > > > In [78]: mcs = > rdFMCS.FindMCS(nms,atomCompare=rdFMCS.AtomCompare.CompareIsotopes,completeRingsOnly=True) > > > > In [79]: mcs.smartsString > > Out[79]: '' > > > > Does that help? > > > > -greg > > > > > |