Re: [Rdkit-discuss] elimination of small fragments
From: Andrew D. <da...@da...> - 2018-06-29 10:01:46
On Jun 28, 2018, at 22:08, Paolo Tosco <pao...@gm...> wrote:
> if you wish to keep only the largest disconnected fragment you may try the following:
>
> mols = list(rdmolops.GetMolFrags(mol, asMols = True))
> if (mols):
>     mols.sort(reverse = True, key = lambda m: m.GetNumAtoms())
>     mol = mols[0]

A somewhat simpler ... or at least shorter ... version is:

    mols = rdmolops.GetMolFrags(mol, asMols = True)
    mol = max(mols, default=mol, key=lambda m: m.GetNumAtoms())

The max() function goes through the molecules that GetMolFrags returns. If that collection is empty, it returns the 'default' value, which here is the original molecule. (This is what Paolo's code does. Another option is to use None as the default value.)

Otherwise, since 'key' is specified, its value is used as a function to compute a value for each molecule. That is, for each item 'm' in 'mols', it calls m.GetNumAtoms() and uses the return value to pick the object with the maximum value. In this case, it selects the fragment molecule with the most atoms.

I think I've just added a topic to cover for the upcoming Python/RDKit training session in September! :) For those interested, remember to sign up soon.

Cheers,

Andrew
da...@da...
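P.S. To see the 'default' and 'key' behaviour in isolation, here is a minimal sketch that runs without RDKit. FakeFrag and largest() are hypothetical stand-ins invented for this illustration and are not part of RDKit; the only thing they share with a real fragment molecule is a GetNumAtoms() method, so the max() call is the same one shown above.

```python
class FakeFrag:
    """Hypothetical stand-in for a fragment molecule (not an RDKit class)."""

    def __init__(self, name, num_atoms):
        self.name = name
        self._num_atoms = num_atoms

    def GetNumAtoms(self):
        return self._num_atoms


def largest(frags, fallback):
    # 'key' ranks each fragment by atom count;
    # 'default' is returned only when 'frags' is empty.
    return max(frags, default=fallback, key=lambda m: m.GetNumAtoms())


original = FakeFrag("original", 0)
frags = [FakeFrag("sodium", 1), FakeFrag("acetate", 4), FakeFrag("water", 1)]

print(largest(frags, original).name)  # acetate
print(largest([], original).name)     # original

# On a tie, max() returns the first maximal item it encounters.
tied = [FakeFrag("a", 3), FakeFrag("b", 3)]
print(largest(tied, original).name)   # a
```

The tie case is worth keeping in mind: if two fragments have the same atom count, the one that comes first in the input wins, so a different key (heavy-atom count or molecular weight, say) may be a better fit for some data sets.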