Re: [Rdkit-discuss] Washing molecules in RDKit
From: Greg L. <gre...@gm...> - 2011-05-13 03:11:50
Dear JP,
On Thu, May 12, 2011 at 12:31 PM, JP <jea...@in...> wrote:
>
> I was wondering if there was a way to wash molecules in RDKit?
Anything is possible given sufficient code. ;-)
Jokes aside, as it stands there's not much RDKit functionality for
automatically cleaning up molecules. Details below.
> By washing, the manual in MOE means (and I understand that your definition
> might vary to some degree):
>
> Replace or recalculate the molecule name
You can replace the name easily:
mol.SetProp("_Name","new name")
but there's no capability to automatically generate an IUPAC name or
anything like that.
>
> Disconnect simple metal salts drawn in covalent notation
This would be pretty easy to do using either a reaction or an EditableMol.
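Here is a minimal sketch of the EditableMol route. The SMARTS pattern (an alkali metal drawn with a single bond to O or S) and the function name disconnect_metals are just assumptions for illustration; a real wash would need a more carefully chosen set of metals and partner atoms.

```python
from rdkit import Chem

# Hypothetical pattern: alkali metal drawn with a covalent single bond to O or S.
METAL_BOND = Chem.MolFromSmarts("[Li,Na,K]-[O,S]")


def disconnect_metals(mol):
    """Break metal-heteroatom bonds and move the charges onto the atoms."""
    mol = Chem.Mol(mol)  # work on a copy, leave the input untouched
    matches = mol.GetSubstructMatches(METAL_BOND)
    for metal_idx, het_idx in matches:
        metal = mol.GetAtomWithIdx(metal_idx)
        het = mol.GetAtomWithIdx(het_idx)
        metal.SetFormalCharge(metal.GetFormalCharge() + 1)
        het.SetFormalCharge(het.GetFormalCharge() - 1)
    em = Chem.EditableMol(mol)
    for metal_idx, het_idx in matches:
        em.RemoveBond(metal_idx, het_idx)
    result = em.GetMol()
    Chem.SanitizeMol(result)  # recompute valences after the edit
    return result


covalent = Chem.MolFromSmiles("CC(=O)O[Na]")
ionic = disconnect_metals(covalent)
smiles = Chem.MolToSmiles(ionic)
```

The result here should be sodium acetate written as two charged fragments.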
> Remove minor components (e.g. counterions and solvent molecules), optionally
> storing the removed components in an alternate field
Salt removal, which could also do solvent removal, is in
rdkit.Chem.SaltRemover. The code uses a library (defined by SMARTS) of
things to remove which is read from a text file (by default
$RDBASE/Data/Salts.txt). There's currently no ability to add the
removed pieces to an alternate field, but one could code that.
It's simpler, though less accurate, to split the molecule into
fragments using Chem.GetMolFrags and keep only the largest
fragment.
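Both approaches can be sketched roughly as follows. The helper keep_largest_fragment and the property name "removed_fragments" are made up for this example; storing the removed pieces as a SMILES string in a molecule property is just one possible way to fill an "alternate field".

```python
from rdkit import Chem
from rdkit.Chem.SaltRemover import SaltRemover

mol = Chem.MolFromSmiles("CN(C)C.Cl")

# Approach 1: strip fragments matching the default salt definitions.
remover = SaltRemover()
stripped = remover.StripMol(mol)


# Approach 2: keep the largest fragment, remember the rest.
def keep_largest_fragment(mol, removed_prop="removed_fragments"):
    frags = list(Chem.GetMolFrags(mol, asMols=True))
    # "Largest" here means most atoms in the graph (heavy atoms by default).
    largest_idx = max(range(len(frags)), key=lambda i: frags[i].GetNumAtoms())
    largest = frags[largest_idx]
    removed = [Chem.MolToSmiles(f) for i, f in enumerate(frags) if i != largest_idx]
    # Hypothetical property name; the removed pieces are stored as SMILES.
    largest.SetProp(removed_prop, ".".join(removed))
    return largest


parent = keep_largest_fragment(mol)
parent_smiles = Chem.MolToSmiles(parent)
removed_smiles = parent.GetProp("removed_fragments")
```

Note that the largest-fragment rule knows nothing about chemistry: if the counterion is bigger than the compound of interest, it keeps the wrong piece.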
>
> Rebalance protonation states by deprotonating strong acids and/or
> protonating strong bases
This would be easy to do given a set of SMARTS patterns defining the
strong acids/bases.
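As a rough sketch of what that could look like: the two SMARTS patterns below (sulfonic acid O-H, and a crude "aliphatic amine" definition) are placeholders I made up for illustration, not a vetted acid/base library, and the function name rebalance is hypothetical too.

```python
from rdkit import Chem

# (SMARTS, delta) pairs; the first atom of each pattern is the one edited.
# delta = -1 removes a proton, delta = +1 adds one. Patterns are illustrative only.
RULES = [
    ("[OX2H1][SX4](=O)(=O)", -1),              # sulfonic acid O-H
    ("[NX3;!$(N[C,S]=[O,S,N]);!$(N-a)]", +1),  # crude aliphatic amine
]


def rebalance(mol, rules=RULES):
    mol = Chem.Mol(mol)  # work on a copy
    for smarts, delta in rules:
        patt = Chem.MolFromSmarts(smarts)
        targets = {match[0] for match in mol.GetSubstructMatches(patt)}
        for idx in targets:
            atom = mol.GetAtomWithIdx(idx)
            # Fix the H count explicitly so it does not depend on how
            # the atom was written in the input.
            atom.SetNumExplicitHs(atom.GetTotalNumHs() + delta)
            atom.SetNoImplicit(True)
            atom.SetFormalCharge(atom.GetFormalCharge() + delta)
        mol.UpdatePropertyCache()  # refresh H counts before the next rule
    Chem.SanitizeMol(mol)
    return mol


taurine = Chem.MolFromSmiles("NCCS(=O)(=O)O")
washed = rebalance(taurine)
charges = sorted(a.GetFormalCharge() for a in washed.GetAtoms())
```

For taurine this should give the zwitterion: one protonated nitrogen and one deprotonated sulfonate oxygen.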
> Add explicit hydrogen atoms, or remove safely-deletable explicit hydrogen
> atoms
Adding and removing Hs is easy, but there's currently no concept of
"safely deletable". When you call Chem.RemoveHs it removes just about
every H (it won't remove an H from H2 or leave the molecule empty).
If you can define what you mean by safely deletable, it probably would
be easy to add.
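For reference, the basic round trip looks like this (ethanol is just an arbitrary example molecule):

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CCO")      # Hs are implicit after parsing
with_hs = Chem.AddHs(mol)            # Hs become real atoms in the graph
without_hs = Chem.RemoveHs(with_hs)  # and back to implicit Hs

n_heavy = mol.GetNumAtoms()
n_all = with_hs.GetNumAtoms()
n_back = without_hs.GetNumAtoms()
```

Both calls return new molecules rather than modifying the input.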
>
> Replace the atomic coordinates with an aesthetic 2D depiction layout
Yes: AllChem.Compute2DCoords
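A short usage example, with phenol as an arbitrary test molecule:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles("c1ccccc1O")
AllChem.Compute2DCoords(mol)  # stores a 2D conformer on the molecule

conf = mol.GetConformer()
coords = [conf.GetAtomPosition(i) for i in range(mol.GetNumAtoms())]
is_flat = all(abs(p.z) < 1e-6 for p in coords)
```

By default this replaces any conformers the molecule already had, which is what you want for a wash step.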
>
> Enumerate tautomers and possibly also protonation states
No.
> Filter enumerated structures according to strong acid/base rules
Again, given a library of strong acid/base definitions this would be easy.
Best,
-greg