Re: [Rdkit-discuss] Removing solvent and ions from dataset
From: Francois B. <ml...@li...> - 2020-06-08 07:09:49
On 06/06/2020 17:33, Max Pinheiro Jr wrote:
> Hi RDkit team,
>
> I am working on a chemically diverse dataset of smiles strings and I
> need to do some preprocessing to clean a bit the data before starting
> the modeling part. So I was looking for some tools or built-in
> functions in RDkit to make such preprocessing by removing, for
> instance, solvent (water) molecules and ions. I found the
> "SaltRemover" module that may solve my problem with removing ions from
> the database, but I could not find an equivalent module for the case
> of solvent molecules. Does anyone know a specific tool in RDkit (or
> any other python program) to make such preprocessing in the smile
> strings? If so, could you please provide just a simple example of how
> to do it? I will be really thankful for any help you may provide.

I have used this program several times:
https://github.com/flatkinson/standardiser

You can try this:
```
pip3 install chemo-standardizer
standardiser -i input.smi -o output_std.smi
```
I believe it uses rdkit under the hood.

Regards,
F.

> Max Pinheiro Jr
> ---------------------------------------------
> Université Aix-Marseille, France
> Institut de Chimie Radicalaire
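
For anyone who prefers to stay inside RDKit, here is a minimal sketch of the same kind of cleanup. It is not what standardiser does internally; it simply splits each SMILES into disconnected fragments and drops those that match a small list of unwanted species. The `UNWANTED` list and the function name `strip_solvent_and_ions` are assumptions made up for illustration, so extend or replace them to suit your data.

```python
from rdkit import Chem

# Illustrative (hypothetical) list of species to drop: water plus a few
# common counter-ions. Extend it for your own dataset.
UNWANTED = ["O", "[Na+]", "[K+]", "[Cl-]", "[Br-]", "Cl"]

# Canonicalise once so that fragment comparison does not depend on how the
# input SMILES happened to be written.
UNWANTED_CANONICAL = {
    Chem.MolToSmiles(Chem.MolFromSmiles(s)) for s in UNWANTED
}


def strip_solvent_and_ions(smiles):
    """Return a canonical SMILES without the unwanted fragments.

    Returns None if the input cannot be parsed or if nothing is left
    after removing the unwanted fragments.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None

    kept = []
    # Each disconnected component of the input becomes its own molecule.
    for frag in Chem.GetMolFrags(mol, asMols=True):
        frag_smiles = Chem.MolToSmiles(frag)
        if frag_smiles not in UNWANTED_CANONICAL:
            kept.append(frag_smiles)

    if not kept:
        return None

    # Re-canonicalise the joined result so the output is stable.
    return Chem.MolToSmiles(Chem.MolFromSmiles(".".join(kept)))


if __name__ == "__main__":
    for smi in ["CC(=O)[O-].[Na+].O", "CCO.O", "O"]:
        print(smi, "->", strip_solvent_and_ions(smi))
```

Note that this only removes fragments that match the list exactly. It does not neutralise charges, so a carboxylate stays a carboxylate after its sodium counter-ion is removed.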