Re: [Rdkit-discuss] Leaky Memory?
From: Greg L. <gre...@gm...> - 2014-06-11 08:26:55
The RECAP code currently generates a hierarchy tree for the molecule. The
size of this tree scales very non-linearly with the number of fragments.
That molecule has a huge number of fragments.
I don't think the RECAP code will work for you as written.
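A rough way to see how quickly such a tree can grow is a toy model. This is only a sketch and not the Recap implementation: it assumes a linear chain of distinct building blocks joined by `n_bonds` cleavable bonds, where every fragment is expanded again by cutting each of its bonds in turn. The names `tree_nodes` and `distinct_fragments` are made up for this illustration, and the model says nothing about how Recap treats repeated fragments or branched molecules.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def tree_nodes(n_bonds: int) -> int:
    """Count nodes when every fragment is expanded independently (toy model).

    Cutting bond j of a chain with n_bonds cleavable bonds leaves a left
    piece with j bonds and a right piece with n_bonds - 1 - j bonds; each
    piece becomes a child node and is expanded in the same way.
    """
    total = 1  # the node for this fragment itself
    for j in range(n_bonds):
        total += tree_nodes(j) + tree_nodes(n_bonds - 1 - j)
    return total


def distinct_fragments(n_bonds: int) -> int:
    """Count distinct contiguous pieces of the same chain.

    Assumes all building blocks differ, so each contiguous run of blocks
    is a unique fragment.
    """
    n_blocks = n_bonds + 1
    return n_blocks * (n_blocks + 1) // 2


# Compare the two counts for a few chain lengths.
for n in (1, 5, 10, 20, 30):
    print(n, distinct_fragments(n), tree_nodes(n))
```

In this model the number of tree nodes works out to `3**n_bonds`, while the number of distinct fragments grows only quadratically, so for 30 cleavable bonds there are 496 distinct fragments but `3**30` nodes if nothing is shared.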
What are you trying to get out of the analysis? There may be another
approach that will work.
-greg
On Wed, Jun 11, 2014 at 3:25 AM, Nicholas Firth <Nic...@ic...>
wrote:
>
>
> I think I have found part of the problem: I tried it on a single processor
> last night and didn't get past the second molecule. The script hangs on
> this molecule.
>
> >>> from rdkit import Chem
> >>> from rdkit.Chem import Recap
> >>> mol = Chem.MolFromSmiles(
> ...     'CC[C@H](C)[C@H](NC(=O)[C@H](CC(C)C)NC(=O)'
> ...     '[C@@H](NC(=O)[C@@H](N)CCSC)[C@@H](C)O)C(=O)NCC(=O)N[C@@H](C)C(=O)N'
> ...     '[C@@H](C)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)N'
> ...     '[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)N'
> ...     '[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)N'
> ...     '[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)N'
> ...     '[C@@H](CC(C)C)C(=O)NCC(=O)N2CCC[C@H]2C(=O)N3CCC[C@H]3C(=O)NCC(=O)N'
> ...     '[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N')
> >>> hierarch = Recap.RecapDecompose(mol)
> >>> ks = hierarch.GetLeaves().keys()
>
> I imagined it would be slow for this molecule, but 8 hours might be an
> issue rather than a feature!
> Best,
> Nick
>
> *Nicholas C. Firth* | PhD Student | Cancer Therapeutics
> The Institute of Cancer Research | 15 Cotswold Road | Belmont | Sutton |
> Surrey | SM2 5NG
>
> On 10 Jun 2014, at 20:53, Dimitri Maziuk <dm...@bm...> wrote:
>
> > On 06/10/2014 01:48 PM, Nicholas Firth wrote:
> >
> > > I still have plenty of CPUs and memory available though, so this
> > > seems odd. Some of the processes have done nothing and the others seem
> > > to have frozen at different times.
> >
> > Yeah. Parallel processing is often not quite that straightforward.
> >
> > For instance, since you say they're writing to files, how's your disk I/O?
> >
> > --
> > Dimitri Maziuk
> > Programmer/sysadmin
> > BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu