Re: [Rdkit-discuss] Exhaustive Library Enumeration
From: Andy J. <and...@gm...> - 2018-01-17 22:08:36
Hi Christos,

Many thanks for the reply. I hadn't appreciated that the presence of a
single invalid reagent would bring the entire thing crashing down, rather
than issuing a warning/error and moving on to the other molecules in the
set. Good to know, and I'll have to be less lazy in my code ;-)

Best,
Andy

On Wed, Jan 17, 2018 at 1:56 PM, Christos Kannas <chr...@gm...> wrote:

> Hi Andy,
>
> The reason that your code breaks is that the second product of the third
> iteration ('NCCCCN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1') is not a valid
> molecule. Calling
> Chem.MolFromSmiles('NCCCCN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1')
> returns None for it, so you have to filter out the molecules that are
> not valid.
>
> See this Jupyter Notebook
> <https://gist.github.com/CKannas/11bb9bcaa9435dd18a0bb969501219b2>,
> cell 5, the first line in the while loop.
>
> Best,
>
> Christos
>
> Christos Kannas
> Chem[o]informatics Researcher & Software Developer
>
> On 17 January 2018 at 18:16, Andy Jennings <and...@gm...> wrote:
>
>> Hi RDKitters,
>>
>> I have a question and an observation on the topic of library
>> enumeration.
>>
>> First, the question: is there a call within RDKit to trigger the
>> exhaustive reaction of reagents? For example, if I have two reagents -
>> a primary amine and an alkyl chloride - can I tell RDKit to enumerate
>> the reaction as though there were an excess of each reagent? In my
>> case the reaction would continue until the alkylation can no longer
>> occur because there are no more valences available on the amine, and I
>> would end up either tri-alkylated for a neutral product or
>> quat-alkylated for a positively charged product,
>> e.g. CCN + RCl -> CCN(R)(R)R or CC[N+](R)(R)(R)R
>>
>> This brings me to my observation. When I attempt exactly this by
>> repeatedly exposing the product to the reagent, I am able to drive it
>> to exhaustion *in some cases*.
>>
>> For example, in the example above where RCl is benzyl chloride and my
>> SMIRKS is:
>> [#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]
>> I do drive the final product to be exclusively the tri-alkylated
>> amine. Success.
>>
>> However, when I attempt the same thing with an amine with more than
>> one reactive nitrogen (e.g. NCCCCN) I don't get a single product with
>> 6 alkylations, I get two unique products, each with three alkylations.
>> One product has two alkylations on the first nitrogen and one on the
>> second; the other product has three alkylations on the first nitrogen
>> and none on the second. Attempting to drive the reaction once again
>> leads to a 'reaction called with None reactants' ValueError. My
>> dreadful code is below and the output is
>> Reaction 1: ['NCCCCNCc1ccccc1']
>> Reaction 2: ['NCCCCN(Cc1ccccc1)Cc1ccccc1', 'c1ccc(CNCCCCNCc2ccccc2)cc1']
>> Reaction 3: ['c1ccc(CNCCCCN(Cc2ccccc2)Cc2ccccc2)cc1',
>>              'NCCCCN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1']
>> Reaction 4: ValueError
>>
>> Any pointers would be great, as would any pre-existing library
>> enumeration code. The examples I've found shipped with RDKit don't
>> appear to allow me to name the products using a combination of the
>> reagent names (useful for tracking library content).
>>
>> Best,
>> Andy
>>
>> #### Code snippet ####
>>
>> from rdkit import Chem
>> from rdkit.Chem import AllChem
>>
>> amine = Chem.MolFromSmiles('NCCCCN')
>> # Despite the variable name, this is benzyl chloride (an alkyl chloride)
>> acyl = Chem.MolFromSmiles('c1ccccc1CCl')
>> rxn = AllChem.ReactionFromSmarts(
>>     '[#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]')
>>
>> # First reaction
>> reactantListMols = [amine, acyl]
>> prods = AllChem.EnumerateLibraryFromReaction(
>>     rxn, [reactantListMols, reactantListMols])
>> prods = list(prods)
>> smis = list(set([Chem.MolToSmiles(x[0], isomericSmiles=True)
>>                  for x in prods]))
>> print(smis)
>> # ['NCCCCNCc1ccccc1']
>>
>> # Now repeat until doom
>> for i in range(0, 10):
>>     # MolFromSmiles returns None for an invalid SMILES; nothing
>>     # filters those out here, which is what triggers the ValueError
>>     oldproducts = [Chem.MolFromSmiles(x) for x in smis]
>>     reactantListMols = oldproducts + [acyl]
>>     prods = AllChem.EnumerateLibraryFromReaction(
>>         rxn, [reactantListMols, reactantListMols])
>>     prods = list(prods)
>>     smis = list(set([Chem.MolToSmiles(x[0], isomericSmiles=True)
>>                      for x in prods]))
>>     print(smis)
>>
>> #### End Code ####
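
Christos's advice (filter out the products that fail to parse) can be sketched as a loop that keeps reacting until no new valid product appears. This is a minimal sketch under stated assumptions, not code from the thread or from the linked notebook: `exhaust` and `make_rdkit_step` are hypothetical helper names, the RDKit-backed step assumes RDKit is installed, and it uses `RunReactants` instead of `EnumerateLibraryFromReaction` so that each call pairs one substrate with one reagent.

```python
def exhaust(start, react):
    """Apply react() round after round until no new valid product appears.

    react(smiles) returns an iterable of product SMILES. Entries that are
    None (products that failed to parse) are skipped instead of crashing.
    Returns one sorted list of new SMILES per productive round.
    """
    seen = {start}
    frontier = [start]
    rounds = []
    while frontier:
        new = set()
        for smi in frontier:
            for prod in react(smi):
                if prod is not None and prod not in seen:
                    new.add(prod)
        seen |= new
        frontier = sorted(new)
        if frontier:
            rounds.append(frontier)
    return rounds


def make_rdkit_step(smirks, reagent_smiles):
    """Build a react() callable backed by RDKit (hypothetical helper).

    RDKit is imported lazily here so that exhaust() itself has no RDKit
    dependency.
    """
    from rdkit import Chem
    from rdkit.Chem import AllChem

    rxn = AllChem.ReactionFromSmarts(smirks)
    reagent = Chem.MolFromSmiles(reagent_smiles)

    def react(smiles):
        substrate = Chem.MolFromSmiles(smiles)
        if substrate is None:
            return []
        products = []
        for product_set in rxn.RunReactants((substrate, reagent)):
            # Round-trip through SMILES: MolFromSmiles returns None when
            # the product is not a valid molecule (e.g. a neutral N with
            # four bonds), and exhaust() then skips that entry.
            raw = Chem.MolToSmiles(product_set[0])
            mol = Chem.MolFromSmiles(raw)
            products.append(None if mol is None else Chem.MolToSmiles(mol))
        return products

    return react
```

With RDKit available, the example from the thread would be run as `exhaust('NCCCCN', make_rdkit_step('[#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]', 'ClCc1ccccc1'))`. Because this SMIRKS produces only neutral products and a neutral nitrogen carries at most three substituents, one would expect the loop to stop once both nitrogens of NCCCCN carry two benzyl groups. Reaching quaternary ammonium products would need a different reaction definition.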