Thread: [Rdkit-discuss] mass replacement of External R-groups with many substituents
Open-Source Cheminformatics and Machine Learning
Brought to you by:
glandrum
From: Bennion, B. <ben...@ll...> - 2017-03-15 23:20:48
|
Hello All, I have looked around the email list and studied the daylight guide as well as the opensmiles website in hopes of solving my problem. External r-groups is a proposed extension but that is all I could find. It is possible that I have made it too complicated though. In discussions with my synthetic chemist we came up with 27 substituents two place around our molecule scaffold. I labeled the scaffolds with R1 in the position that the substitution will occur and attached a dummy label to the substituents. I thought that I could do a simple replacement rxn. However, I have not been successful. The smarts string so far is listed below. AllChem.ReactionFromSmarts('[c;R:1]-[c;R:2]-[c;R:3]-[c;R:4]-[c;R:5]([R1;!R:7])-[c;R:6].[R:8]-[*:9] >> [c;R:1]-[c;R:2]-[c;R:3]-[c;R:4]-[c;R:5]([*:9])-[c;R:6]') Basically just adding a group to positions along a benzene ring Thanks Brian Bennion |
From: Chris E. <cge...@gm...> - 2017-03-16 15:43:11
|
Hi Brian I'm by no means an expert in RDKit with Python, but until someone else comes along, here are a few thoughts. Your reaction SMARTS specifically defines aromatic carbons joined by single bonds which won't match an incoming benzene ring, and it's a bit redundant to specify that aromatic carbons are in rings. It's possibly also neater to specify the ring closure if you want to include the whole ring. Finally, I don't think that you can specify dummy atoms in a reaction SMARTS as you're attempting (it certainly doesn't work for me with R and R1) but you certainly can specify 'placeholder' atoms using real elements. I've found U, V & W handy for this in the past. A small Python script with a modified SMARTS which works for me is - from rdkit import Chem from rdkit.Chem import AllChem rxn = AllChem.ReactionFromSmarts('[c:1]1[c:2][c:3][c:4][c:5]([U:7])[c:6]1.[V:8]-[*:9] >> [c:1]1[c:2][c:3][c:4][c:5](-[*:9])[c:6]1') ps = rxn.RunReactants((Chem.MolFromSmiles('c1ccccc1[U]'),Chem.MolFromSmiles('[V]Cl'))) Chem.MolToSmiles(ps[0][0]) This returns 'Clc1ccccc1'. Unless you have some specific reason to include the complete benzene ring, the reaction SMARTS can be simplified a lot further, to - [c:1][U:2].[V:3]-[*:4] >> [c:1]-[*:4] This will carry out your virtual reaction on any aromatic carbon carrying the appropriate placeholder element. I'd be interested to learn if there is an approved way to mark substitution positions in reactions like this without having to resort to the cheat of using obscure elements. Regards, Chris On 15 March 2017 at 23:20, Bennion, Brian <ben...@ll...> wrote: > Hello All, > > > > I have looked around the email list and studied the daylight guide as well > as the opensmiles website in hopes of solving my problem. External > r-groups is a proposed extension but that is all I could find. > > > > It is possible that I have made it too complicated though. > > > > In discussions with my synthetic chemist we came up with 27 substituents > two place around our molecule scaffold. > > I labeled the scaffolds with R1 in the position that the substitution will > occur and attached a dummy label to the substituents. I thought that I > could do a simple replacement rxn. However, I have not been successful. > > > > The smarts string so far is listed below. > > > > AllChem.ReactionFromSmarts('[c;R:1]-[c;R:2]-[c;R:3]-[c;R:4] > -[c;R:5]([R1;!R:7])-[c;R:6].[R:8]-[*:9] >> [c;R:1]-[c;R:2]-[c;R:3]-[c;R: > 4]-[c;R:5]([*:9])-[c;R:6]') > > > > Basically just adding a group to positions along a benzene ring > > > > Thanks > > Brian Bennion > > > > > > > > ------------------------------------------------------------ > ------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Rdkit-discuss mailing list > Rdk...@li... > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss > > |
From: Stephen R. <s.d...@go...> - 2017-03-16 17:17:05
|
You can match a dummy atom (*) with the SMARTS [#0] Steve On 16 Mar 2017 16:43, "Chris Earnshaw" <cge...@gm...> wrote: > Hi Brian > > I'm by no means an expert in RDKit with Python, but until someone else > comes along, here are a few thoughts. > > Your reaction SMARTS specifically defines aromatic carbons joined by > single bonds which won't match an incoming benzene ring, and it's a bit > redundant to specify that aromatic carbons are in rings. It's possibly also > neater to specify the ring closure if you want to include the whole ring. > Finally, I don't think that you can specify dummy atoms in a reaction > SMARTS as you're attempting (it certainly doesn't work for me with R and > R1) but you certainly can specify 'placeholder' atoms using real elements. > I've found U, V & W handy for this in the past. > > A small Python script with a modified SMARTS which works for me is - > from rdkit import Chem > from rdkit.Chem import AllChem > rxn = AllChem.ReactionFromSmarts('[c:1]1[c:2][c:3][c:4][c:5]([U:7])[c:6]1.[V:8]-[*:9] > >> [c:1]1[c:2][c:3][c:4][c:5](-[*:9])[c:6]1') > ps = rxn.RunReactants((Chem.MolFromSmiles('c1ccccc1[U]'), > Chem.MolFromSmiles('[V]Cl'))) > Chem.MolToSmiles(ps[0][0]) > > This returns 'Clc1ccccc1'. > > Unless you have some specific reason to include the complete benzene ring, > the reaction SMARTS can be simplified a lot further, to - > [c:1][U:2].[V:3]-[*:4] >> [c:1]-[*:4] > This will carry out your virtual reaction on any aromatic carbon carrying > the appropriate placeholder element. > > I'd be interested to learn if there is an approved way to mark > substitution positions in reactions like this without having to resort to > the cheat of using obscure elements. > > Regards, > Chris > > On 15 March 2017 at 23:20, Bennion, Brian <ben...@ll...> wrote: > >> Hello All, >> >> >> >> I have looked around the email list and studied the daylight guide as >> well as the opensmiles website in hopes of solving my problem. External >> r-groups is a proposed extension but that is all I could find. >> >> >> >> It is possible that I have made it too complicated though. >> >> >> >> In discussions with my synthetic chemist we came up with 27 substituents >> two place around our molecule scaffold. >> >> I labeled the scaffolds with R1 in the position that the substitution >> will occur and attached a dummy label to the substituents. I thought that >> I could do a simple replacement rxn. However, I have not been successful. >> >> >> >> The smarts string so far is listed below. >> >> >> >> AllChem.ReactionFromSmarts('[c;R:1]-[c;R:2]-[c;R:3]-[c;R:4]- >> [c;R:5]([R1;!R:7])-[c;R:6].[R:8]-[*:9] >> [c;R:1]-[c;R:2]-[c;R:3]-[c;R:4 >> ]-[c;R:5]([*:9])-[c;R:6]') >> >> >> >> Basically just adding a group to positions along a benzene ring >> >> >> >> Thanks >> >> Brian Bennion >> >> >> >> >> >> >> >> ------------------------------------------------------------ >> ------------------ >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot >> _______________________________________________ >> Rdkit-discuss mailing list >> Rdk...@li... >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >> >> > > ------------------------------------------------------------ > ------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Rdkit-discuss mailing list > Rdk...@li... > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss > > |
From: Greg L. <gre...@gm...> - 2017-03-17 04:26:12
|
Combining Steve's and Chris' answer gets to how I would do it with reactions: In [17]: core = Chem.MolFromSmiles('*c1c(C)cccc1(O)') In [18]: chain = Chem.MolFromSmiles('CN*') In [19]: rxn = AllChem.ReactionFromSmarts('[c:1][#0].[#0][*:2]>>[c:1]-[*:2]') In [20]: ps = rxn.RunReactants((core,chain)) In [21]: Chem.MolToSmiles(ps[0][0],isomericSmiles=True) Out[21]: 'CNc1c(C)cccc1O' If you can generate your sidechains in such a way that the atom you want to attach to the core is the first one (atom zero), there's a somewhat easier way to do this kind of simple replacement: In [29]: chain2 = Chem.MolFromSmiles('NC') In [30]: ps2 = Chem.ReplaceSubstructs(core,Chem.MolFromSmarts('[#0]'),chain2) In [31]: Chem.MolToSmiles(ps2[0],isomericSmiles=True) Out[31]: 'CNc1c(C)cccc1O' The reaction-based approach is, however, far more flexible. We've talked about adding a simple function for doing sidechain-core enumeration but I'm pretty sure it hasn't yet actually been done. -greg On Thu, Mar 16, 2017 at 6:16 PM, Stephen Roughley < s.d...@go...> wrote: > You can match a dummy atom (*) with the SMARTS [#0] > > Steve > > On 16 Mar 2017 16:43, "Chris Earnshaw" <cge...@gm...> wrote: > >> Hi Brian >> >> I'm by no means an expert in RDKit with Python, but until someone else >> comes along, here are a few thoughts. >> >> Your reaction SMARTS specifically defines aromatic carbons joined by >> single bonds which won't match an incoming benzene ring, and it's a bit >> redundant to specify that aromatic carbons are in rings. It's possibly also >> neater to specify the ring closure if you want to include the whole ring. >> Finally, I don't think that you can specify dummy atoms in a reaction >> SMARTS as you're attempting (it certainly doesn't work for me with R and >> R1) but you certainly can specify 'placeholder' atoms using real elements. >> I've found U, V & W handy for this in the past. >> >> A small Python script with a modified SMARTS which works for me is - >> from rdkit import Chem >> from rdkit.Chem import AllChem >> rxn = AllChem.ReactionFromSmarts('[c:1]1[c:2][c:3][c:4][c:5]([U:7])[c:6]1.[V:8]-[*:9] >> >> [c:1]1[c:2][c:3][c:4][c:5](-[*:9])[c:6]1') >> ps = rxn.RunReactants((Chem.MolFromSmiles('c1ccccc1[U]'),Chem. >> MolFromSmiles('[V]Cl'))) >> Chem.MolToSmiles(ps[0][0]) >> >> This returns 'Clc1ccccc1'. >> >> Unless you have some specific reason to include the complete benzene >> ring, the reaction SMARTS can be simplified a lot further, to - >> [c:1][U:2].[V:3]-[*:4] >> [c:1]-[*:4] >> This will carry out your virtual reaction on any aromatic carbon carrying >> the appropriate placeholder element. >> >> I'd be interested to learn if there is an approved way to mark >> substitution positions in reactions like this without having to resort to >> the cheat of using obscure elements. >> >> Regards, >> Chris >> >> On 15 March 2017 at 23:20, Bennion, Brian <ben...@ll...> wrote: >> >>> Hello All, >>> >>> >>> >>> I have looked around the email list and studied the daylight guide as >>> well as the opensmiles website in hopes of solving my problem. External >>> r-groups is a proposed extension but that is all I could find. >>> >>> >>> >>> It is possible that I have made it too complicated though. >>> >>> >>> >>> In discussions with my synthetic chemist we came up with 27 substituents >>> two place around our molecule scaffold. >>> >>> I labeled the scaffolds with R1 in the position that the substitution >>> will occur and attached a dummy label to the substituents. I thought that >>> I could do a simple replacement rxn. However, I have not been successful. >>> >>> >>> >>> The smarts string so far is listed below. >>> >>> >>> >>> AllChem.ReactionFromSmarts('[c;R:1]-[c;R:2]-[c;R:3]-[c;R:4]- >>> [c;R:5]([R1;!R:7])-[c;R:6].[R:8]-[*:9] >> [c;R:1]-[c;R:2]-[c;R:3]-[c;R:4 >>> ]-[c;R:5]([*:9])-[c;R:6]') >>> >>> >>> >>> Basically just adding a group to positions along a benzene ring >>> >>> >>> >>> Thanks >>> >>> Brian Bennion >>> >>> >>> >>> >>> >>> >>> >>> ------------------------------------------------------------ >>> ------------------ >>> Check out the vibrant tech community on one of the world's most >>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot >>> _______________________________________________ >>> Rdkit-discuss mailing list >>> Rdk...@li... >>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >>> >>> >> >> ------------------------------------------------------------ >> ------------------ >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot >> _______________________________________________ >> Rdkit-discuss mailing list >> Rdk...@li... >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >> >> > ------------------------------------------------------------ > ------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Rdkit-discuss mailing list > Rdk...@li... > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss > > |