Re: [Rdkit-discuss] Substructure search issue with aliphatic/aromatic bonds
From: Paolo T. <pao...@gm...> - 2020-05-19 16:30:56
Hi Theo,

I don't think the RDKit version should make a difference. Did you notice that rdmolops.AdjustQueryProperties() does not modify the molecule in place, but rather returns a modified copy?

    pattern_generic_bonds = Chem.AdjustQueryProperties(pattern, query_params)

That might be the reason. Also, only pattern_generic_bonds will have UNSPECIFIED bonds; the mols will still have SINGLE and DOUBLE bonds.

Feel free to contact me off-list if you need help with the above.

Cheers,
p.

On 19/05/2020 17:01, theozh wrote:
> Hi Paolo,
>
> thank you very much for your detailed answer.
> I tried to reproduce your last suggestion (but I don't have Jupyter Notebook).
> However, my bonds are still SINGLE and DOUBLE instead of UNSPECIFIED.
> Does this maybe depend on the RDKit version? I have 2019.03...
>
> Maybe I should update and investigate further.
> Theo.
>
> On 19.05.2020 at 16:44, Paolo Tosco wrote:
>> Hi Theo,
>>
>> the lack of match is due to different aromaticity flags on atoms and bonds in the larger molecule.
>>
>> This gist provides some explanation and a possible solution:
>>
>> https://gist.github.com/ptosco/e410e45278b94e8f047ff224193d7788
>>
>> Cheers,
>> p.
>>
>> On 19/05/2020 14:13, theozh wrote:
>>> Dear RDKit-users,
>>>
>>> I would like to do a very simple substructure search.
>>> The chapter 3.5 "Substructure Searching" in the RDKit Documentation (2019.09.1) is pretty short and doesn't point to a solution. So far, I've learned that you can create your search pattern via Chem.MolFromSmiles() or Chem.MolFromSmarts().
>>>
>>> In the minimal copy-and-paste example below, I want to use the first SMILES in the list as the search pattern. I expect 2 matches but I get either 1 or 0 matches. So, I'm doing something wrong. What am I missing?
>>> Is it about implicit/explicit aromatic and aliphatic bonds or some explicit/implicit hydrogen?
>>> How can I find the first structure in both SMILES?
>>>
>>> Thank you for any hints,
>>> Theo.
>>>
>>> ### simple substructure search (but doesn't find what is expected)
>>> from rdkit import Chem
>>>
>>> smiles_strings = '''
>>> C12=CC=CN1NCCC2
>>> C12=CC=CC(C=C3)=C1N3NCC2
>>> '''
>>> smiles_list = smiles_strings.splitlines()[1:]
>>> print(smiles_list)
>>>
>>> pattern = Chem.MolFromSmiles(smiles_list[0])     # MolFromSmiles
>>> matches = [x for x in smiles_list if Chem.MolFromSmiles(x).HasSubstructMatch(pattern)]
>>> print(len(matches))     # result: 1, why not 2?
>>>
>>> pattern = Chem.MolFromSmarts(smiles_list[0])     # MolFromSmarts
>>> matches = [x for x in smiles_list if Chem.MolFromSmiles(x).HasSubstructMatch(pattern)]
>>> print(len(matches))     # result: 0, why not 2?
>>> ### end of code
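
For readers following the thread, the copy-versus-in-place point from the reply at the top can be sketched roughly as below. This is not the content of the linked gist: the particular parameter settings (makeBondsGeneric switched on, adjustDegree switched off) are my own assumption about what is needed for this pair of SMILES, and the variable names are only illustrative.

```python
from rdkit import Chem

# The two SMILES from the original question.
smiles_list = ["C12=CC=CN1NCCC2", "C12=CC=CC(C=C3)=C1N3NCC2"]
pattern = Chem.MolFromSmiles(smiles_list[0])

# Assumed settings (not copied from the gist): make every bond in the
# query generic, and do not add degree constraints to ring atoms.
query_params = Chem.AdjustQueryParameters()
query_params.makeBondsGeneric = True
query_params.adjustDegree = False

# AdjustQueryProperties returns a NEW molecule; `pattern` is left untouched,
# so the return value has to be captured and used for the search.
pattern_generic_bonds = Chem.AdjustQueryProperties(pattern, query_params)

# Compare bond types on the original and on the adjusted copy.
original_bond_types = {b.GetBondType() for b in pattern.GetBonds()}
generic_bond_types = {b.GetBondType() for b in pattern_generic_bonds.GetBonds()}
print("original:", sorted(str(t) for t in original_bond_types))
print("adjusted:", sorted(str(t) for t in generic_bond_types))

# Search with the adjusted copy rather than with the original pattern.
matches = [
    smi for smi in smiles_list
    if Chem.MolFromSmiles(smi).HasSubstructMatch(pattern_generic_bonds)
]
print(len(matches))
```

Only the adjusted copy should report UNSPECIFIED bonds, which is consistent with what Paolo describes. With these assumed settings I would expect both SMILES to match, but generic bonds make the query considerably looser, so it is worth checking the hits on real data.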