[Rdkit-discuss] Is it possible to retrieve the substructure atom indices from a postgresql query?
From: Jose M. G. <jos...@gm...> - 2018-12-10 12:40:47
Hi all,

I am using the RDKit PostgreSQL cartridge to perform substructure searches on a large number of molecules, as described here: https://www.rdkit.org/docs/Cartridge.html

However, in addition to knowing which rows matched the query, I would also like to know the atom indices of each match. For now I am doing this in two consecutive steps, as shown below. Is there a way to achieve this in a single step from the PostgreSQL query?

Thanks!

Best regards,
Jose Manuel

    from rdkit import Chem

    # cur is an already-open database cursor

    # 0/ initialization
    query_smiles = "c1ccccc1"
    query_mol = Chem.MolFromSmiles(query_smiles)

    # 1/ get substructure matches
    cur.execute(f"select mol_send(m) from rdk.mols where m@>'{query_smiles}' LIMIT 1")
    results = cur.fetchall()
    mols = [Chem.Mol(m[0].tobytes()) for m in results]

    # 2/ get substructure atom indices
    for m in mols:
        print(m.GetSubstructMatches(query_mol))

For instance, this gives me the following substructure atom indices:

    ((5, 6, 7, 8, 16, 21), (9, 10, 11, 12, 14, 15), (22, 23, 24, 25, 33, 34))
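For reference, the two steps above can be wrapped in a single helper so that each set of atom indices stays attached to the row it came from. This is only a sketch of the client-side workaround, not a way of moving the matching into SQL. It assumes a psycopg2-style cursor (`%s` placeholders, `bytea` values returned as `memoryview`) and an `id` key column in `rdk.mols`; the function name `substructure_matches` and the `id` column are illustrative assumptions, not part of the original code.

```python
from rdkit import Chem


def substructure_matches(cur, query_smiles, limit=1):
    """Return {row id: tuple of atom-index tuples} for rows matching query_smiles.

    Assumes `cur` is a psycopg2-style cursor and that rdk.mols has an
    `id` column (hypothetical; adjust to the actual table layout).
    """
    query_mol = Chem.MolFromSmiles(query_smiles)
    if query_mol is None:
        raise ValueError(f"could not parse SMILES: {query_smiles!r}")

    # Step 1: let the cartridge select the matching rows. The values are
    # passed as query parameters instead of being formatted into the SQL.
    cur.execute(
        "select id, mol_send(m) from rdk.mols where m@>%s limit %s",
        (query_smiles, limit),
    )

    # Step 2: rebuild each molecule from its binary form and enumerate
    # the matches on the client side.
    matches = {}
    for row_id, blob in cur.fetchall():
        mol = Chem.Mol(bytes(blob))
        matches[row_id] = mol.GetSubstructMatches(query_mol)
    return matches
```

Calling `substructure_matches(cur, "c1ccccc1")` would then return a dictionary keyed by row id, with the same kind of index tuples as printed above as values.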