Re: [Rdkit-discuss] Request for Assistance with MACCS 166 Fingerprint Calculation for 3D QSAR Study
From: Greg L. <gre...@gm...> - 2024-04-23 14:20:24
Hi,

Please do not duplicate questions/posts between the mailing list and github discussions. That's spamming the community.

-greg

On Tue, Apr 23, 2024 at 4:10 PM Ariadna Llop Peiró <ari...@gm...> wrote:
> Hello everyone,
>
> I'm currently working with a dataset of chemical compounds, aiming to
> cluster them into different series to create a 3D-QSAR model. Up to this
> point, I've been using Morgan Fingerprints to generate the descriptors and
> cluster the compounds based on their Tanimoto Similarity:
>
> ```
> # Generate fingerprint descriptor database
> fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2) for m in mols]
>
>
> # Calculate pairwise Tanimoto similarity between fingerprints
> similarity_matrix = []
> for i in range(len(fps)):
>     similarities = []
>     for j in range(len(fps)):
>         similarities.append(DataStructs.TanimotoSimilarity(fps[i], fps[j]))
>
>     similarity_matrix.append(similarities)
> ```
>
>
> With the similarity matrix, I applied hierarchical clustering based on a
> Tanimoto Similarity threshold to group similar compounds:
>
> ```
> # Cluster based on Tanimoto similarity
> dists = 1 - np.array(similarity_matrix)
> hc = hierarchy.linkage(squareform(dists), method='single')
>
> # Specify a distance threshold or number of clusters
> threshold = 0.6  # Adjust this value based on your dendrogram and similarity values
> clusters = hierarchy.fcluster(hc, threshold, criterion='distance')
> ```
>
> However, I'm not satisfied with the results and would like to experiment
> with MACCS Keys to see if they yield better clustering outcomes. Does
> anyone know how to cluster compounds using MACCS fingerprints? Any insights
> on the best approach to calculate similarities and cluster using these
> fingerprints would be highly appreciated.
>
> Thank you in advance for your suggestions!
>
> Ariadna Llop
> _______________________________________________
> Rdkit-discuss mailing list
> Rdk...@li...
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
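
The quoted question was not answered in this message. As a minimal sketch only, not a reply from the list: the workflow above should carry over to MACCS keys by swapping the fingerprint call for `MACCSkeys.GenMACCSKeys` and leaving the Tanimoto and clustering steps unchanged. The helper name `maccs_clusters`, the example SMILES, and the default threshold of 0.6 (copied from the quoted code) are assumptions for illustration, and the sketch expects at least two valid molecules.

```python
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import MACCSkeys
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform


def maccs_clusters(mols, threshold=0.6, method="single"):
    """Cluster molecules by Tanimoto distance between their MACCS keys.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # RDKit returns MACCS keys as a 167-bit vector (bit 0 is unused).
    fps = [MACCSkeys.GenMACCSKeys(m) for m in mols]

    # Fill the symmetric similarity matrix one row at a time.
    n = len(fps)
    sims = np.ones((n, n))
    for i in range(1, n):
        row = DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
        sims[i, :i] = row
        sims[:i, i] = row

    # Convert similarities to distances and cluster as in the Morgan version.
    dists = 1.0 - sims
    np.fill_diagonal(dists, 0.0)
    hc = hierarchy.linkage(squareform(dists, checks=False), method=method)
    return hierarchy.fcluster(hc, threshold, criterion="distance")


# Hypothetical example input: three alcohols and two aromatics.
smiles = ["CCO", "CCCO", "CCCCO", "c1ccccc1", "Cc1ccccc1"]
mols = [Chem.MolFromSmiles(s) for s in smiles]
labels = maccs_clusters(mols)
print(dict(zip(smiles, labels)))
```

MACCS keys are a short, fixed set of substructure flags, so similarity values tend to run higher than with Morgan fingerprints; the threshold will likely need re-tuning against the dendrogram rather than being reused as-is.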