Re: [Rdkit-discuss] question about morgan bits
From: Greg L. <gre...@gm...> - 2021-03-12 05:58:20
Hi Wendong,

The morgan fingerprint algorithm removes redundant atom environments (environments which contain exactly the same atoms/bonds). For example, when looking at valine:

[image: image.png]

The environments with radius 2 which are centered on atoms 5 and 6 are redundant with the environment of radius 1 which is centered on atom 4, so those environments are not reported in the output.

This is described in more detail in the ECFP paper:
Rogers, D.; Hahn, M. “Extended-Connectivity Fingerprints.” *J. Chem. Inf. and Model.* **50**:742-54 (2010).

Best,
-greg

On Fri, Mar 12, 2021 at 4:16 AM Wendong Wang <wen...@sj...> wrote:

> Greetings,
>
> I have a question about morgan fingerprints. The code is pasted at the end of the email, and please see the attached images for the results.
>
> For valine molecule, the radius is set to be 2. The dictionary (atom index, radius) shows all the substructures of all atoms with radius 0 as fingerprints, and all the substructures of all the atoms with radius 1 as fingerprints. But there are only a few substructures with radius 2 as fingerprints. Why so few?
>
> Thanks.
>
> Best wishes,
> Wendong
>
> PS. The code is below:
>
> m1 = Chem.MolFromSmiles('CC(C)[C@@H](C(=O)O)N')
> di1 = {}
> fp1 = AllChem.GetMorganFingerprintAsBitVect(m1, radius = 2, nBits = 2048, bitInfo = di1)
> tu1 = [(m1, x, di1) for x in fp1.GetOnBits()]
> Draw.DrawMorganBits(tu1, molsPerRow = 4, legends=[str(x) for x in fp1.GetOnBits()])
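The redundancy rule described in the reply can be illustrated with a small pure-Python sketch. This is not RDKit's implementation. It assumes that two environments count as redundant when they cover exactly the same set of bonds, and that the first one found (lowest radius, then lowest atom index) is the one kept. The bond list for valine is written out by hand with atoms numbered in SMILES order, and the names `BONDS`, `environment`, and `unique_environments` are made up for this illustration.

```python
# Hand-written bond list for valine, atoms numbered in SMILES order:
# CC(C)[C@@H](C(=O)O)N -> 0:C 1:C 2:C 3:C 4:C 5:O 6:O 7:N
BONDS = [(0, 1), (1, 2), (1, 3), (3, 4), (4, 5), (4, 6), (3, 7)]
N_ATOMS = 8


def environment(center, radius):
    """Return the set of bonds lying within `radius` bonds of `center`."""
    reached = {center}  # atoms reachable with the steps taken so far
    bonds = set()
    for _ in range(radius):
        # Every bond touching an already-reached atom joins the environment.
        new_bonds = {b for b in BONDS if b[0] in reached or b[1] in reached}
        bonds |= new_bonds
        reached |= {atom for b in new_bonds for atom in b}
    return frozenset(bonds)


def unique_environments(max_radius):
    """Split (atom, radius) environments into kept and redundant ones.

    Radius 0 is skipped here: those environments are the atoms
    themselves and contain no bonds to compare.
    """
    seen = {}  # bond set -> (atom, radius) that first produced it
    kept, dropped = [], []
    for radius in range(1, max_radius + 1):
        for atom in range(N_ATOMS):
            env = environment(atom, radius)
            if env in seen:
                dropped.append(((atom, radius), seen[env]))
            else:
                seen[env] = (atom, radius)
                kept.append((atom, radius))
    return kept, dropped


kept, dropped = unique_environments(2)
radius2_kept = [atom for atom, radius in kept if radius == 2]
print("radius-2 environments kept, by center atom:", radius2_kept)
for duplicate, first in dropped:
    print(f"{duplicate} covers the same bonds as {first}")
```

Under these assumptions, all eight radius-1 environments are distinct, while at radius 2 only the environments centered on atoms 1, 3, and 4 are new. The radius-2 environments on atoms 5 and 6 cover the same three bonds as the radius-1 environment on atom 4, which matches the example in the reply. In RDKit itself, the `bitInfo` dictionary from the call in the original question maps each on bit to its (atom index, radius) pairs, so the reported radius-2 entries can be compared against this list.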