Re: [Rdkit-discuss] 64 bit Morgan Fingerprints
From: Gareth J. <jav...@gm...> - 2021-04-21 23:26:39
Wojtek,

You can use GetNonzeroElements() to convert the sparse fingerprint to a
Python dict that maps each hash to its count.

Cheers,
Gareth

In [7]: mol = Chem.MolFromSmiles('Cn1cnc2n(C)c(=O)n(C)c(=O)c12')

In [8]: fp = AllChem.GetMorganFingerprint(mol, 2)

In [9]: elements = fp.GetNonzeroElements();

In [10]: elements
Out[10]:
{10565946: 2,
 348155210: 1,
 476388586: 1,
 540046244: 1,
 553412256: 1,
 864942730: 2,
 909857231: 1,
 1100037548: 1,
 1333761024: 1,
 1512818157: 1,
 1981181107: 1,
 2030573601: 1,
 2041434490: 1,
 2092489639: 3,
 2246728737: 3,
 2370996728: 1,
 2877515035: 1,
 2971716993: 1,
 2975126068: 2,
 3140581776: 1,
 3217380708: 4,
 3218693969: 1,
 3462333187: 1,
 3657471097: 3,
 3796970912: 1}

On 4/21/2021 5:44 AM, Wojtek Plonka wrote:
> Dear All
>
> Do any of you have a working example of getting Morgan Fingerprints,
> as sparse bit vector (non-hashed) in the 64 bit version using Python?
> I'm looking into the issue of collisions on the "main hash" on large
> (100+ million molecules) data.
>
> Thank you very much!
> Kindest regards,
>
> Wojtek Plonka
> +48885756652
> wojtekplonka.com <http://www.wojtekplonka.com>
> fb.com/wojtek.plonka <https://fb.com/wojtek.plonka>
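
Since the question is about 64 bit IDs and hash collisions, here is a minimal sketch in plain Python (no RDKit needed) that inspects the dict printed in Out[10]. It reports how many bits the largest ID actually needs and gives a rough birthday-style estimate of colliding pairs. The helper names `bits_needed` and `expected_colliding_pairs`, the uniform-hashing assumption, and the example figure of one million distinct environments are illustrative assumptions, not RDKit API and not measurements.

```python
# Dict copied from Out[10] above: Morgan environment ID -> count.
elements = {
    10565946: 2, 348155210: 1, 476388586: 1, 540046244: 1, 553412256: 1,
    864942730: 2, 909857231: 1, 1100037548: 1, 1333761024: 1, 1512818157: 1,
    1981181107: 1, 2030573601: 1, 2041434490: 1, 2092489639: 3, 2246728737: 3,
    2370996728: 1, 2877515035: 1, 2971716993: 1, 2975126068: 2, 3140581776: 1,
    3217380708: 4, 3218693969: 1, 3462333187: 1, 3657471097: 3, 3796970912: 1,
}


def bits_needed(ids):
    """Smallest bit width that can hold the largest ID (hypothetical helper)."""
    return max(ids).bit_length()


def expected_colliding_pairs(n_items, n_bits):
    """Birthday approximation: expected number of pairs that share a hash
    value when n_items distinct inputs are hashed uniformly into n_bits bits.
    Uniform hashing is an assumption made for this estimate."""
    return n_items * (n_items - 1) / 2 / 2 ** n_bits


width = bits_needed(elements)
print(f"{len(elements)} distinct IDs, largest needs {width} bits")

# Assumed example size: one million distinct atom environments in a data set.
n_envs = 1_000_000
pairs32 = expected_colliding_pairs(n_envs, 32)
pairs64 = expected_colliding_pairs(n_envs, 64)
print(f"expected colliding pairs at 32 bits: {pairs32:.1f}")
print(f"expected colliding pairs at 64 bits: {pairs64:.2e}")
```

If the reported width is 32 or less for every molecule in a data set, the IDs coming back from this call fit in 32 bits, so collisions would have to be assessed at that width rather than at 64 bits.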