[Rdkit-discuss] ANN: chemfp 5.0
From: Andrew D. <da...@da...> - 2025-09-24 14:04:46
Hi RDKit users,
I've released chemfp 5.0, my Python package for cheminformatics
fingerprint generation, search, and analysis. You can install it on
Linux-based OSes using:
python -m pip install chemfp -i https://chemfp.com/packages/
(Append "--upgrade" if you have already installed it.)
For a description of the changes since 4.2 see
https://chemfp.com/docs/whats_new_in_50.html .
The highlights are:
• Update the FPB format to handle over 1 billion fingerprints.
• New chemfp shardsearch command-line tool which does similarity
search across multiple target files and merges the results.
- Tested with the 977 million structures in GDB-13
• New chemfp simhistogram / chemfp simhist command-line tool and
corresponding chemfp.simhistogram() high-level API function
to create a histogram of similarity scores.
• Initial support for count fingerprints:
- new text-based FPC format based on the FPS format
- rdkit2fpc tool which uses RDKit's sparse fingerprint generators
- fpc2fps tool with various methods to convert sparse count
fingerprints to binary fingerprints
• Fast implementations of the 4860-bit Klekota-Roth fingerprint
for the OpenEye and RDKit toolkits.
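
For readers new to the idea, here is a rough plain-Python sketch of what a histogram of similarity scores means. It is not chemfp's implementation and does not use the chemfp API: the function names are made up for illustration, fingerprints are assumed to be stored as Python integers, and Tanimoto is assumed as the similarity measure.

```python
def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")

def tanimoto(a, b):
    """Tanimoto similarity of two binary fingerprints stored as ints."""
    union = popcount(a | b)
    if union == 0:
        # Convention assumed here: two empty fingerprints score 0.0
        return 0.0
    return popcount(a & b) / union

def similarity_histogram(query, targets, num_bins=10):
    """Count how many targets fall into each equal-width score bin.

    Hypothetical helper, not part of chemfp. Bins cover [0.0, 1.0];
    a score of exactly 1.0 is placed in the last bin.
    """
    counts = [0] * num_bins
    for target in targets:
        score = tanimoto(query, target)
        index = min(int(score * num_bins), num_bins - 1)
        counts[index] += 1
    return counts

query = 0b11110000
targets = [0b11110000, 0b11000000, 0b00001111, 0b11111111]
print(similarity_histogram(query, targets, num_bins=4))  # [1, 0, 2, 1]
```

The real tool works on fingerprint files and is far faster; see the documentation linked above for its actual options.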
Cheers,
Andrew Dalke
da...@da...
--
Have useful but old in-house cheminformatics software in need of refurbishment?
No one left knows how it works or has the time? Perhaps I can help. Contact me.