[Rdkit-discuss] ANN: chemfp 5.0
From: Andrew D. <da...@da...> - 2025-09-24 14:04:46
Hi RDKit users,

I've released chemfp 5.0, my Python package for cheminformatics fingerprint generation, search, and analysis. You can install it on Linux-based OSes using:

    python -m pip install chemfp -i https://chemfp.com/packages/

(Append "--upgrade" if you have already installed it.)

For a description of the changes since 4.2 see https://chemfp.com/docs/whats_new_in_50.html .

The highlights are:

• Updated the FPB format to handle over 1 billion fingerprints.
• New chemfp shardsearch command-line tool, which does similarity search across multiple target files and merges the results.
  - Tested with the 977 million structures in GDB-13
• New chemfp simhistogram / chemfp simhist command-line tool and corresponding chemfp.simhistogram() high-level API function to create a histogram of similarity scores.
• Initial support for count fingerprints:
  - new text-based FPC format based on the FPS format
  - rdkit2fpc tool which uses RDKit's sparse fingerprint generators
  - fpc2fps tool with various methods to convert sparse count fingerprints to binary fingerprints
• Fast implementations of the 4860-bit Klekota-Roth fingerprint for the OpenEye and RDKit toolkits.

Cheers,

Andrew Dalke
da...@da...

--
Have useful but old in-house cheminformatics software in need of refurbishment? No one left knows how it works or has the time? Perhaps I can help. Contact me.
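
For readers unfamiliar with the idea behind searching multiple target files and merging the results, here is a minimal pure-Python sketch of the general technique. It is not chemfp's implementation and does not use the chemfp API: the function names (tanimoto, search_shard, shard_search), the in-memory "shards", and the use of Python ints as bit fingerprints are all assumptions made for illustration. Each shard is searched independently for its own top-k hits, and the per-shard hit lists are then merged into a single global top-k.

```python
import heapq


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two bit fingerprints stored as Python ints."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / union


def search_shard(query, shard, k):
    """Return the top-k (score, id) hits within one shard, best first.

    `shard` is a hypothetical in-memory stand-in for one target file:
    a list of (id, fingerprint) pairs.
    """
    scored = ((tanimoto(query, fp), fp_id) for fp_id, fp in shard)
    return heapq.nlargest(k, scored)


def shard_search(query, shards, k):
    """Search every shard independently, then merge into a global top-k."""
    per_shard_hits = (search_shard(query, shard, k) for shard in shards)
    all_hits = (hit for hits in per_shard_hits for hit in hits)
    return heapq.nlargest(k, all_hits)


# Two tiny illustrative shards with made-up ids and 4-bit fingerprints.
shards = [
    [("a1", 0b1111), ("a2", 0b0011)],
    [("b1", 0b1110), ("b2", 0b1000)],
]
top_hits = shard_search(0b1111, shards, k=2)
print(top_hits)
```

Because each shard only needs to report its own k best hits, the merge step handles at most k hits per shard no matter how large the individual shards are.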