Hi Zheng,

You can't do this with the current API. In most CDK implementations the feature vector is hashed one way and is not reversible. To support this, a fingerprinter would need to hold on to more information than it currently does. Some implementations provide a getRawFingerprint() method, but this doesn't tell you which atom each feature was generated from.

It shouldn't be too difficult to encode paths / layers (i.e. using the AtomSignature) as a custom bit vector. Out of interest, what do you need this functionality for? There may be an alternative.
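
As a rough illustration of the custom bit vector idea, here is a minimal sketch that enumerates the simple paths rooted at each atom and hashes them into one BitSet per atom. It deliberately does not use the CDK: the class AtomPathFingerprints, the method perAtomFingerprints and the symbol/bond array input are all hypothetical, bond orders are ignored, and String.hashCode() is only a stand-in for whatever hashing a real fingerprinter uses.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper, not part of the CDK.
public class AtomPathFingerprints {

    /**
     * Returns one BitSet per atom. Each BitSet encodes every simple path
     * (no atom visited twice) of up to maxBonds bonds that starts at that atom.
     */
    static BitSet[] perAtomFingerprints(String[] symbols, int[][] bonds,
                                        int maxBonds, int size) {
        int n = symbols.length;
        // Build an adjacency list from the bond list.
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] b : bonds) {
            adj.get(b[0]).add(b[1]);
            adj.get(b[1]).add(b[0]);
        }
        BitSet[] result = new BitSet[n];
        for (int root = 0; root < n; root++) {
            result[root] = new BitSet(size);
            walk(root, symbols, adj, new boolean[n], symbols[root],
                 0, maxBonds, size, result[root]);
        }
        return result;
    }

    // Depth-first walk: set the bit for the path so far, then extend it.
    private static void walk(int atom, String[] symbols, List<List<Integer>> adj,
                             boolean[] visited, String path, int depth,
                             int maxBonds, int size, BitSet bits) {
        // Stand-in hash: fold the path string into the range [0, size).
        bits.set(Math.floorMod(path.hashCode(), size));
        if (depth == maxBonds) return;
        visited[atom] = true;
        for (int next : adj.get(atom)) {
            if (!visited[next]) {
                walk(next, symbols, adj, visited, path + "-" + symbols[next],
                     depth + 1, maxBonds, size, bits);
            }
        }
        visited[atom] = false; // backtrack so other branches can reuse this atom
    }

    public static void main(String[] args) {
        // Heavy atoms of ethanol: C0-C1-O2
        String[] symbols = {"C", "C", "O"};
        int[][] bonds = {{0, 1}, {1, 2}};
        BitSet[] fps = perAtomFingerprints(symbols, bonds, 8, 1024);
        for (int i = 0; i < fps.length; i++) {
            System.out.println("atom " + i + " (" + symbols[i] + "): " + fps[i]);
        }
    }
}
```

OR-ing the per-atom bit sets together gives a molecule-level vector, and because each atom keeps its own BitSet you can still tell which atom set which bit. With a real IAtomContainer the same walk would run over its atoms and bonds instead of the arrays used here.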


On 31 Jul 2014, at 22:56, Zheng Shi <zshi3@ualberta.ca> wrote:

The problem is how to interpret the fingerprint. Suppose that for a molecule I get a vector of bits as its fingerprint (assume the default length of 1024 and depth of 8), and some of the bits are on (1). How do I know which bits correspond to which atom in the molecule? Thanks.

On Thu, Jul 31, 2014 at 3:51 PM, John May <johnmay@ebi.ac.uk> wrote:
Hi Zheng,

Some fingerprints might do this internally but I don't think they (all) expose it. Depending on what you need, signatures may be an option. These capture the circular / layer information of atoms (and molecules). The relevant class is AtomSignature.


On 31 Jul 2014, at 22:45, Zheng Shi <zshi3@ualberta.ca> wrote:

Hi, thank you for answering my question.

I have some questions about fingerprints for a molecule. I wonder if there is any API for calculating a fingerprint for every atom in a molecule. In the CDK, the fingerprint API usually takes a molecule as input, so I think it returns a molecule fingerprint, as in this code:
Molecule molecule = new Molecule();
IFingerprinter fingerprinter = new Fingerprinter();
BitSet fingerprint = fingerprinter.getFingerprint(molecule);
But then how do I get an atom fingerprint with CDK? Suppose I want a fingerprint for each atom that includes all the paths (2-bond, 4-bond, and so on) rooted at that specific atom. Are there any APIs that can compute an atom fingerprint for each atom in a molecule? Thanks.