The Chemistry Development Kit / Bugs / #1303 SMSD VFSTATE - peculiar restriction on query/target size ratio

#1303 SMSD VFSTATE - peculiar restriction on query/target size ratio

Milestone: master

Status: open

Owner: nobody

Labels: smsd (4)

Priority: 1

Updated: 2016-08-10

Created: 2013-06-06

Creator: Duece99

Private: No

Hello,

I came across this odd piece of code in org.openscience.cdk.smsd.algorithm.vflib.map.VFState:

    /** {@inheritDoc} */
    public boolean isDead() {
        return query.countNodes() > target.getAtomCount();
    }

Why is this the case? It prevents me from, say, performing a SMARTS match where the SMARTS query is larger than the molecule being tested.
Whilst that seems pointless (i.e. there would never be an isomorphic/100% match), I still need to do this to get partial matches for similarity calculations.

I've manually edited this out of the code on my workstation for now and it works fine, so I don't really see the need for this check, as it simply prevents anyone from matching a larger query. Strange, because the algorithm itself is otherwise super-fast for graph matching.

Ed.

Discussion

Duece99 - 2013-06-06

This is when I use it with the Isomorphism class.

John May - 2013-06-06

If you're looking at sub-graph isomorphism, your query cannot be larger than the target. If you want a partial match, maximum common subgraph (MCS) is what you need.
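To make the distinction concrete, here is a minimal sketch (not CDK code) of why a size check is sound for sub-graph isomorphism but too strict for MCS. The class and method names are hypothetical and the counts are made-up numbers. Sub-graph isomorphism needs every query node mapped to a distinct target atom, so a larger query can never match; an MCS can still exist in that case, with at most as many atoms as the smaller graph.

```java
// Hypothetical illustration only; these names are not part of CDK or SMSD.
final class DeadStateSketch {

    // Sub-graph isomorphism requires an injective mapping from query nodes
    // to target atoms, so a query with more nodes can never fully match.
    static boolean isDeadForSubgraph(int queryNodes, int targetAtoms) {
        return queryNodes > targetAtoms;
    }

    // An MCS may map only part of each graph, so the size ratio does not
    // rule it out; the smaller graph merely bounds the MCS atom count.
    static int maxCommonAtoms(int queryNodes, int targetAtoms) {
        return Math.min(queryNodes, targetAtoms);
    }

    public static void main(String[] args) {
        int queryNodes = 40;   // assumed example value
        int targetAtoms = 30;  // assumed example value
        System.out.println("dead for sub-graph search: "
                + isDeadForSubgraph(queryNodes, targetAtoms));
        System.out.println("MCS upper bound (atoms): "
                + maxCommonAtoms(queryNodes, targetAtoms));
    }
}
```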

J

Last edit: John May 2013-06-06

Duece99 - 2013-06-06

Yeah, I'm using the MCS - the VFLib MCS algorithm supplied with CDK. That's where the error is occurring.

John May - 2013-06-06

Got the query/target?

Duece99 - 2013-06-06

It's another of these cases where I'm using IQueryAtomContainers rather than IAtomContainers (for the SMARTS-matching subgraph isomorphism aspect of my work). Here's an example:

target SMILES: O=C(OCC)CN2C5=CC=C(C=C5(C1=NC=NC(=C12)N4CCN(CCC=3C=CC(F)=C(F)C=3)CC4))[N+](=O)[O-]

query SMARTS: CCOC(=O)CN8C~2C(~NC(CN1CCOCC1)~NC~2N4CCN(CCC~3C~CC(F)~C(F)C~3)CC4)C~5C8(~C(N(=O)O)C=6=7(NC~5(NC)(C=6(N(=O)O))C=7(N(O)=O)))

So the idea is that I get the maximum common substructure between these & use it to calculate Tanimoto Bond similarity.
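As a rough sketch of that last step, assuming the usual Tanimoto coefficient applied to bond counts (bonds in the MCS divided by the bonds in the query plus the bonds in the target minus the bonds in the MCS): the class and method names below are hypothetical, not a CDK API, and the counts are made-up numbers.

```java
// Hypothetical illustration only; not taken from CDK or SMSD.
final class TanimotoBondSketch {

    // Tanimoto coefficient on bond counts: c / (a + b - c), where c is the
    // number of bonds in the common substructure and a, b are the bond
    // counts of the query and the target.
    static double bondSimilarity(int commonBonds, int queryBonds, int targetBonds) {
        if (commonBonds < 0 || commonBonds > Math.min(queryBonds, targetBonds)) {
            throw new IllegalArgumentException("common bond count out of range");
        }
        int union = queryBonds + targetBonds - commonBonds;
        // Two bondless graphs have nothing to compare; report 0 by convention here.
        return union == 0 ? 0.0 : (double) commonBonds / union;
    }

    public static void main(String[] args) {
        // Assumed example values: 8 shared bonds, 10 in the query, 14 in the target.
        System.out.println(bondSimilarity(8, 10, 14));
    }
}
```

The score is 1.0 only when the MCS covers every bond of both structures, so a query larger than the target can still score well below 1 while giving a meaningful partial-match value.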

Last edit: Duece99 2013-06-06

Egon Willighagen - 2013-08-02

labels: --> smsd

Egon Willighagen - 2013-08-02

Group: cdk-1.0.x --> master

Egon Willighagen - 2013-08-02

This will only be fixed in master.

Egon Willighagen - 2016-08-10

I no longer think this SMSD stack will be fixed at all. Please use the latest SMSD code from Asad.
