#1303 SMSD VFSTATE - peculiar restriction on query/target size ratio

Milestone: master
Status: open
Owner: nobody
Labels: smsd (4)
Priority: 1
Updated: 2016-08-10
Created: 2013-06-06
Creator: Duece99
Private: No

Hello,

I came across this odd piece of code in org.openscience.cdk.smsd.algorithm.vflib.map.VFState

    /** {@inheritDoc} */
    public boolean isDead() {
        return query.countNodes() > target.getAtomCount();
    }

Why is this the case? It prevents me from, say, performing a SMARTS match where the SMARTS query is larger than the molecule being tested.
Whilst that seems pointless (i.e. there would never be an isomorphic/100% match), I still need to do this to get partial matches for similarity calculations.

I've manually edited this check out of the code on my workstation for now and it works fine, so I don't really see the need for it: it simply prevents anyone from matching a larger query. Strange, because the algorithm itself is otherwise super-fast for graph matching.

Ed.

Discussion

  • Duece99

    Duece99 - 2013-06-06

    This is when I use it with the Isomorphism class.

     
  • John May

    John May - 2013-06-06

    If you're looking at sub-graph isomorphism, your query cannot be larger than the target. If you want a partial match, maximum common subgraph (MCS) is what you need.

    J
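
The size argument above can be sketched in a few lines. This is only an illustration in plain Java, not CDK code: the class `SizePruning`, its method names, and the atom counts in `main` are hypothetical. It shows why a size check is a valid prune for sub-graph isomorphism (every query node needs its own target atom), while for MCS the sizes only give an upper bound on the common part.

```java
public class SizePruning {

    /**
     * Sub-graph isomorphism maps every query node to a distinct target atom,
     * so no mapping can exist once the query has more nodes than the target.
     */
    public static boolean isDeadForSubgraph(int queryNodes, int targetAtoms) {
        return queryNodes > targetAtoms;
    }

    /**
     * For MCS the relative sizes never rule a search out; they only bound
     * how many atoms the common subgraph can contain.
     */
    public static int maxCommonSize(int queryNodes, int targetAtoms) {
        return Math.min(queryNodes, targetAtoms);
    }

    public static void main(String[] args) {
        // Illustrative counts only, not taken from any molecule in this ticket.
        int queryNodes = 50;
        int targetAtoms = 36;
        System.out.println(isDeadForSubgraph(queryNodes, targetAtoms));
        System.out.println(maxCommonSize(queryNodes, targetAtoms));
    }
}
```

Under this reading, a check like `isDead()` in the original post is a sensible prune for a pure substructure search but would reject valid input if the same state class is reused for an MCS search.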

     

    Last edit: John May 2013-06-06
  • Duece99

    Duece99 - 2013-06-06

    Yeah, I'm using the MCS - the VFLib MCS algorithm supplied with CDK. That's where the error is occurring.

     
  • John May

    John May - 2013-06-06

    Got the query/target?

     
  • Duece99

    Duece99 - 2013-06-06

    It's another of these cases where I'm using IQueryAtomContainers rather than IAtomContainers (for the SMARTS-matching subgraph isomorphism aspect of my work). Here's an example:

    target SMILES: O=C(OCC)CN2C5=CC=C(C=C5(C1=NC=NC(=C12)N4CCN(CCC=3C=CC(F)=C(F)C=3)CC4))[N+](=O)[O-]

    query SMARTS: CCOC(=O)CN8C~2C(~NC(CN1CCOCC1)~NC~2N4CCN(CCC~3C~CC(F)~C(F)C~3)CC4)C~5C8(~C(N(=O)O)C=6=7(NC~5(NC)(C=6(N(=O)O))C=7(N(O)=O)))

    So the idea is that I get the maximum common substructure between these & use it to calculate Tanimoto Bond similarity.
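
For reference, a minimal sketch of the similarity step, assuming "Tanimoto Bond similarity" means the usual Tanimoto coefficient computed on bond counts: c / (a + b - c), where a and b are the bond counts of the query and target and c is the number of bonds in the MCS. The class `BondTanimoto` and its method are hypothetical and independent of CDK; the bond counts would come from whatever MCS routine is used.

```java
public class BondTanimoto {

    /**
     * Tanimoto coefficient on bond counts: common / (query + target - common).
     * Returning 0.0 when both graphs have no bonds is a convention chosen
     * for this sketch, not something defined in the ticket.
     */
    public static double tanimoto(int queryBonds, int targetBonds, int commonBonds) {
        if (queryBonds < 0 || targetBonds < 0 || commonBonds < 0) {
            throw new IllegalArgumentException("bond counts must be non-negative");
        }
        if (commonBonds > Math.min(queryBonds, targetBonds)) {
            throw new IllegalArgumentException("common bonds cannot exceed either graph");
        }
        int union = queryBonds + targetBonds - commonBonds;
        return union == 0 ? 0.0 : (double) commonBonds / union;
    }

    public static void main(String[] args) {
        // Illustrative counts only: 8 query bonds, 6 target bonds, 4 in common.
        System.out.println(tanimoto(8, 6, 4));
    }
}
```

Note that the score is still well defined when the query has more bonds than the target, which is exactly the case the size check rejects.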

     

    Last edit: Duece99 2013-06-06
  • Egon Willighagen

    • labels: --> smsd
     
  • Egon Willighagen

    • Group: cdk-1.0.x --> master
     
  • Egon Willighagen

    This will only be fixed in master.

     
  • Egon Willighagen

    I no longer think this SMSD stack will be fixed at all. Please use the latest SMSD code from Asad.