#609 New RingSearch

Accepted
closed
nobody
master
1
2013-02-20
2013-02-03
John May
No

master+/04aa9f957a

This patch adds the faster ring detection from this post and can replace some functionality of the SpanningTree. Check the RingSearch doc for a full explanation and a diagram but it's use can already improve performance. One trick we can now do is to remove absolutely everything we are sure cannot be composed of other cycles. These are cycles which share no bonds with any other cycle (i.e. a benzene ring on the end of an aliphatic chain). The other cycles we need to do more work (via SSSR etc) to work out if they can or cannot be composed but for the trivial cases they can be completely skipped. The SpanningTree already does this (see. getAllRings()) but I could not find a way to identify which rings were fused with changing the code.

Running on ChEBI

ChEBI : 20,000, |V| < 200, averaged over 50 runs

Implementation Mean (ms) Min - Max
SpanningTree+SSSRFinder: 2382.01 ± 658.33 2180.05 - 6255.88
RingSearch+SSSRFinder: 1798.90 ± 41.15 1767.25 - 1974.97
RingSearch/fused+SSSRFinder 1070.92 ± 43.82 1033.07 - 1333.13

Now, I know it is possible to get even faster (easily < 500 ms), the actual search it's self is v. quick on average ~ 15 ms for 20,000 structures but the conversion in and out takes a lot longer (i.e. IAtomContainer is slow). I've already got a relevant cycles which runs on single CPU ~230 ms for ChEBI and ~110 ms when run on multiple cores. Will patch that soon but needs more work.

Verifying the output is correct

This code below demonstrates a rough way to confirm the output is correct. I've run this code on PubChem-Compound CID 0 - CID 2,000,000 and have found no discrepancies but I'm going to run it on all 30,000,000 just be sure.

for (IAtomContainer m : molecules) {
    if (fastSSSR(m) != new SSSRFinder(m).findSSSR().getAtomContainerCount()) {
        System.err.println("SSSR did not match: " + m.getProperty("ChEBI ID"));
    }
}

private static int fastSSSR(IAtomContainer mol) {
    int count = 0;
    RingSearch tester = new RingSearch(mol);
    for (IAtomContainer fragment : tester.fusedRingFragments()) {
        count += new SSSRFinder(fragment).findSSSR()
                                         .getAtomContainerCount();
    }
    return count + tester.isolated().length;
}

Production

CyclicVertexSearch - tool working on a graph representation (no knowledge of a molecule)
Jumbo/RegularCyclicVertexSearch - implementations of CyclicVertexSearch
RingSearch - tool which maps results from the CyclicVertexSearch back to an IAtomContainer

Please review, comments welcome

Test

RingSearchTest - mocked tests confirming RingSearch calls CyclicVertexSearch correctly
RingSearchTest_* - specific cases on some molecules
Regular/JumboCyclicVertexSearchTest - unit tests working just on a graph

Discussion

  • John May
    John May
    2013-02-03

    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -12,7 +12,7 @@
     RingSearch+SSSRFinder:      |  1798.90 ± 41.15   | 1767.25 - 1974.97
     RingSearch/fused+SSSRFinder |  1070.92 ± 43.82   | 1033.07 - 1333.13
    
    -Now, I know it is possible to get even faster (easily < 500 ms), the actual search it's self is v. quick on average ~ 15 ms for 20,000 structures but the conversion in and out takes a lot longer (i.e. IAtomContainer is slow). I've already got a relevant cycles which runs on single CPU ~230 ms for ChEBI and ~110 ms when run on multiple cores.
    +Now, I know it is possible to get even faster (easily < 500 ms), the actual search it's self is v. quick on average ~ 15 ms for 20,000 structures but the conversion in and out takes a lot longer (i.e. IAtomContainer is slow). I've already got a relevant cycles which runs on single CPU ~230 ms for ChEBI and ~110 ms when run on multiple cores. Will patch that soon but needs more work.
    
     ### Verifying the output is correct
    
    @@ -43,6 +43,8 @@
     Jumbo/RegularCyclicVertexSearch - implementations of CyclicVertexSearch
     RingSearch - tool which maps results from the CyclicVertexSearch back to an IAtomContainer
    
    +
    +Please review, comments welcome
     ### Test
    
     RingSearchTest - mocked tests confirming RingSearch calls CyclicVertexSearch correctly
    
     
  • John May
    John May
    2013-02-18

    looks like this has been applied?

     
  • Yes, I have it in my master tree, so I guess I signed it off and merged it in :)

     
    • status: open --> closed
    • milestone: Needs_Review --> Accepted