#645 Java SPI for QSAR Descriptors

Accepted
closed
nobody
qsar (2)
master
1
2013-07-31
2013-06-08
John May
No

The existing QSAR loading mechanism for DescriptorEngine depended on there being physical 'jars' to search for implementations. This of course isn't always the case: after switching to Maven the tests don't run from 'jars' and so can't load the descriptors. The engine now uses the inbuilt Service Provider Interface (SPI). This also eliminates the use of an int parameter to specify which descriptors to load.

new DescriptorEngine(IMolecularDescriptor.class, builder);

// previously
int MOLECULAR = 3;
new DescriptorEngine(MOLECULAR, builder);

The service loader also makes it simpler for users to add custom descriptors. Instead of the engine pulling implementations by searching the packages, the SPI 'pushes' the implementations to the loader.
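
As a rough illustration of this discovery mechanism, the sketch below registers two providers in a META-INF/services file and loads them by interface with java.util.ServiceLoader. The Descriptor interface, the AtomCount and BondCount classes and the temporary directory are hypothetical stand-ins, not CDK code; a real module would ship the service file inside its jar rather than write it at runtime.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical names throughout; none of these are CDK classes.
class SpiDiscoverySketch {

    /** Stand-in for a descriptor interface such as IMolecularDescriptor. */
    public interface Descriptor {
        String name();
    }

    /** Providers must be public and have a public no-argument constructor. */
    public static class AtomCount implements Descriptor {
        public String name() { return "atom-count"; }
    }

    public static class BondCount implements Descriptor {
        public String name() { return "bond-count"; }
    }

    /** Writes META-INF/services/<interface name> with one provider class name per line. */
    static Path writeServiceFile(Class<?> service, Class<?>... providers) throws IOException {
        Path root = Files.createTempDirectory("spi-sketch");
        Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
        List<String> lines = new ArrayList<>();
        for (Class<?> provider : providers) {
            lines.add(provider.getName());
        }
        Files.write(services.resolve(service.getName()), lines, StandardCharsets.UTF_8);
        return root;
    }

    /** Loads every implementation registered for the interface; no int flag is needed. */
    static <T> List<T> load(Class<T> service, ClassLoader loader) {
        List<T> found = new ArrayList<>();
        for (T impl : ServiceLoader.load(service, loader)) {
            found.add(impl);
        }
        return found;
    }

    /** Registers the two providers, then discovers them through the service file only. */
    static List<String> discoverNames() {
        try {
            Path root = writeServiceFile(Descriptor.class, AtomCount.class, BondCount.class);
            try (URLClassLoader loader = new URLClassLoader(
                    new URL[] { root.toUri().toURL() },
                    SpiDiscoverySketch.class.getClassLoader())) {
                List<String> names = new ArrayList<>();
                for (Descriptor d : load(Descriptor.class, loader)) {
                    names.add(d.name());
                }
                return names;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(discoverNames());
    }
}
```

The loader never scans packages or jars itself: it only reads the service file, which is why the same code works whether the classes come from jars or from a plain class directory.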

One issue is that qsarionpot and qsarprotein are higher in the module hierarchy and so cannot be run in the DescriptorEngine test. This was always the case, but previously they were simply silently not picked up. To correct this, they would need their own separate 'org.openscience.cdk.qsar.IMolecularDescriptor' service files. This is currently not possible because the build system can't put two different files of the same name into different jars in the output. Alternatively, perhaps qsarionpot (3 classes) and qsarprotein (2 classes) could be merged back into qsarmolecular etc. Modules are great, but in this case I'm not sure there's a use case that justifies the overhead.

This was part of the module-unification but I realised this could actually be added separately.

branch: feature/descriptors-spi

Discussion

  • John May

    John May - 2013-06-08
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,4 +1,4 @@
    -The existing QSAR loading mechanism for DescriptorEngine depended on the fact that there were physical 'jars' to search for implementations. This of course isn't always the case and switching the maven the test's don't run as 'jars' and can't load the descriptors. The engine now uses the input [Service Provider Interface (SPI)](http://en.wikipedia.org/wiki/Service_provider_interface). This also eliminates the use of int parameter to specify which descriptors to load.
    +The existing QSAR loading mechanism for DescriptorEngine depended on the fact that there were physical 'jars' to search for implementations. This of course isn't always the case and switching to maven the tests fail being unable to find the jars. The engine now uses the inbuilt [Service Provider Interface (SPI)](http://en.wikipedia.org/wiki/Service_provider_interface). This also eliminates the use of int parameter to specify which descriptors to load.
    
     ~~~~~
     new DescriptorEngine(IMolecualrDescriptor.class, builder);
    @@ -8,10 +8,10 @@
     new DescriptorEngine(MOLECULAR, builder);
     ~~~~~
    
    -The service loader also simplifies users adding custom descriptor. Instead of pulling by search the packages the SPI 'pushes' the implementations to the loader.
    +The service loader also simplifies users adding custom descriptor. Instead of pulling by search the packages, the SPI 'pushes' the implementations to the loader.
    
    -One issue was the qsarionpot and qsarprotein are higher in the hierarchy and so can not be run in DescriptorEngine test. This was always the case but was simply silently not picking them up. To correct these would need their own seperate 'org.openscience.cdk.qsar.IMolecularDescriptor' classes. This is not possible because the current build can't put two different files of the same to different jars in the output. Alternatively perhaps qsarionpot (3 classes) and qsarprotein (2 classes) could be merged back to qsarmolecular etc. Modules are great but in this case I'm not sure there's a use case for the overhead.
    +One issue was the qsarionpot and qsarprotein are higher in the hierarchy and so can not be run in DescriptorEngine test. This was always the case but was simply silently not picking them up. To correct this, these would need their own seperate 'org.openscience.cdk.qsar.IMolecularDescriptor' classes. This is currently not possible because the current build system can't put two different files of the same to different jars in the output. Alternatively perhaps qsarionpot (3 classes) and qsarprotein (2 classes) could be merged back to qsarmolecular etc. Modules are great but in this case I'm not sure there's a use case for the overhead.
    
    -This was part of the module-unification but I realised this could actually be added separately. 
    +This was part of the module-unification but I realised this could actually be added separately. Makes that branch a little shorter.
    
     branch: [feature/descriptors-spi](https://github.com/johnmay/cdk/commits/feature/descriptors-spi)
    
     
  • Egon Willighagen

    Rajarshi, can you please have a look at this? I like the idea.

     
  • Rajarshi Guha

    Rajarshi Guha - 2013-06-08

    Nice and clean solution to discovery. I assume that if a user implemented a descriptor and packaged it into a JAR file, they would simply need to add something like META-INF/services/com.my.descriptors to their JAR file and the ServiceLoader would pick it up (?)

    If that's the case, this looks good to go.

     
  • John May

    John May - 2013-06-08

    Yep, that's exactly it. One needs to be careful when assembling uber-jars, as the service files need to be concatenated together. However, any decent jar assembler has a built-in option for this, e.g. ServicesResourceTransformer.
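
A minimal sketch of what concatenating service files amounts to when several jars are folded into one, assuming the usual format of one provider class name per line with # comments. The class and provider names are hypothetical, and this is not the implementation of ServicesResourceTransformer.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only.
class ServiceFileMergeSketch {

    /** Extracts provider class names from one service file, skipping blanks and # comments. */
    static List<String> providers(String fileContent) {
        List<String> names = new ArrayList<>();
        for (String line : fileContent.split("\\R")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();
            if (!line.isEmpty()) {
                names.add(line);
            }
        }
        return names;
    }

    /** Merges same-named service files, keeping first-seen order and dropping duplicates. */
    static String merge(List<String> fileContents) {
        Set<String> merged = new LinkedHashSet<>();
        for (String content : fileContents) {
            merged.addAll(providers(content));
        }
        return String.join("\n", merged) + "\n";
    }
}
```

Without a merge step like this, an assembler that simply keeps one copy of each path would retain a single jar's service file and the providers from the other jars would never be found.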

     
    Last edit: John May 2013-06-08
  • John May

    John May - 2013-06-11

    Okay, I have redone this without the abstract classes - branch /johnmay/cdk/descriptors-spi.

    I tried to generate the service files from the @cdk.sets as suggested, but MakeJavafilesFiles, which reads the sets, currently puts them all into a single directory. I think it's best to leave it as it is; more non-standard changes to the build system would only complicate things later.

    Just have to remember to add in the ionpot/protein files when we have separate source trees.

     
    Last edit: John May 2013-06-11
  • Egon Willighagen

    What I always understood from Taverna is that any jar can have a file like that... have you tried adding a ...IAtomicDescriptor META-INF file to qsarionpot?

    Because each Taverna plugin has a file for a common interface for Taverna to pick it up...

     
  • Egon Willighagen

    • status: open --> closed
    • Group: Needs_Review --> Accepted
     
  • John May

    John May - 2013-08-01

    Yes, that is true, but I couldn't do it with the current build configuration in ant. The cdk *.set files are assembled in a temp directory and all have different names - the service files all have the same name and so it doesn't work :(.
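
One reading of this collision, sketched under the assumption that every module's service file is copied into one flat directory under the interface's name. The provider names and the per-module directory layout are hypothetical and do not reflect the actual ant build.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the CDK build.
class FlatDirectoryCollisionSketch {

    // The service file name is fixed by the interface, so every module uses the same one.
    static final String SERVICE_FILE = "org.openscience.cdk.qsar.IMolecularDescriptor";

    /** Writes one module's provider list into dir, replacing any existing file. */
    static void writeServiceFile(Path dir, String provider) throws IOException {
        Files.createDirectories(dir);
        Files.write(dir.resolve(SERVICE_FILE), Arrays.asList(provider), StandardCharsets.UTF_8);
    }

    /** Both modules write into the same directory: the second file replaces the first. */
    static List<String> flatLayout() {
        try {
            Path flat = Files.createTempDirectory("flat-layout");
            writeServiceFile(flat, "com.example.MolecularProvider");
            writeServiceFile(flat, "com.example.IonpotProvider");
            return Files.readAllLines(flat.resolve(SERVICE_FILE), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Each module gets its own directory, so both service files survive. */
    static List<String> perModuleLayout() {
        try {
            Path root = Files.createTempDirectory("per-module-layout");
            Path molecular = root.resolve("qsarmolecular");
            Path ionpot = root.resolve("qsarionpot");
            writeServiceFile(molecular, "com.example.MolecularProvider");
            writeServiceFile(ionpot, "com.example.IonpotProvider");
            List<String> all = new ArrayList<>();
            all.addAll(Files.readAllLines(molecular.resolve(SERVICE_FILE), StandardCharsets.UTF_8));
            all.addAll(Files.readAllLines(ionpot.resolve(SERVICE_FILE), StandardCharsets.UTF_8));
            return all;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```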

    J

    On 31 Jul 2013, at 20:15, Egon Willighagen egonw@users.sf.net wrote:

    What I always understood from Taverna is that any jar can have a file like that... have you tried adding a ...IAtomicDescriptor META-INF file to qsarionpot?

    Because each Taverna plugin has a file for a common interface for Taverna to pick it up...



     
