#281 Make checkers reconfigurable

release_3.4
closed
nobody
5
2012-10-10
2004-02-03
No

My Eclipse plugin uses checkstyle to check source code
every time an editor is saved or a project is build. To
increase performance it is desirable to reuse already
configured checkers because the set of checks is
(relatively) constant.
In order to be able to check different projects with the
same checker it is necessary to set a different
classloader for the checked project.
For now setting the classloader is only working if done
BEFORE configuring the checker.
Setting a classloader after configuration of the checker
does not work because the classloader does not make it
into the context of the checkers childs.

In short: it is needed to set context values after a
checker is configured.

Discussion

  • Lars Kühne
    Lars Kühne
    2004-02-04

    Logged In: YES
    user_id=401384

    Changed summary to better reflect the request (I hope).

    I wonder if the performance gain would be spectacular, the
    thing that is a bit expensive to build is typically the
    configuration (requires XML parsing), not the Checker.

    Once you have the configuration object, building the checker
    costs virtually no time, at least it should be orders of
    magnitudes faster than the actual checking (parsing java
    files and performing the check logic). I think that caching
    the configuration object is possible with the current API.

    The change you request is pretty complex, e.g. we'd have to
    make sure that checks are not reconfigured while they are
    running. Therefore, could you please demonstrate (e.g. by
    supplying profiling data) that this change would really
    improve performance? When checking a file/project of
    realistic size, using a typical configuration like
    sun_checks.xml, how many percent of the runtime is spent
    creating a checker?

    Out of curiosity: Are you talking about the Eclipse-CS
    plugin (http://eclipse-cs.sourceforge.net/) or are you
    developing your own plugin? The performance of Eclipse-CS
    seems pretty much OK to me...

     
  • Logged In: YES
    user_id=966793

    I am developing my own plugin because the Eclipse-GS plugin
    lacks some features we need. For instance our checkstyle
    users must not modify the configuration or we need
    other/more complex filters to exclude different files from the
    checks.
    If I find the time I will contact the Eclipse-GS plugin folks with
    some proposals.

    OK, I did some simple testing (no serious profiling) and
    instrumented the creation/configuration of the checkers some
    simple time measurements.

    /*
    * Creates a new checker and configures it with the given
    configuration file
    * @param config location of the configuration file
    * @param propResolver a property resolverr null
    * @return the newly created Checker
    * @throws CheckstyleException an exception during the
    creation of the checker occured

    /
    private static Checker createCheckerInternal(URL config,
    PropertyResolver propResolver) throws CheckstyleException {

    long time1 = System.currentTimeMillis();
    
    //load configuration
    Configuration configuration = ConfigurationLoader.
    

    loadConfiguration(config.toString(), propResolver);

    long time2 = System.currentTimeMillis();
    
    System.out.println("Loading config: " + (time2 - time1));
    
    time1 = System.currentTimeMillis();
    
    //create and configure checker
    Checker checker = new Checker();
    checker.configure(configuration);
    
    time2 = System.currentTimeMillis();
    
    System.out.println("Creating checker: " + (time2 - time1));
    
    return checker;
    

    }

    I used my own config files which are split up in thematic
    groups to increase maintainability:
    The result for the different configs was:

    Loading config: 47
    Creating checker: 578
    Loading config: 47
    Creating checker: 281
    Loading config: 47
    Creating checker: 47
    Loading config: 15
    Creating checker: 141
    Loading config: 63
    Creating checker: 187
    Loading config: 31
    Creating checker: 313
    Loading config: 16
    Creating checker: 109
    Loading config: 31
    Creating checker: 187
    Loading config: 31
    Creating checker: 719
    Loading config: 31
    Creating checker: 203
    Loading config: 0
    Creating checker: 47
    Loading config: 0
    Creating checker: 63
    Loading config: 31
    Creating checker: 141
    Loading config: 31
    Creating checker: 281

    For sun_checks.xml the result was:
    Loading config: 203
    Creating checker: 2375

    (My System: AMD Athlon 2600+, 1 Gig RAM)

    As you can roughly see creating the configurations is not that
    expensive but configuring the checker actually is.

    You are right by saying that creating and configuring a
    checker is irrelevant if a complete project is checked with this
    checker.
    In the eclipse plugin the situation is a bit different, because
    the checking process is automatically invoked for a single file if
    the changes to this file are saved (eclipse autobuild feature).
    In this case creating a new checker would be extremely
    uneffective, because such an autobuild has to be fast. So a
    1+ sec. delay for the checkstyle run on each save would be
    quite painful for a user thus lowering acceptance for the
    checkstyle plugin.

    Please tell me if you need more measurements or other data.

    Cheers,
    Lars

     
  • Logged In: YES
    user_id=966793

    Hi,

    I gathered some additional data to show how beneficial it
    would be if reusing of checkers would be (more) supported.
    As test base I used my plugin project which contains 39 files.
    As config file I used sun_checks.xml.
    I tested full project builds with and without caching of
    checkers (10 times each).
    Additionally I tested single file checkstyle runs, which are done
    in case of an eclipse autobuild. Again 10 times each with and
    without caching.

    While the improvement for a full build seems ok too me, the
    benefits of the checker caching for single file runs are
    astonishing. Apart from the numbers eclipse feels ways more
    responsive to me if the checkers are cached while without
    chaching there is always that 1+ sec. delay on saving of a file.

    I hope you find the gathered data useful.

    Cheers,
    Lars

     
  • Performance data to show impact of checker reusage

     
  • Logged In: YES
    user_id=966793

    I forgot, please see the attached file for the data

     
  • Lars Kühne
    Lars Kühne
    2004-02-08

    Logged In: YES
    user_id=401384

    I can confirm that configuring the checker is much more
    expensive that parsing the XML file. On my P4 3GHz the
    numbers are much lower than what rookie_no2 reported, but I
    get the same relative times:

    load configuration = 51
    create and configure checker: = 596

    (is my machine really that much faster than an Athlon 2600?)

    I'm surprised. I really expected that accessing files and
    parsing XML would really be much slower that instantiating a
    few objects and calling some setters.

    I have to admit that I'm a bit confused: Is it sufficient
    for you to create the checker (or several of them - one for
    each project) and keep them in memory for reuse OR do you
    really need to reconfigure an existing checker? This is a
    really big difference. The former is possible today (as you
    demonstrate) and I think the latter won't happen because it
    would break existing check implementations.

    Anyway, we need to profile what's going on in the
    configuration phase... If we can speed up the configuration
    phase, the whole thing might become a non-issue.
    Unfortunately I don't have the time to do that right now (job).

     
  • Logged In: YES
    user_id=966793

    I have several fixed configuration files which are all used
    together to check different projects.
    So once I created the checkers I can reuse them to check
    different projects.
    The only thing that hinders me is that I cannot set a
    classloader for the actual project to check after I created and
    configured the checkers.
    I think that popagation the classloader change through the
    object hierarchy should be somewhat easier to implement.
    Actually I worked myself on a patch a bit, if this works out I
    will post this here.

    As for the Athlon 2600: It's the developer machine on my job
    which is really badly configured and running several
    background services. So sometimes I have the fealing that my
    Athlon 900 (512 MB SD-RAM) at home outperformes this crook.
    But I am happy that the relative times are matching...

     
  • Logged In: YES
    user_id=966793

    Ok, I patched some files to distibute a classloader change
    through the object hierarchy. It is kind of dirty because I
    found no generalized way.
    I introduced a new method 'reContextualizeChilds' on Checker
    and TreeWalker which is called within 'setClassLoader'.
    Additionally I had to provide a hook within 'Check' to notify a
    subclass of a changed classloader (handleClassLoaderChanged)
    . This had to be done because AbstractTypeAwareCheck
    wraps the ClassLoader with a ClassResolver and there was no
    other way to invalidate the ClassResolver.
    All in all it works for my needs but as I said it's not too nice.
    I think this is because in the current api the static aspects
    (configuration) and the dynamic aspect (contextualisation) are
    too much mixed...
    I added the patched files to this request.

     
  •  
    Attachments
  • Logged In: YES
    user_id=746148

    I think that we can/should resolve this problem using
    reconfigurable class loader (not Checker). It will be much
    easier.

     
  • Logged In: YES
    user_id=966793

    Ok, I followed your great idea and it works.
    That resolves this issue for me.
    With your agreement I will close this RFE.

    Thank you all very much!