#483 Programmatically configure file encoding for source files

Check (274)

File encoding and code conventions are two orthogonal aspects of a source file. Checkstyle currently only allows the file encoding to be configured per check configuration, which violates the principle of "separation of concerns". This potentially prevents config files from being reused across projects.

Consider one project that uses UTF-8 for its sources, another that uses Latin-1, and yet another that uses Big5. Even if all these projects were to enforce the same code conventions, they would need separate Checkstyle configurations just for the sake of using the right charset. Keep in mind that using the default platform encoding is not an option because the file encoding is conceptually specific to a project, not to a developer/machine/OS.

Especially for integration into build tools or IDEs, it would be helpful if Checkstyle offered a programmatic way to configure the file encoding to use for a particular invocation of the checker. The new setting should always override the property from the config file.

Right now, one has to post-process the loaded Configuration object by searching for the TreeWalker module and casting down to DefaultConfiguration in order to set the "charset" property. This just doesn't look like the right way.
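To make the workaround described above concrete, here is a minimal, self-contained sketch of the idea: walk a module tree and force the "charset" property on every TreeWalker node. `ModuleConfig` and `overrideCharset` are hypothetical stand-ins invented for this illustration; they are not Checkstyle's Configuration or DefaultConfiguration classes, and the real code would additionally need the downcast mentioned above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CharsetOverrideSketch {

    /** Hypothetical stand-in for a mutable module configuration node. */
    static final class ModuleConfig {
        final String name;
        final Map<String, String> properties = new LinkedHashMap<>();
        final List<ModuleConfig> children = new ArrayList<>();

        ModuleConfig(String name) {
            this.name = name;
        }
    }

    /**
     * Sets "charset" on every module named "TreeWalker" in the tree,
     * replacing any value that came from the config file.
     * Returns the number of modules that were updated.
     */
    static int overrideCharset(ModuleConfig node, String charset) {
        int updated = 0;
        if ("TreeWalker".equals(node.name)) {
            node.properties.put("charset", charset);
            updated++;
        }
        for (ModuleConfig child : node.children) {
            updated += overrideCharset(child, charset);
        }
        return updated;
    }

    public static void main(String[] args) {
        // Build a tiny tree: Checker -> TreeWalker (charset preset to Latin-1).
        ModuleConfig checker = new ModuleConfig("Checker");
        ModuleConfig treeWalker = new ModuleConfig("TreeWalker");
        treeWalker.properties.put("charset", "ISO-8859-1");
        checker.children.add(treeWalker);

        int updated = overrideCharset(checker, "UTF-8");
        System.out.println(updated + " module(s) updated, charset="
                + treeWalker.properties.get("charset"));
    }
}
```

The sketch shows why this feels like a hack: the caller has to know which module name carries the encoding and must mutate an already loaded configuration.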


  • Lars Kühne

    Lars Kühne - 2008-05-21

    Logged In: YES
    Originator: NO

    Please read the "property expansion" part in http://checkstyle.sourceforge.net/config.html

    This means that instead of hard coding the charset in the config file, you can specify a placeholder that is defined outside the configuration file, i.e. <property name="charset" value="UTF-8"/> becomes <property name="charset" value="${project.file.encoding}"/>.

    Defining property values for such symbolic names is supported at least by the checkstyle Ant task and the eclipse-cs plugin, and probably most other frontends as well.

    In the API, this mechanism is reflected by the PropertyResolver interface. If you write your own frontend (do you?), this is what you need to implement and pass into the ConfigurationLoader.
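The placeholder mechanism described in this comment can be sketched in plain Java as follows. This is only an illustration of the idea, not Checkstyle's implementation: the class `CharsetPlaceholderDemo` and its `expand` method are hypothetical, and the `Function<String, String>` argument merely plays the role that a PropertyResolver would play in a real frontend.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CharsetPlaceholderDemo {

    // Matches placeholders of the form ${name}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /**
     * Replaces each ${name} in the given value with resolver.apply(name).
     * Throws if the resolver returns null for a referenced name.
     */
    static String expand(String value, Function<String, String> resolver) {
        Matcher matcher = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            String resolved = resolver.apply(name);
            if (resolved == null) {
                throw new IllegalArgumentException("Undefined property: " + name);
            }
            // quoteReplacement guards against '$' or '\' in the resolved value.
            matcher.appendReplacement(out, Matcher.quoteReplacement(resolved));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The frontend supplies the project encoding under an agreed name.
        Map<String, String> projectSettings =
                Collections.singletonMap("project.file.encoding", "UTF-8");

        // Attribute value as it would appear in the config file.
        String fromConfig = "${project.file.encoding}";
        System.out.println(expand(fromConfig, projectSettings::get)); // prints UTF-8
    }
}
```

Note that the frontend and the config file have to agree on the property name (`project.file.encoding` here) for this to work.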

  • Benjamin Bentmann

    Logged In: YES
    Originator: YES

    Of course that could work, but I still consider the property expansion approach conceptually wrong: the use/enforcement of a particular file encoding would depend on a properly crafted Checkstyle config. Also, the frontend would need to know which property name is used in the XML file in order to pass the encoding value in. This all adds dependencies where there should be none. I was looking for a solution that would allow enforcing file encoding X with any Checkstyle XML config.

    "If you write your own frontend (do you?)"
    I was working on the Maven Checkstyle Plugin and just committed my previously mentioned hack to configure the file encoding independently of the used Checkstyle config:

  • Roman Ivanov

    Roman Ivanov - 2016-02-27
    • status: open --> closed
    • Group: --> Future
