#230 Stop exposing implementation details from TokenTypes

Status: open
Owner: nobody
Priority: 5
Updated: 2012-10-10
Created: 2003-06-16
Creator: Tim Tyler
Private: No

The names in "TokenTypes" are not as human readable
as they might be.

At the top of that file it says: "Implementation detail".

However, the names are exposed in Checkstyle's
interface - they appear in docs/checkstyle_checks.xml,
where users have to deal with them.

The result is a lot of arcane abbreviations that users
need to know - if they are to be able to read or write
XML config files.

At the moment these names are even propagating out
as far as Eclipse's Checkstyle user interface - where
they appear extremely ugly.

I propose giving them user-visible names which are not
under the constraint of being all caps with underscores.

Names which correspond directly to a unique Java code
element could simply be replaced by that element:

"ASSIGN" => "="
"SEMI" => ";"
"LOR" => "||"
"LCURLY" => "{"
"STAR_ASSIGN" => "*="
"DOT" => "."
"PACKAGE_DEF" => "package"
"CLASS_DEF" => "class"
"RPAREN" => ")"

Identifiers and the like could be rendered like this:

"IDENT" => "Identifier"
"NUM_INT" => "Integer"
"NUM_FLOAT" => "Float"
"NUM_DOUBLE" => "Double"
"STRING_LITERAL" => "String"
"CHAR_LITERAL" => "Character"

Any symbol that would clash with the list syntax used
in the XML file could be spelled out in the same way -
for example, if commas are still wanted as list element
separators, then this would be needed:

"COMMA" => "Comma"

...and most other tokens, which do not correspond
directly to any text in the source file being checked,
could be made more human-readable - e.g.:

"OBJ_BLOCK" => "ObjectBlock"
"SLIST" => "StatementList"
"ELIST" => "ExpressionList"
"EXPR" => "Expression"
"VARIABLE_DEF" => "VariableDefinition"
"EMPTY_STAT" => "EmptyStatement"

Then, use of the old names in the config files could
be deprecated.
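
As a rough illustration of the renaming above, here is a
minimal Java sketch of a lookup table that converts old
names to the proposed readable ones. The class
TokenDisplayNames and its methods are hypothetical and not
part of Checkstyle; only a handful of the mappings listed
above are filled in, and unknown names are passed through
unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Checkstyle. */
final class TokenDisplayNames {
    // Old TokenTypes name -> proposed user-visible name (subset only).
    private static final Map<String, String> READABLE = new LinkedHashMap<>();

    static {
        READABLE.put("ASSIGN", "=");
        READABLE.put("SEMI", ";");
        READABLE.put("LCURLY", "{");
        READABLE.put("IDENT", "Identifier");
        READABLE.put("COMMA", "Comma");
        READABLE.put("SLIST", "StatementList");
    }

    private TokenDisplayNames() {
    }

    /** Returns the readable name, or the input unchanged if no mapping is known. */
    static String toReadable(String oldName) {
        return READABLE.getOrDefault(oldName, oldName);
    }

    /** Converts a comma-separated list of old names, as found in a config file. */
    static String convertList(String tokens) {
        StringBuilder out = new StringBuilder();
        for (String token : tokens.split(",")) {
            if (out.length() > 0) {
                out.append(", ");
            }
            out.append(toReadable(token.trim()));
        }
        return out.toString();
    }
}
```

With this table, TokenDisplayNames.convertList("ASSIGN, SEMI,SLIST")
yields "=, ;, StatementList". Because "COMMA" maps to
"Comma", the converted list can still use commas as
separators.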

To make the lives of the programmers easier, this would
be as good a time as any to make the internal names
match those the users see, at least where possible.
So, assuming the above naming scheme:

TokenTypes.CHAR_LITERAL would become
TokenTypes.CHARACTER,

TokenTypes.PACKAGE_DEF would become
TokenTypes.PACKAGE.

...and so on. Things like "ASSIGN" could remain.

The old names, "CHAR_LITERAL" and "PACKAGE_DEF",
could be added to the list of acceptable configuration
file aliases for as long as they remain deprecated - and
left in TokenTypes as deprecated constants.
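
A minimal sketch of how that could look in Java, assuming
a stand-in class rather than the real TokenTypes: the class
name RenamedTokenTypes, the numeric values and the
tokenId() lookup are all made up for illustration. The old
constants stay as deprecated aliases of the new ones, and
the config-file lookup accepts both spellings.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for TokenTypes; the numeric values are invented. */
final class RenamedTokenTypes {
    static final int CHARACTER = 1;
    static final int PACKAGE = 2;
    static final int ASSIGN = 3; // internal name unchanged, shown to users as "="

    /** @deprecated use {@link #CHARACTER} instead. */
    @Deprecated
    static final int CHAR_LITERAL = CHARACTER;

    /** @deprecated use {@link #PACKAGE} instead. */
    @Deprecated
    static final int PACKAGE_DEF = PACKAGE;

    // Names accepted in configuration files, new and old.
    private static final Map<String, Integer> BY_NAME = new HashMap<>();

    static {
        // Proposed user-visible names.
        BY_NAME.put("Character", CHARACTER);
        BY_NAME.put("package", PACKAGE);
        BY_NAME.put("=", ASSIGN);
        // Deprecated aliases, kept so that old config files still load.
        BY_NAME.put("CHAR_LITERAL", CHARACTER);
        BY_NAME.put("PACKAGE_DEF", PACKAGE);
        BY_NAME.put("ASSIGN", ASSIGN);
    }

    private RenamedTokenTypes() {
    }

    /** Resolves a name from a config file to its token id. */
    static int tokenId(String configName) {
        Integer id = BY_NAME.get(configName);
        if (id == null) {
            throw new IllegalArgumentException("Unknown token name: " + configName);
        }
        return id;
    }
}
```

Since the deprecated constants carry the same values as
their replacements, custom check code compiled against the
old names would keep comparing equal to the new ones.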

No end-user custom check code need break, and old
configuration files will still function properly - and the
changeover to the new names could be as gradual as
is desired.
