Reusage and standalone usage of DBC parser

Peter Vranken

Reusage of code, standalone use of DBC parser and compatibility

Reusing the parser implementation in other Java applications

comFramework's code generator includes a parser for CAN network
databases, i.e. files in the DBC format, which typically carry the file
name extension .dbc. Open source implementations of DBC file parsers are
scarce (you might have a look at
https://sourceforge.net/projects/cantools for an implementation in C), so
people may be interested in reusing the DBC parser of comFramework.

The parser is implemented in Java 18 (byte code version 62) using the
parser generator antlr4 (http://www.antlr.org/). It is available as the
Java package codeGenerator.dbcParser, which can in principle be integrated
into any other Java application. The application interface of the parser
essentially consists of returning the parse tree. Although this is a
powerful operation that requires more than 5000 lines of code, it doesn't
add much value to your application because by far most of this code is
simply generated from the antlr4 grammar file.

The only additional reusable piece of code is the class
SemanticCheckListener, which performs a (simple) semantic validation of
the parse result. The antlr4 grammar does not check all static elements
of the syntax definition. Forcing the grammar to do all of this would
sometimes be too cumbersome. To give a single example: For obscure
reasons, the name of an attribute is defined to be an identifier enclosed
in double quotes, a syntax construct that conflicts with the ordinary
character string. The grammar accepts any character string enclosed in
double quotes as an attribute name, and only SemanticCheckListener
validates that the characters indeed form an identifier. A number of
similar checks can be found in SemanticCheckListener, too.
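
The attribute name check described above can be illustrated with a short
Java sketch. It is not the code of SemanticCheckListener; the class
AttributeNameCheck and its methods are hypothetical, and it assumes that
an identifier follows the C rule (a letter or underscore, followed by
letters, digits or underscores):

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of comFramework. */
final class AttributeNameCheck {
    // Assumption: identifiers follow the C rule for identifiers.
    private static final Pattern IDENTIFIER =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    /** Returns the text between the enclosing double quotes, or null if
     *  the token is not a double-quoted string. */
    static String unquote(String token) {
        if (token == null || token.length() < 2
                || !token.startsWith("\"") || !token.endsWith("\"")) {
            return null;
        }
        return token.substring(1, token.length() - 1);
    }

    /** True if the quoted token, as accepted by the grammar, contains a
     *  valid identifier. */
    static boolean isValidAttributeName(String quotedToken) {
        String name = unquote(quotedToken);
        return name != null && IDENTIFIER.matcher(name).matches();
    }

    public static void main(String[] args) {
        // The grammar would accept both tokens; only the first passes here.
        System.out.println(isValidAttributeName("\"myAttribute\""));
        System.out.println(isValidAttributeName("\"my@tribute\""));
    }
}
```

In a real integration such a check would typically be called from an
antlr4 parse tree listener when the rule for an attribute definition is
visited.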

The code generator application needs many more semantic checks. The
implementation of these checks requires the same complex data structures
for organizing and handling the parsed information that the application
needs anyway. Therefore these checks are implemented as an implicit part
of the application itself; they are a kind of side effect of building up
the required data structures. The code can be found in the Java class
implementation file
codeGenerator/dataModelListener/DataModelListener.java. It won't be
reusable as such in your Java application but it can serve as valuable
documentation of how to do the semantic interpretation of the parse tree
delivered by antlr4.

Summarizing, the following suggestions can be given:

  • The complete package codeGenerator.dbcParser can be integrated into your
    Java application (it targets Java 18, as stated above) but it will
    probably not pay off
  • The grammar file codeGenerator/dbcParser/Dbc.g4 can be reused and
    it'll pay off. Your application will integrate antlr4 into the build
    process to generate code from this file
  • You will probably reuse the class implementation file
    codeGenerator/dbcParser/SemanticCheckListener.java if you decide to
    reuse the grammar file
  • The syntax definition of DBC files is incomplete and buggy. (See Vector
    Informatik, DBC_File_Format_Documentation.pdf) You can use the grammar
    file Dbc.g4 as a documentation of many of the inconsistencies, gaps
    and pitfalls
  • You can use the Java class implementation DataModelListener.java as
    documentation of required semantic checks

Using the code generator as standalone DBC parser

The code generator can be used as a simple DBC parser. The idea holds for
interpreted languages only: Let the code generator parse the DBC file and
let it render the information in the syntax of the interpreter's language,
then let the interpreter read the generated file. The generated file
obviously is a temporary file only.

Although not really an elegant approach, in practice it has the advantage
of scalability. Many applications do not need the full information from
the input file, in which case rendering the information can become a
matter of a StringTemplate V4 template of only a few lines. Whenever the
demands of the application grow, you just have to elaborate your template
accordingly.

Example: The code generator as DBC parser for GNU Octave (*.m)

In GNU Octave, such a DBC parser can be implemented in a simple *.m
script. The code depends on the environment; the example would be valid
for a Windows system:

function dbcFile = dbcParser(dbcFileName, nodeName)
% Parse CAN network database dbcFileName and return the parse result.
% nodeName: The name of the ECU. Needed to determine the send direction.
    % tempname replaces tmpnam, which newer Octave versions no longer provide.
    tmpFileName = tempname;
    templateGroupFileName = 'dbcParserForOctave.stg';
    cmd =                                                      ...
    [ 'codeGenerator'                                          ...
      '  --cluster-name Octave'                                ...
      '  --node-name ' nodeName                                ...
      '  --bus-name bus'                                       ...
      '  -dbc ' dbcFileName                                    ...
      '  --output-file-name ' tmpFileName                      ...
      '    --template-file-name ' which(templateGroupFileName) ...
    ];
    % Run the comFramework code generator as an external tool.
    system(cmd);
    % run will create object dbcFile in the function's workspace.
    run(tmpFileName);
    % The parse result as a *.m file is no longer required.
    delete(tmpFileName);
end % dbcParser

Please note, error handling has been omitted in order to not obscure the
essence of the sample code. To successfully run the example the script
codeGenerator needs to be in the system search path and the
StringTemplate V4 template group file dbcParserForOctave.stg needs to be
in the Octave search path.

What's still missing is the referenced template file. A possible file
could be:

renderCluster(cluster,info) ::= "<renderBus(first(cluster.busAry))>"
renderBus(bus) ::= <<
dbcFile.database = '<bus.networkFile>';
dbcFile.frameAry = repmat(struct,<length(bus.frameAry)>,1);
<bus.frameAry:{frame|<first(frame.pduAry):renderPdu()>}>
>>

renderPdu(pdu) ::= <<
dbcFile.frameAry(<frame.i>).name = '<pdu.name>';
dbcFile.frameAry(<frame.i>).id = '<pdu.id>';
dbcFile.frameAry(<frame.i>).signalAry = repmat(struct,<pdu.noSignals>,1);
<pdu.signalAry:renderSignal()>
>>

renderSignal(s) ::= <<
dbcFile.frameAry(<frame.i>).signalAry(<s.i>).name = '<s.name>';
dbcFile.frameAry(<frame.i>).signalAry(<s.i>).type = '<s.type>';
dbcFile.frameAry(<frame.i>).signalAry(<s.i>).startBit = <s.startBit>;
dbcFile.frameAry(<frame.i>).signalAry(<s.i>).length = <s.length>;
dbcFile.frameAry(<frame.i>).signalAry(<s.i>).factor = <s.factor>;
dbcFile.frameAry(<frame.i>).signalAry(<s.i>).offset = <s.offset>;<\n>
>>

This example is still very basic but it should be sufficient to
demonstrate the technique and maybe it could even serve as a useful
starting point: all frames and their signals with the most relevant
properties are returned to Octave as a nested struct object.

The implementation of both files depends on the target environment but
will be quite similar for Perl, Python, VB, etc.

Compatibility

The compatibility of the parser with real-world CAN database *.dbc files
is an important issue. The implementation is based on the specification
Vector Informatik, DBC_File_Format_Documentation.pdf. The specification is
poor and no better one is available. It is incomplete and has several
inconsistencies. The implementer of a parser has to decide at many points
how the implementation should behave.

These decisions are made even more difficult by the fact that many
real-world *.dbc files don't even comply with those rules that are
unambiguously specified in this document. Here are some typical errors:

  • Attribute names do not match the syntax definition of an identifier
  • The values of attributes are out of range or have the wrong type
  • Attributes don't have a default value
  • Enumeration values have negative numeric values

This parser's implementation tends to report all of these (and more)
issues as warnings only, but it is stricter as soon as a syntax problem
could cause trouble in the given context of the intended code generation.
As an example, it is common practice to use the attribute name as a
variable or struct member name in the generated code. This would fail if
the attribute name in the input is not an identifier; consequently, the
parser stops with an error when encountering an attribute name like
my@tribute instead of myAttribute. In contrast, a negative enumeration
value is unlikely to be critical in the context of later code generation
and is reported as a warning only.
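
The distinction between warnings and errors can be sketched in Java as
follows. This only illustrates the policy described above under stated
assumptions; the class DbcDiagnostics, its methods and the message texts
are hypothetical and do not reflect the actual comFramework code:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostics collector for illustration only. */
final class DbcDiagnostics {
    enum Severity { WARNING, ERROR }

    private final List<String> messages = new ArrayList<>();
    private int errorCount = 0;

    void report(Severity severity, String message) {
        messages.add(severity + ": " + message);
        if (severity == Severity.ERROR) {
            errorCount++;
        }
    }

    /** The attribute name is likely to become a symbol in the generated
     *  code, so a name that is not an identifier is an error.
     *  Assumption: identifiers follow the C rule. */
    void checkAttributeName(String name) {
        if (!name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            report(Severity.ERROR,
                   "attribute name " + name + " is not an identifier");
        }
    }

    /** A negative enumeration value rarely harms later code generation,
     *  so it is reported as a warning only. */
    void checkEnumValue(String enumName, long value) {
        if (value < 0) {
            report(Severity.WARNING,
                   "enumeration " + enumName + " has negative value " + value);
        }
    }

    /** True if at least one finding should stop the processing. */
    boolean hasErrors() {
        return errorCount > 0;
    }

    List<String> messages() {
        return messages;
    }
}
```

A caller would run all checks first and abort before code generation only
if hasErrors() returns true, so that the user sees all findings in a
single run.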

The parser has been tested with a lot of input files. For the reasons
discussed above, a low but not insignificant percentage of these files
was rejected by the parser. Usually, it's straightforward to repair a
rejected file based on the parser's error feedback. Please let us know if
your file causes problems.

