Developers - Xtext Implementation

Disclaimer: This page is intended to be a reference for those interested in how the language is implemented in the Xtext framework. It is not intended to be understood by beginning users of the language.

With any Xtext project, the first step is to define a grammar. The grammar consists of parser rules and lexer rules. Lexer rules group individual characters into "tokens" that the parser rules can recognize. The parser rules then build an in-memory model, called an Abstract Syntax Tree (AST) or model graph (with cross-reference links), that the Xtext and Eclipse frameworks can understand. Lexer rules are marked with the keyword "terminal", which indicates that they form the smallest meaningful groupings of characters in the continuous sequence of characters read from the resource (source file).
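The division of labour between lexer and parser can be sketched in a few lines of Python. This is only an illustration of the idea, not the lexer or parser that Xtext generates: the token names, the `tokenize` and `parse_device` functions, and the dictionary used as the "model" are all hypothetical, and only an empty device body is handled.

```python
import re

# Hypothetical token classes, loosely modelled on terminal rules.
# Order matters: the keyword is tried before the general identifier.
TOKEN_SPEC = [
    ("KEYWORD", r"device\b"),
    ("ID", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("LBRACE", r"\{"),
    ("RBRACE", r"\}"),
    ("WS", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Lexer step: group raw characters into (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":  # whitespace is skipped, not passed on
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse_device(tokens):
    """Parser step: check the token order and build a small model object."""
    kinds = [kind for kind, _ in tokens]
    if kinds != ["KEYWORD", "ID", "LBRACE", "RBRACE"]:
        raise ValueError(f"not an empty device declaration: {kinds}")
    return {"type": "Device", "name": tokens[1][1], "elements": []}


tokens = tokenize("device led { }")
model = parse_device(tokens)
```

Here the lexer never looks at structure and the parser never looks at individual characters, which is the same separation the grammar expresses with terminal rules and parser rules.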

Devices

We start with some of the easier rules of the grammar, which is also where the tutorials begin: the rules that make up a device declaration:

Device: 
    'device' name=PhdlID '{' elements+=DeviceElement* '}';

DeviceElement:
    Attr | Pin | Info;

These rules specify that a device declaration must begin with the keyword device, followed by a name, which is lexed as a special type of identifier called a PhdlID. Any number of DeviceElements may appear between the braces, and each one is added to a container called "elements" that the device owns. Note the DeviceElement rule: in addition to delegating to one of Attr (attribute), Pin, or Info, it also establishes an "is-a" relationship within the container of elements (i.e., an Attr is-a DeviceElement). A PhdlID is a "catch-all" for the identifier and integer rules. It accepts any of three lexer rules (denoted by the keyword "terminal") and returns a string in all cases (even if it matches an integer!).
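The containment and "is-a" relationships described above can be pictured with a few Python classes. This is a rough sketch of the shape of the model, not the EMF classes that Xtext actually generates; the fields of Pin and Info are placeholders, since those rules are not shown on this page, and the attribute names in the usage lines are made up.

```python
from dataclasses import dataclass, field
from typing import List


class DeviceElement:
    """Common supertype: Attr, Pin and Info are each a DeviceElement."""


@dataclass
class Attr(DeviceElement):
    name: str
    value: str


@dataclass
class Pin(DeviceElement):
    name: str  # placeholder field; the Pin rule is not shown here


@dataclass
class Info(DeviceElement):
    text: str  # placeholder field; the Info rule is not shown here


@dataclass
class Device:
    name: str  # always a string, even when the PhdlID matched an integer
    elements: List[DeviceElement] = field(default_factory=list)


# Mirrors elements+=DeviceElement*: each element is appended to the list.
dev = Device(name="7400")
dev.elements.append(Attr(name="REFPREFIX", value="U"))
dev.elements.append(Pin(name="vcc"))
```

Because every element shares the DeviceElement supertype, a single "elements" list can hold attributes, pins, and info blocks in declaration order.

The PhdlID rule mentioned above reads as follows in the grammar: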

PhdlID: 
    INT | ID | PINNUM;

terminal ID:  '^'?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')* ;
terminal INT returns ecore::EInt: ('0'..'9') | (('1'..'9') ('0'..'9')+ ) ;
terminal PINNUM: ('0'..'9'|'a'..'z'|'A'..'Z'|'_'|'+'|'-'|'$'|'/'|'@')+ ;

The ID rule implements the classic identifier rule found in many languages. It states that an ID can contain any number of letters (lower-case or upper-case), digits, or underscores, but may NOT begin with a digit. The caret is allowed as an optional start character so that keywords of the language can be used as identifiers (the actual string value is returned without the caret).
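As a quick way to experiment with the ID rule outside of Eclipse, it can be transcribed into a regular expression. The sketch below is an approximation for illustration only; the names `ID_PATTERN` and `id_value` are hypothetical and are not part of the PHDL sources.

```python
import re

# Transcription of: '^'? ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
ID_PATTERN = re.compile(r"\^?[A-Za-z_][A-Za-z_0-9]*")


def id_value(text):
    """Return the identifier's string value, dropping a leading caret."""
    if ID_PATTERN.fullmatch(text) is None:
        raise ValueError(f"not a valid ID: {text!r}")
    return text[1:] if text.startswith("^") else text


print(id_value("my_net1"))   # my_net1
print(id_value("^device"))   # device
```

The second call shows the escape mechanism: the keyword device becomes usable as a name when written as ^device, and the caret does not appear in the resulting value.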

The INT rule returns an integer typed as ecore::EInt, a data type from the Ecore modelling specification (see the Xtext docs). It states that an integer in PHDL may have any number of digits, but may not begin with a zero if it has more than one digit. Since integer values only specify array indices in PHDL, we have no need for negative numbers.
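The same kind of transcription works for the INT rule. Again this is a hedged sketch rather than anything taken from the PHDL code base, and `INT_PATTERN` and `int_value` are hypothetical names.

```python
import re

# Transcription of: ('0'..'9') | (('1'..'9') ('0'..'9')+)
INT_PATTERN = re.compile(r"[0-9]|[1-9][0-9]+")


def int_value(text):
    """Return the integer value, rejecting leading zeros and signs."""
    if INT_PATTERN.fullmatch(text) is None:
        raise ValueError(f"not a valid INT: {text!r}")
    return int(text)


print(int_value("0"))    # 0
print(int_value("42"))   # 42
```

A lone "0" is accepted by the first alternative, while "007" and "-1" are rejected because neither alternative can match them in full.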

The PINNUM rule catches anything that doesn't match the ID or INT rule, and adds a few extra special characters ('+', '-', '$', '/', and '@') that some back-end layout tools allow or require.
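To see how the three terminals combine under PhdlID, the sketch below classifies a complete piece of text and always hands back a string. It is an illustration under stated assumptions: the function name `phdl_id` is hypothetical, and trying INT, then ID, then PINNUM is simply a convenient order for this example, not a description of how the generated lexer resolves overlaps.

```python
import re

INT = re.compile(r"[0-9]|[1-9][0-9]+")
ID = re.compile(r"\^?[A-Za-z_][A-Za-z_0-9]*")
# Transcription of: ('0'..'9'|'a'..'z'|'A'..'Z'|'_'|'+'|'-'|'$'|'/'|'@')+
PINNUM = re.compile(r"[0-9A-Za-z_+\-$/@]+")


def phdl_id(text):
    """Return (terminal_name, string_value) for a PhdlID candidate."""
    if INT.fullmatch(text):
        return ("INT", text)  # kept as a string, as PhdlID does
    if ID.fullmatch(text):
        return ("ID", text[1:] if text.startswith("^") else text)
    if PINNUM.fullmatch(text):
        return ("PINNUM", text)
    raise ValueError(f"not a valid PhdlID: {text!r}")


for sample in ["42", "007", "^pin", "A+", "3V3"]:
    print(sample, phdl_id(sample))
```

Note how "007" and "3V3" are valid names even though they are neither an INT nor an ID: they fall through to PINNUM, which is what makes typical pin numbers and part names usable without quoting.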

Device Attributes

The Attr (Attribute) rule:

Attr:
    'attr' name=ID '=' value=STRING ';';

states that an attribute must begin with the keyword attr, followed by a name (obeying the ID rule), an equals sign, and a value in the form of a string. The attribute is then terminated by a semicolon. A string in PHDL is defined by the lexer rule:

terminal STRING : 
    '"' ( '\\' ('b'|'t'|'n'|'f'|'r'|'u'|'"'|"'"|'\\') | !('\\'|'"') )* '"' |
    "'" ( '\\' ('b'|'t'|'n'|'f'|'r'|'u'|'"'|"'"|'\\') | !('\\'|"'") )* "'";

We have opted to define a string as text wrapped in either single or double quotes (the opening and closing quotes must match). This means that if you need to type a lot of quotes of one kind inside the string, you can use the opposite quote delimiter to set off the entire string and avoid escaping them.
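The quoting behaviour can be tried out with a regular-expression transcription of the STRING rule. This is a sketch for experimentation only, and the names `ESCAPE`, `STRING_PATTERN`, and `is_string` are hypothetical.

```python
import re

# A backslash followed by one of: b t n f r u " ' \
ESCAPE = r"\\[btnfru\"'\\]"
# Double-quoted form: escapes, or any character except backslash and "
DOUBLE = r'"(?:' + ESCAPE + r'|[^\\"])*"'
# Single-quoted form: escapes, or any character except backslash and '
SINGLE = r"'(?:" + ESCAPE + r"|[^\\'])*'"
STRING_PATTERN = re.compile(DOUBLE + "|" + SINGLE)


def is_string(text):
    """True if the whole text is one STRING token under the rule above."""
    return STRING_PATTERN.fullmatch(text) is not None


print(is_string('"hello"'))          # True
print(is_string("'say \"hi\"'"))     # True: double quotes inside single quotes
print(is_string('"it\'s"'))          # True: apostrophe inside double quotes
print(is_string('"mixed\''))         # False: opening and closing quotes differ
print(is_string(r'"bad \q escape"')) # False: \q is not a listed escape
```

The second and third calls show the benefit of supporting both delimiters: the other kind of quote can appear inside the string without a backslash.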

