Menu

FirstCodeGenerator

Tom Harwood

First Example Compiler: Code Generator

Overview

The code generator has four subsystems, all contained within src/jburg/tutorial/first/first.jbg:

  • Reduction Selection: This is the primary code generator subsystem; it is formally known as a Bottom-Up Rewrite Machine, or BURM. The BURM works from the bottom of the tree (the leaves, i.e., nodes with no child nodes) towards the root. At each level, the BURM performs a pattern match on the tree to determine the best available reductions to all feasible goal states; these reductions feed the parent node's pattern match to eventually compute an optimum sequence of reductions to rewrite the input tree.
  • Reducer: The reducer is the action code that actually performs the tree rewrite.
  • Back End Management: This code generator's back end manager is a simple TemplateManager.
  • Semantic Analysis: Even this rudimentary compiler needs some semantic analysis. There are several semantic analysis tasks:
    • Register Allocation: register allocation in this compiler is very limited.
    • Stack Management: local variables are defined as stack offsets. Stack management is simplified because the variables are all 4-byte integers.
    • Data Segment Management: most of the compiler's output is executable program text, but constant declarations (i.e., string constants) go into the data section.

Code Generator Preamble

The preamble consists of various directives. Of note are:

package jburg.tutorial.first;

Declares the code generator's package, consistent with the package of the front end and the driver.

header {
    import org.antlr.runtime.tree.*; // etc.

Import the classes the code generator uses. Also note:

    import static jburg.tutorial.first.TokenTypes.*;

The build system generates a TokenTypes abstract class from ANTLR's .tokens file; the code generator could refer to these as, e.g., TokenTypes.PLUS, but it's easier to read the specification if they're simple identifiers. The deprecated implements directive performed a similar function for the values in the ANTLR2-generated interface.

INodeType CommonTree;
INodeAdapter jburg.burg.inode.Antlr3JavaAdapter;

The front end uses ANTLR's builtin tree generation facilities to generate trees made up of CommonTree nodes. JBurg has a builtin [INodeAdapter](INodeAdapter) class that generates appropriate method calls to get the nodes' type, number of childen, and n'th child.

ReturnType Object;
ReturnType intConstant = Integer;
ReturnType name = String;

The code generator generates MIPS/SPIM assembly using StringTemplate objects, and StringTemplate objects' attributes are of type Object; so, for the most part, the particular type of object that a nonterminal class reduces to doesn't matter.

For integer constants and for identifiers, though, it's more convenient to use a representation that's natural, and so these nonterminal types have more specialized return types.

The next block of Java code, which begins with the comment This "inclass" block is copied verbatim into the generated tree parser/code generator class body, contains the semantic analysis code and the TemplateManager instance that serves as the compiler's back end.

Rules

The first set of pattern rules recognize the ASTs for this language's three statements, and generate code or, in the case of the variable declaration, populate the variable name table:

// Declare a local variable.
// This rule returns null, because it allocates
// stack space for the local as a side effect.
statement = INT_TYPE(name name): 1
{
    this.declareVariable(name);
    return null;
}

Note that the statement-level reductions that emit executable code also call resetTemps(), which resets the temporary register allocator as a side effect.

// Assign a value to a local variable.
statement = EQUALS(name lvalue, intExpression rvalue): 1
{
    return resetTemps(templateManager.getTemplate("storeLocal", "offset", offsetOf(lvalue), "rvalue", rvalue));
}

Contrast this to an expression-level pattern, which allocates a temp register for the expression's result:

intExpression = PLUS(intExpression lhs, intExpression rhs):1
{
    return allocateTemp(templateManager.getTemplate("add", "lhs", lhs, "rhs", rhs));
}

Since the number of temporary registers is limited, and the temporary register pool only resets at the statement level, the number of operators in an expression has to be small or the register allocator will run out of registers. This of course would not be the case in a more realistic compiler.

This is an example of a general transformational rule: the code generator will select this rule to transform a name -- which is already the result of a successful pattern match on ID(void) -- and further transform the name into an intExpression by loading the local variable into a register.

intExpression = name: 1
{
    return allocateTemp(templateManager.getTemplate("loadLocal", "offset", offsetOf(name)));
}

| First Compiler Intro | Front End | Code Generator | Templates |


Related

Wiki: DirectivesGuide
Wiki: FirstCompiler
Wiki: FirstFrontEnd
Wiki: PatternRuleGuide
Wiki: SecondCodeGenerator
Wiki: TransformationRuleGuide
Wiki: TutorialCompilerTemplates