Menu

PatternRuleGuide

Tom Harwood

Pattern-Matching Rules

All subtree rewrites start with a pattern match. Pattern-matching rules are syntactically recognizable by their format:

goalState = NODE_TYPE(subgoal subtree): cost

The key feature that distinguishes a pattern-matching rule from other rules is the (subgoal subtree) section. All pattern-matching rules have this; pattern matches at leaf nodes specify their (non-existent) subtrees' goals as (void).

Goal (a.k.a. nonterminal) States

Every rewrite is done to produce a new kind of result; these results are grouped into classes known as goal states or nonterminal states. What a particular goal state means is up to you, as the programmer of the pattern matcher; the pattern matcher does not know anything about a goal state's meaning.

Examples of goal states typically found in a code generator:
statement
intExpression
intConstant
stringConstant

Subtrees and Subgoals

Any node that has children must match a pattern that has feasible and sensible goals for all the children. For example, an ADD node generally does single-mode arithmetic:

intExpression = ADD(intExpression lhs, intExpression rhs): cost

This pattern will match a node whose operator code is ADD, and which has two children, both of which can be reduced to intExpression results. The result of applying this rule will itself be an intExpression.

The subtrees' subgoals affect their parent's pattern match. For example, suppose we have two rules to add integers, one of which is an add immediate:

intExpression = ADD(intExpression lhs, intExpression rhs): cost1
intExpression = ADD(intExpression lhs, intConstant rhs): cost2

If a node's second subtree can be reduced to an intConstant and the constant is in the range of the target's add immediate range, then the code generator can be programmed to prefer a reduction that emits an addi instruction instead of an add.

Cost Factors

How does the code generator select the optimal rule? It totals up the cost factors for each viable alternative, including the cost of the subtrees' subgoals, and selects the least-cost alternative. (In case of a tie, the first rule specified wins.) For the example above, there are two ways to give the add immediate variant a lower cost:
1. Give the second rule a lower cost than the first rule.
2. Give the rule that converts an intConstant into an intExpression a greater aggregate cost.

In practice, both these techniques are likely to be employed.

Cost factors can be supplied in several different ways:
As integer literals, e.g., intExpr = ADD(intExpr terms+): 1
As JBurg.Constant values, e.g., intExpr = ADD(intExpr terms+): OrdinaryInstruction where OrdinaryInstruction has been declared as a JBurg.Constant value in the specification.
As cost functions, explained below. As references to field values or other values in the compiler's host programming language.

To produce a fast pattern matcher, specify the cost as an int literal or a JBurg.Constant wherever possible; this allows JBurg to perform more analysis at compiler-compiler time, which can result in dramatically better pattern matchers.


Related

Wiki: BigPicture
Wiki: FirstCodeGenerator
Wiki: JBurg Reference
Wiki: JBurg2 Reference
Wiki: JBurg2 Syntax