Learn how easy it is to sync an existing GitHub or Google Code repo to a SourceForge project! See Demo


#1 What's the conflict

Jochen Wiedmann


when compiling the attached simple grammar with Beaver,
I get an error message, that it contains a shift-reduce
conflict. I have no idea, what might be the reason.
I'll also attach the RDF XML Schema, which is
implemented by the grammar.




  • RDF grammar

    • status: open --> closed
  • Logged In: YES

    The problem with this grammar is in one of the defintions of
    of nonterminal "any", namely TEXT+. By itself TEXT+ is
    harmless, but when used by "any", which can be part of
    "any*", it creates a problem.

    To better illustrate the issue, let us assume for the moment
    that it really was "any+". This can be expanded, for some
    special case as: any any any. Now each of those will match
    TEXT+, i.e. when expanded further the "rule" will look like:
    TEXT+ TEXT+ TEXT+. Now, when parser faces the stream of TEXT
    tokens when should it stop, return the first list and switch
    to making the second TEXT+ list (of the second "any")?
    Without additional hints there is no way to determine this.

    That hint is a precedence information - what rules should be
    preferred when otherwise both can be used. Returning to the
    "expanded" grammar: if scanner started to feed parser with a
    stream of TEXT, i.e. ... TEXT TEXT TEXT TEXT TEXT TEXT...
    after the first TEXT is consumed and added to TEXT+ list two
    alternatives are equally possible (because of the way "any"
    nonterminal is used): the second TEXT can be added to the
    same list, or parser can return ther first list (with only
    one TEXT terminal) and start another list. We can resolve
    with tie by instructing a parser that for such input we
    would prefer to add all the TEXTs to a single list.

    To do this we need to mention TEXT in a precedence
    declaration, and introduce a precedence terminal for a list
    consuming rule:

    %nonassoc TEXT; // TEXT has highest (in this grammar)
    precedence, i.e.
    // parser will always prefer to shift
    TEXT, a.k.a. "build
    // list of TEXTs" in this grammar
    %nonassoc LSTGET; // a "virtual" terminal that has lower
    precedence than TEXT

    = START_RDF any* END_RDF

    = TEXT+ @ LSTGET
    | START_RDF any* END_RDF

    This grammar knows that building a list of TEXT is the
    preferable action, so Beaver will be able to build a parser
    for it.

    P.S.: while looking back at this example I realized that
    currently Beaver makes you define a "virtual" nonterminal
    just to assign a precedence to a rule. I think it may be
    interesing to implicitly assign the lowest precedence to all
    rules that do not have explicit precedece assignment or
    cannot derive it from the rightmost terminal. As a result we
    won't need LSTGET "terminal". Look for it in :-)

  • Logged In: YES

    RDF grammar altered for v0.9.2.2:

    %nonassoc TEXT;

    = START_RDF any* END_RDF

    = TEXT+
    | START_RDF any* END_RDF

  • Logged In: YES

    I've realized that I provided an explnation starting from
    the middle :-) Now, lets start from the beginning:

    For the original RDF grammar compiler reported a
    shift-reduce conflict. Or in other words, there are two
    _equally_ possible actions after parser reached certain
    point in parsing and scanner returned next token.

    Lets consider a parser for the original grammar. The very
    first TEXT token will be consumed by an initial "list of
    TEXTs" rule, which will create a a list and add that TEXT to
    it. Now, because one of the "any" alternatives is a "list of
    TEXTs", _and_ "any" can contain a "list of any", what should
    parser do if the next toen is TEXT again? It can add it to
    the original list (SHIFT it in the "list of TEXTs" building
    rule), or it can deside that the list we already have (with
    a single TEXT) is good enough, REDUCE it to "any" and start
    building next element in a "list of any". And so on, where
    each "any" in a "list of any" will be a "list of TEXT" with
    a single TEXT in it.

    Beaver does make a decision which one of these two is a
    prefereable action, so you need to hint it, by assigning
    precedence to a TEXT terminal. With higher precedence
    assigned to TEXT, parser will prefer shifting this terminal,
    hence a sequence of TEXT returned by a scanner will be added
    to a single list, which, only after it (the sequence of
    TEXT) ends, will be reduced to "any".