#23 Wrong start_label in -c mode

closed-fixed
None
5
2009-01-02
2008-12-31
No

Re: Bug #2462777: Infinite loop in generated code with -c

There is a bit of confusion about the meaning of start_label in DFA::emit() when using -c.

The global start label is the point immediately before the code generated by genCondGoto() (YYGETCONDITION+goto condition).
This is where re2c:startlabel should be placed, and it should be the default target for genGetStateGoto() when state == -1.

The local start label is the initial state for the condition of the DFA that happens to be emitting the prolog.
Sometimes the initial state is not emitted first, and yy0 is not the first label in the emitted code.
In this case, the current -c code will generate an infinite loop because yy0 has been placed at the global start label.

The attached patch does the following:

* Allow Initial::emit() to generate the proper local start label in -c mode
* Replace hard-coded references to labels 0 and 1 with start_label{,+1}
* Create a separate prolog_label=0 to represent the global start label

The prolog_label must be called yy0 because of the call to genGetStateGoto(default=yy0) in scanner.re.
Unfortunately, this offsets all label numbers by one in -c code, causing huge diffs for all the -c test cases.

With this patch re2c generates correct code for Bug #2462777:

./re2c -ci --no-generation-date tc.re << END
> /*!re2c
> <X> [a]* {x;}
> */
> END
/* Generated by re2c 0.13.6.dev */

{
    YYCTYPE yych;
    switch (YYGETCONDITION()) {
    case yycX: goto yyc_X;
    }
/* *********************************** */
yyc_X:
    goto yy1;
yy2:
    ++YYCURSOR;
yy1:
    if (YYLIMIT <= YYCURSOR) YYFILL(1);
    yych = *YYCURSOR;
    switch (yych) {
    case 'a': goto yy2;
    default: goto yy4;
    }
yy4:
    {x;}
}
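
To check that the control flow of this output terminates, it can be wrapped in a small harness. What follows is a minimal sketch and not part of the patch: the function name scan_a_star, the Condition enum, the macro definitions, and the replacement of the {x;} action with a return statement are all assumptions made for illustration. Only the labelled body mirrors the generated code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical condition enum; re2c would normally generate this separately.
enum Condition { yycX };

// Hypothetical macro definitions, just enough to compile the generated body.
#define YYGETCONDITION() yycX
#define YYFILL(n) do { } while (0)   /* no refill: input carries a NUL sentinel */

// Runs the generated scanner for <X> [a]* over input[0..len) and returns
// how many characters were consumed. Requires input[len] == '\0'.
std::size_t scan_a_star(const char *input, std::size_t len) {
    typedef char YYCTYPE;
    assert(input[len] == '\0');
    const char *YYCURSOR = input;
    const char *YYLIMIT = input + len;
    {
        YYCTYPE yych;
        switch (YYGETCONDITION()) {
        case yycX: goto yyc_X;
        }
yyc_X:
        goto yy1;           // local start label of condition X
yy2:
        ++YYCURSOR;
yy1:
        if (YYLIMIT <= YYCURSOR) YYFILL(1);
        yych = *YYCURSOR;
        switch (yych) {
        case 'a': goto yy2;
        default:  goto yy4;
        }
yy4:
        // Stands in for the user action {x;}.
        return static_cast<std::size_t>(YYCURSOR - input);
    }
}

#undef YYGETCONDITION
#undef YYFILL
```

Calling scan_a_star("aaab", 4) walks yy1 -> yy2 -> yy1 for each 'a' and leaves through yy4 at the 'b'. The loop never jumps back to the condition dispatch.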

Discussion

  • Small correction: next_label should only be bumped once, not once per condition. Check for bProlog:

    @@ -1783,6 +1783,16 @@ void DFA::emit(std::ostream &o, uint& ind, const RegExpMap* specMap, const std::
    bUsedYYAccept = false;
    }

    + // In -c mode, the prolog needs its own label separate from start_label.
    + // prolog_label is before the condition branch (GenCondGoto). It is equivalent to startLabelName.
    + // start_label corresponds to current condition.
    + // NOTE: prolog_label must be yy0 because of the !getstate:re2c handling in scanner.re
    + uint prolog_label = next_label;
    + if (bProlog && cFlag)
    + {
    + next_label++;
    + }
    +
    uint start_label = next_label;

    (void) new Initial(head, next_label++, bSaveOnHead);
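
The label bookkeeping in this hunk can be modelled in isolation. The sketch below is a simplified stand-in, not the real DFA::emit(): the Labels struct and allocate_labels() are hypothetical names, and the labels that the real code hands out to the remaining DFA states are left out.

```cpp
#include <cassert>

// Hypothetical result type: the two labels the hunk distinguishes.
struct Labels {
    unsigned prolog_label;  // global start label, before the condition dispatch
    unsigned start_label;   // initial state of the current condition's DFA
};

// Simplified model of the hunk: next_label is the running label counter,
// bProlog is true only for the DFA that emits the prolog, cFlag mirrors -c.
Labels allocate_labels(unsigned &next_label, bool bProlog, bool cFlag) {
    Labels labels;
    labels.prolog_label = next_label;
    if (bProlog && cFlag) {
        next_label++;       // reserve yy0 once, not once per condition
    }
    labels.start_label = next_label;
    next_label++;           // models `new Initial(head, next_label++, ...)`
    return labels;
}
```

Under this model, the first condition emitted in -c mode gets prolog_label 0 and start_label 1, and later conditions take the next free label with no extra bump. Without -c, both labels stay at 0 for the prolog DFA, so non -c output keeps its numbering.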

    Test case diffs are much smaller if you first run

    perl -i -pe 's/yy(\d+)/"yy".($1+1)/eg' test/*.c*.c test/yyaccept_missing.bci.c
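
For readers less familiar with Perl, the substitution adds one to every number that directly follows "yy". A rough C++ equivalent of the per-line rewrite is sketched below. The function name bump_labels is hypothetical, and the file handling done by perl -i is left out.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Rewrites every "yy<digits>" in text to "yy<digits + 1>", mirroring
// s/yy(\d+)/"yy".($1+1)/eg. Identifiers such as yych or yyc_X are untouched
// because no digit follows the "yy" prefix.
std::string bump_labels(const std::string &text) {
    static const std::regex label("yy([0-9]+)");
    std::string out;
    std::size_t last = 0;
    for (std::sregex_iterator it(text.begin(), text.end(), label), end;
         it != end; ++it) {
        const std::smatch &m = *it;
        const std::size_t pos = static_cast<std::size_t>(m.position(0));
        out.append(text, last, pos - last);   // copy text before the match
        out += "yy" + std::to_string(std::stoul(m[1].str()) + 1);
        last = pos + static_cast<std::size_t>(m.length(0));
    }
    out.append(text, last, std::string::npos); // copy the remaining tail
    return out;
}
```

Applying this to the old expected outputs shifts every label by one, so the remaining diff against the new output shows only the real changes.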

     
  • Marcus Börger
    2009-01-02

    • assigned_to: nobody --> helly
    • summary: Fix start_label confusion in -c mode --> Wrong start_label in -c mode
    • status: open --> pending-accepted
     
  • Marcus Börger
    2009-01-02

    • status: pending-accepted --> closed-fixed