
#46 re2c generates an infinite loop, depends on existence of previous parser

0.15
closed-fixed
None
5
2015-06-04
2014-06-14
No

This example program causes re2c to generate a parser that gets stuck in an infinite loop.

#include <string.h>
#include <stdio.h>

int main(int argc, char** argv)
{
    char* YYCURSOR = argv[1];
    char* YYLIMIT  = YYCURSOR + strlen(YYCURSOR);
    #define YYFILL(n) do { } while(0)

/*!re2c
re2c:define:YYCTYPE = "unsigned char";

""  { } /* dummy rule, must exist for the bug to occur; actual rule is not important */
*/

/* BUG BEGINS HERE */
/*!re2c
[^abc]* "a"? { printf("exit 0\n"); return 0; }
*/

    printf("exit 1\n");
    return 0;
}

Expected outcome: On any input, the program prints "exit 0".
Actual outcome: On any input that does not begin with "a", "b" or "c", the program gets stuck in an infinite loop.

The existence of the previous parser (the "dummy rule" above) is important. Without it, the bug does not trigger.

Tested re2c versions: 0.13.5
re2c command-line options: Happens with any options, including -b, -s, -w, -u, and with none at all.

Note: This is a simplified example rule to trigger the bug, and it may look pointless. Suggested workarounds for that simplified example rule will not be appreciated. The actual bug was discovered using this rule: [^\n\032]* ("\n" [ \n\t]*)? -- which accepts anything until a newline or an EOF is encountered; if a newline was encountered, any consecutive whitespace and newlines are also accepted.
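To make the intent of that real-world rule concrete, here is a hand-written sketch of what it is meant to match, assuming the rule reads [^\n\032]* ("\n" [ \n\t]*)? with both Kleene stars in place. The function name match_line and the explicit length parameter are hypothetical, introduced only for illustration; this is not re2c output.

```c
#include <stddef.h>

/* Hypothetical helper: returns how many bytes at the start of buf are
 * matched by  [^\n\032]* ("\n" [ \n\t]*)?
 * The buffer is bounded by len (an assumption of this sketch). */
static size_t match_line(const char *buf, size_t len)
{
    size_t i = 0;

    /* [^\n\032]* : everything up to a newline or the \032 (Ctrl-Z) byte */
    while (i < len && buf[i] != '\n' && buf[i] != '\032')
        ++i;

    /* ("\n" [ \n\t]*)? : an optional newline plus trailing whitespace */
    if (i < len && buf[i] == '\n') {
        ++i;
        while (i < len && (buf[i] == ' ' || buf[i] == '\n' || buf[i] == '\t'))
            ++i;
    }
    return i;
}
```

Every iteration of both loops advances the index, so the function terminates on any input; that is the property the generated scanner in this report lacks.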

Discussion

  • Joel Yliluoma

    Joel Yliluoma - 2014-06-14

    Well, SourceForge apparently interpreted the * characters in the last paragraph as italics markup. That is not important, though.

    For reference, here's the code that re2c generated. The "goto yy4;" in the default case is where the infinite loop happens: it jumps back to yy4 without advancing YYCURSOR.

    /* Generated by re2c 0.13.5 on Sat Jun 14 10:20:37 2014 */
    #line 1 "test.cc.re"
    #include <string.h>
    #include <stdio.h>
    
    int main(int argc, char** argv)
    {
        char* YYCURSOR = argv[1];
        char* YYLIMIT  = YYCURSOR + strlen(YYCURSOR);
        #define YYFILL(n) do { } while(0)
    
    #line 14 "<stdout>"
    {
        unsigned char yych;
    
    #line 13 "test.cc.re"
        { }
    #line 20 "<stdout>"
    }
    #line 14 "test.cc.re"
    
    /* BUG BEGINS HERE */
    
    #line 27 "<stdout>"
    {
        unsigned char yych;
    yy4:
        if (YYLIMIT <= YYCURSOR) YYFILL(1);
        yych = *YYCURSOR;
        switch (yych) {
        case 'a':       goto yy7;
        case 'b':
        case 'c':       goto yy6;
        default:        goto yy4;
        }
    yy6:
    #line 18 "test.cc.re"
        { printf("exit 0\n"); return 0; }
    #line 42 "<stdout>"
    yy7:
        ++YYCURSOR;
        yych = *YYCURSOR;
        goto yy6;
    }
    #line 19 "test.cc.re"
    
        printf("exit 1\n");
        return 0;
    }
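
In the listing above, the default branch returns to yy4 without incrementing YYCURSOR, so the same byte is read again on every pass. For comparison, here is a minimal hand-written sketch of the behaviour the rule [^abc]* "a"? describes. It is not what re2c emits; the name match_len and the length-bounded interface are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical helper: returns how many bytes at the start of buf are
 * matched by  [^abc]* "a"?
 * Scanning stops at len, which plays the role of YYLIMIT in this sketch. */
static size_t match_len(const char *buf, size_t len)
{
    size_t i = 0;

    /* [^abc]* : consume bytes other than 'a', 'b' and 'c'.
     * The ++i here is the cursor advance that the default branch
     * of yy4 in the generated code does not perform. */
    while (i < len && buf[i] != 'a' && buf[i] != 'b' && buf[i] != 'c')
        ++i;

    /* "a"? : optionally consume one trailing 'a' */
    if (i < len && buf[i] == 'a')
        ++i;

    return i;
}
```

Note that the sketch checks the limit on every step. In the test program YYFILL is a no-op, so a scanner that relies on it has no such bound of its own.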
    
     
  • Ulya Trofimovich

    This seems to be fixed in re2c-0.13.6 or later.

     
  • Ulya Trofimovich

    • status: open --> closed-fixed
    • assigned_to: Ulya Trofimovich
     
