This example program causes re2c to generate a parser that gets stuck in an infinite loop.

    #include <string.h>
    #include <stdio.h>

    int main(int argc, char** argv)
    {
        char* YYCURSOR = argv[1];
        char* YYLIMIT = YYCURSOR + strlen(YYCURSOR);
        #define YYFILL(n) do { } while(0)

        /*!re2c
            re2c:define:YYCTYPE = "unsigned char";
            "" { } /* dummy rule, must exist for the bug to occur; actual rule is not important */
        */

        /* BUG BEGINS HERE */
        /*!re2c
            [^abc]* "a"? { printf("exit 0\n"); return 0; }
        */

        printf("exit 1\n");
        return 0;
    }

Expected outcome: On any input, the program prints "exit 0".
Actual outcome: On any input that does not begin with "a", "b" or "c", the program gets stuck in an infinite loop.
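For comparison, here is a hand-written sketch of what the rule [^abc]* "a"? should consume. It is not re2c output; the function name match_len is made up for illustration, and it assumes a NUL-terminated input where the NUL marks the end of input. The point is only that a correct matcher advances at most once per input character and then stops, so it cannot loop forever.

```c
#include <stddef.h>

/* Hypothetical reference matcher (not generated by re2c).
   Returns the length of the prefix of s matched by: [^abc]* "a"?
   Assumption: s is NUL-terminated and the NUL ends the input. */
size_t match_len(const char *s)
{
    size_t i = 0;

    /* [^abc]* : skip everything that is not 'a', 'b' or 'c'. */
    while (s[i] != '\0' && s[i] != 'a' && s[i] != 'b' && s[i] != 'c')
        i++;

    /* "a"? : optionally consume a single 'a'. */
    if (s[i] == 'a')
        i++;

    return i;
}
```

Because both parts of the rule are optional, an empty match is valid too, so the rule action should be reached for every input, including inputs that begin with "b" or "c".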
The existence of the previous parser (the "dummy rule" above) is important. Without it, the bug does not trigger.
Tested re2c versions: 0.13.5
re2c command-line options: Happens with any options, including -b, -s, -w, -u, and with none at all.
Note: This is a simplified example rule to trigger the bug, and may look pointless. Suggested workarounds for that simplified example rule will not be appreciated. The actual bug was discovered using this rule: [^\n\032]* ("\n" [ \n\t]*)? which accepts anything until a newline or an EOF is encountered; if a newline was encountered, any consecutive whitespace and newlines are also accepted.
Well, SourceForge apparently tends to convert the two * characters in that rule into italics; the rule is meant to read as written above. That is not important though.
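To make the intent of the real-world rule concrete, here is a hand-written sketch of a matcher for [^\n\032]* ("\n" [ \n\t]*)? as described above. It is not re2c output; the name line_match_len is hypothetical, and it assumes NUL-terminated input in addition to the \032 (Ctrl-Z) end-of-file marker used by the rule.

```c
#include <stddef.h>

/* Hypothetical reference matcher (not generated by re2c).
   Returns the length of the prefix of s matched by:
       [^\n\032]* ("\n" [ \n\t]*)?
   Assumption: s is NUL-terminated and the NUL ends the input. */
size_t line_match_len(const char *s)
{
    size_t i = 0;

    /* [^\n\032]* : everything up to a newline or the EOF marker (octal 032). */
    while (s[i] != '\0' && s[i] != '\n' && s[i] != '\032')
        i++;

    /* ("\n" [ \n\t]*)? : a newline followed by any run of spaces,
       newlines and tabs. */
    if (s[i] == '\n') {
        i++;
        while (s[i] == ' ' || s[i] == '\n' || s[i] == '\t')
            i++;
    }

    return i;
}
```

As with the simplified rule, every part of the pattern is optional, so a correct lexer always reaches the rule action after a bounded number of steps.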
For reference, here's the code that re2c generated. The "goto yy4;" is where the infinite loop happens.
Seems to be fixed in re2c-0.13.6 or later.