Some languages or language constructs are strictly positioned at the start of a line. Their identification pattern must then be anchored at the start of the line with ^ (caret). Unfortunately, the parser does not pass whole lines but chunks beginning where the last fragment ended. The caret therefore anchors at the start of the chunk, which is rarely the start of the line.
Consequently, anchored syntactic categories are rarely recognised correctly. A false positive is generated whenever the start pattern is found at the beginning of a chunk that is not itself at the start of the line.
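The false positive can be reproduced outside the parser. The following is a minimal sketch in Python, not the parser's actual code: the pattern `DIRECTIVE`, the sample line, and the way the chunk is cut are all assumptions made for illustration.

```python
import re

# Hypothetical start pattern: a directive is only valid at column 0.
DIRECTIVE = re.compile(r"^#\w+")

line = "x = 1 #define"

# Assume the parser has already consumed "x = 1 " and now hands the
# remainder of the line to the regexp as a fresh chunk.
chunk = line[6:]

# Matched against the whole line, the anchored pattern correctly fails.
whole_line_match = DIRECTIVE.match(line)

# Matched against the chunk, the caret anchors at the chunk start,
# so the mid-line "#define" is wrongly accepted.
chunk_match = DIRECTIVE.match(chunk)
```

Here `whole_line_match` is `None`, while `chunk_match` is a match object: the same text is rejected or accepted depending only on where the chunk happens to begin.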
Suggested fix: when initialising the parser, modify every regexp starting with a caret so that it matches an "impossible" character immediately after the caret, and prepend that same "impossible" character to each line when it is read. The extra character can be stripped during line numbering in the HTML generation process.
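A minimal sketch of this fix, again in Python and under stated assumptions: the sentinel is taken to be NUL (assumed never to occur in source text), and the helper names `anchor_to_line_start`, `mark_line` and `strip_sentinel` are hypothetical, not part of the parser.

```python
import re

# Assumption: NUL never appears in the source files being parsed.
SENTINEL = "\x00"

def anchor_to_line_start(pattern: str) -> str:
    """Rewrite '^foo' so it requires the sentinel right after the caret."""
    if pattern.startswith("^"):
        return "^" + re.escape(SENTINEL) + pattern[1:]
    return pattern

def mark_line(line: str) -> str:
    """Prepend the sentinel when a line is read."""
    return SENTINEL + line

def strip_sentinel(text: str) -> str:
    """Remove the sentinel again, e.g. just before HTML output."""
    return text[1:] if text.startswith(SENTINEL) else text

# Parser initialisation: the anchored pattern is rewritten once.
start = re.compile(anchor_to_line_start(r"^#\w+"))

# A chunk cut from the middle of a marked line no longer matches,
# because it does not begin with the sentinel.
marked = mark_line("x = 1 #define")
mid_line_chunk = marked[7:]          # "#define", taken mid-line
mid_line_match = start.match(mid_line_chunk)

# A chunk that really starts the line carries the sentinel and matches.
line_start_match = start.match(mark_line("#define"))
```

With this rewrite, `mid_line_match` is `None` and `line_start_match` succeeds. The matched text still contains the sentinel, so `strip_sentinel` (or an equivalent step in the HTML generator) has to be applied before output.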