#2225 PHP parsing including dot in constant names

Bug
Status: closed-fixed
Owner: nobody
Priority: 5
Updated: 2021-09-22
Created: 2020-12-09
Private: No

Reported in https://github.com/notepad-plus-plus/notepad-plus-plus/issues/9229#issuecomment-741669127 as a bug in Notepad++

When the PHP lexer encounters bare text that PHP would treat as a constant name, it greedily includes any dot that follows. Because the dot is PHP's concatenation operator, this breaks syntax highlighting of the dot itself (it appears as part of the constant name) and of whatever is concatenated after it, such as a function name, whenever there is no space before the dot (which is syntactically valid PHP). Other punctuation works as expected.

To reproduce (syntax highlights incorrectly in Notepad++):

<?php

echo WWWBASE.substr($a, 1);

?>

Adding a space before the dot fixes the highlighting; without it, neither the dot operator nor the substr function is recognised correctly. Constant-name parsing should stop at the dot, not include it.
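The greedy behaviour described above can be illustrated with a small standalone scanner. This is only a sketch under stated assumptions: IsWordCharWithDot is a stand-in predicate that accepts '.', which is what the report implies the lexer's word test does, and ScanConstant is a hypothetical helper, not code from LexHTML.cxx. The stopAtDot flag mirrors the extra `ch == '.'` check proposed in the discussion below.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Stand-in word-character test that accepts '.', as the report implies.
// Assumption: written for illustration, not copied from LexHTML.cxx.
inline bool IsWordCharWithDot(int ch) {
    return ch < 0x80 && (std::isalnum(ch) || ch == '.' || ch == '_');
}

// Hypothetical helper: returns the index one past the last character
// that would be styled as part of the constant starting at `start`.
// With stopAtDot == false the scan swallows dots (the reported bug);
// with stopAtDot == true it ends the word at the first dot.
std::size_t ScanConstant(const std::string &text, std::size_t start, bool stopAtDot) {
    std::size_t i = start;
    while (i < text.size()) {
        const int ch = static_cast<unsigned char>(text[i]);
        if ((stopAtDot && ch == '.') || !IsWordCharWithDot(ch)) {
            break;  // end of the word; the caller would switch style here
        }
        ++i;
    }
    return i;
}
```

For the input `WWWBASE.substr($a, 1);` the greedy scan ends after `WWWBASE.substr`, so the operator and the function name are absorbed into the constant, while the dot-aware scan ends after `WWWBASE`.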

Discussion

  • Zufu Liu

    Zufu Liu - 2020-12-09
    • labels: --> lexer, scintilla, html, php
    • status: open --> open-accepted
     
  • Zufu Liu

    Zufu Liu - 2020-12-09

    A simple fix for LexHTML.cxx line 2277:

            case SCE_HPHP_WORD:
    -           if (!IsAWordChar(ch)) {
    +           if (ch == '.' || !IsAWordChar(ch)) {
    
     
  • Neil Hodgson

    Neil Hodgson - 2020-12-09

    Needs a formal unit test in lexilla/examples/hypertext.

     
  • Neil Hodgson

    Neil Hodgson - 2021-08-02
    • status: open-accepted --> open-fixed
     
  • Neil Hodgson

    Neil Hodgson - 2021-08-02

    Fix committed with Lexilla issue #22.
    https://github.com/ScintillaOrg/lexilla/issues/22

     
  • Neil Hodgson

    Neil Hodgson - 2021-09-22
    • status: open-fixed --> closed-fixed
     
