#1528 Perl: multibyte characters break POD

Bug
closed-fixed
5
2013-10-15
2013-09-18
No

The Perl lexer still has some issues related to character vs. byte handling, especially in POD verbatim handling.

See [geany:bugs:#995]. Also related to [#1483] of course.

Related

Geany: Bugs: #995
Bugs: #1483

Discussion

  • Kein-Hong Man - 2013-09-19

    I want to test the HereDoc and POD changes a bit. When cutting and pasting some UTF-8 text into some POD examples, SciTE locked up.

    As for the rest, thanks, I guess it's better to code it correctly. I was being undisciplined and assumed the characteristics of UTF-8 and the supported DBCS charsets. :-p

     
  • Kein-Hong Man - 2013-10-09

    0002-Perl-fix-a-few-suspicious-characters-vs-bytes-movements

    This patch is okay. The last 5 changes in the patch use the correct call; previously I had assumed that operators and whitespace are always encoded as single bytes in DBCS and UTF-8.

    The first change also makes the correct byte-oriented call, but needs a few additional lines so that HEREDOC quoted delimiters are highlighted correctly for extended charsets. Runtime-wise, a UTF-8 HEREDOC quoted delimiter works on Cygwin perl; I didn't try DBCS though. Unquoted ones didn't work properly.

    The attachment contains the additional changes for handling HEREDOC quoted delimiters, plus some test cases.
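
    To illustrate the character vs. byte distinction discussed above, here is a minimal sketch, assuming valid UTF-8 input. The names `Utf8Width` and `AdvanceChars` are hypothetical helpers written for this illustration and are not part of Scintilla or the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Width in bytes of the UTF-8 sequence that starts with lead byte b.
// Assumption: input is valid UTF-8; stray bytes are treated as width 1.
inline std::size_t Utf8Width(unsigned char b) {
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;
}

// Returns the byte position reached after moving n characters from pos.
inline std::size_t AdvanceChars(const std::string &s, std::size_t pos, std::size_t n) {
    while (n > 0 && pos < s.size()) {
        pos += Utf8Width(static_cast<unsigned char>(s[pos]));
        --n;
    }
    return std::min(pos, s.size());
}

// A POD-like line with two 2-byte characters: 10 characters, 12 bytes.
inline void DemonstrateMismatch() {
    const std::string pod = "=head1 \xC3\xA9t\xC3\xA9";
    assert(pod.size() == 12);
    // Moving 8 characters lands on byte 9, whereas moving 8 bytes
    // would land on byte 8, in the middle of a 2-byte character.
    assert(AdvanceChars(pod, 0, 8) == 9);
    assert(AdvanceChars(pod, 0, 10) == 12);
}
```

    A byte offset computed by a scanner therefore cannot be handed to a character-oriented movement call (or the reverse) once the text contains multibyte characters.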

     
  • Kein-Hong Man - 2013-10-10

    0001-Perl-fix-handling-of-PODs-containing-multi-byte-characters

    After patching, a lockup occurs when scrolling down a file with POD (attached). There is no lockup when jumping to the end of the file.

    The failure mode is like this: currentPos=1385, endPos=1387, fw=1388. So sc.ForwardBytes() gets fw-currentPos => 3, but since the POD forward scanner generates positions where fw > endPos, sc.ForwardBytes() gets locked into an infinite loop.

    I'm fixing the POD forward scanner and preparing multibyte tests now.
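
    The failure mode above can be modeled with a small sketch. This is a toy model and not Scintilla's StyleContext: `ToyContext`, `ForwardBytesNaive`, `ForwardBytesClamped`, and the `maxSteps` guard are hypothetical names, and the model assumes only that a forward step never moves past endPos and that the byte-forwarding loop runs until the target position is reached.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Toy model of a styling cursor (an assumption, not Scintilla code).
struct ToyContext {
    std::size_t currentPos;
    std::size_t endPos;

    // Advances one byte, but never past endPos.
    void Forward() {
        if (currentPos < endPos) {
            ++currentPos;
        }
    }

    // Loops until the target is reached. If the target lies beyond endPos,
    // Forward() stops making progress and the loop would never end; maxSteps
    // exists only so this demo terminates. Returns false when the guard trips.
    bool ForwardBytesNaive(std::size_t nb, std::size_t maxSteps) {
        const std::size_t target = currentPos + nb;
        std::size_t steps = 0;
        while (target > currentPos) {
            if (steps++ == maxSteps) {
                return false;
            }
            Forward();
        }
        return true;
    }

    // Variant that clamps the target to endPos, so it always terminates.
    void ForwardBytesClamped(std::size_t nb) {
        const std::size_t target = std::min(currentPos + nb, endPos);
        while (target > currentPos) {
            Forward();
        }
    }
};

// Reproduces the reported numbers: currentPos=1385, endPos=1387, fw=1388.
inline void DemonstrateLockup() {
    ToyContext naive{1385, 1387};
    assert(!naive.ForwardBytesNaive(1388 - 1385, 1000)); // guard trips
    assert(naive.currentPos == 1387);                    // stuck at endPos

    ToyContext clamped{1385, 1387};
    clamped.ForwardBytesClamped(1388 - 1385);
    assert(clamped.currentPos == 1387);                  // terminates
}
```

    In this model the lockup can be avoided either by making the scanner never produce a position beyond endPos, or by clamping inside the forwarding loop; which of the two is appropriate for the real lexer is a separate design decision.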

     
    • Kein-Hong Man - 2013-10-10

      0001-Perl-fix-handling-of-PODs-containing-multi-byte-characters

      Fix for POD scan lockup, plus some test cases. Didn't touch sc.ForwardBytes(), but it's a potential lockup hazard.

       
      • Neil Hodgson - 2013-10-10

        Committed as [658206].

        It may be safe to stop ForwardBytes at endPos, but there may be unforeseen effects, so it is likely better to make that change after the release of 3.3.6.

         

        Related

        Commit: [658206]

  • Neil Hodgson - 2013-10-10
    • status: open --> open-fixed
    • assigned_to: Neil Hodgson
     
  • Neil Hodgson - 2013-10-15
    • status: open-fixed --> closed-fixed
     
