I want to test the HereDoc and POD changes a bit more. When I cut and pasted some UTF-8 text into some POD examples, SciTE locked up.
As for the rest, thanks, I guess it's better to code it correctly. I was being undisciplined and relied on the characteristics of UTF-8 and the supported DBCS charsets. :-p
This patch is okay. The last 5 changes in the patch use the correct call, whereas previously I had assumed that operators and whitespace are always single-byte in DBCS and UTF-8.
The first change also makes the correct byte-oriented call, but it needs a few additional lines so that HEREDOC quoted delimiters are highlighted correctly for extended charsets. At runtime, a UTF-8 HEREDOC quoted delimiter works on Cygwin perl; I didn't try DBCS, though. Unquoted ones didn't work properly.
The attachment contains the additional changes for handling HEREDOC quoted delimiters, plus some test cases.
After patching, a lockup occurs when scrolling down a file with POD (attached). There is no lockup when jumping to the end of the file.
The failure mode looks like this: currentPos=1385, endPos=1387, fw=1388. sc.ForwardBytes() is called with fw-currentPos = 3, but because the POD forward scanner generates positions where fw > endPos, sc.ForwardBytes() gets stuck in an infinite loop.
I'm fixing the POD forward scanner and preparing multibyte tests now.
Patch to fix POD with multi-byte characters.
Patch to fix a few other suspicious uses of characters vs. bytes movements.
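To illustrate why character movements and byte movements diverge once multi-byte text appears in a POD line, here is a minimal, self-contained sketch. It is not code from the patches; `IsUtf8Continuation`, `CountCodePoints` and `OffsetAfterChars` are hypothetical helpers, and the input is assumed to be well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// True for UTF-8 continuation bytes (bit pattern 10xxxxxx).
inline bool IsUtf8Continuation(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

// Count code points in a well-formed UTF-8 string by counting
// the bytes that start a character (hypothetical helper).
std::size_t CountCodePoints(const std::string &s) {
    std::size_t n = 0;
    for (unsigned char b : s) {
        if (!IsUtf8Continuation(b)) {
            ++n;
        }
    }
    return n;
}

// Byte offset reached after stepping `chars` whole characters
// forward from byte offset `start` (hypothetical helper).
std::size_t OffsetAfterChars(const std::string &s, std::size_t start, std::size_t chars) {
    std::size_t pos = start;
    while (chars > 0 && pos < s.size()) {
        ++pos;  // lead byte (or a single-byte character)
        while (pos < s.size() && IsUtf8Continuation(static_cast<unsigned char>(s[pos]))) {
            ++pos;  // skip the trailing bytes of the same character
        }
        --chars;
    }
    return pos;
}
```

For a line such as `=head1 été` encoded as UTF-8, stepping 3 characters from byte offset 7 ends at a different offset than stepping 3 bytes, which is the kind of mismatch that makes mixing the two movements risky.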
0002-Perl-fix-a-few-suspicious-characters-vs-bytes-movements
Committed as [630f58].
Related
Commit: [630f58]
0001-Perl-fix-handling-of-PODs-containing-multi-byte-characters
0001-Perl-fix-handling-of-PODs-containing-multi-byte-characters
Fix for POD scan lockup, plus some test cases. Didn't touch sc.ForwardBytes(), but it's a potential lockup hazard.
Committed as [658206].
It may be safe to stop ForwardBytes at endPos, but there may be unforeseen effects, so it is likely better to make that change after the release of 3.3.6.
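The hazard and one possible guard can be sketched with a simplified model. This is not Scintilla's actual StyleContext: `Cursor` and its members are hypothetical stand-ins, each character is treated as one byte, and `Forward()` is assumed to refuse to move past endPos, which is what the reported lockup implies. The unguarded variant carries an artificial step cap (`maxSteps`) only so that the demonstration terminates.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for a lexer cursor.
struct Cursor {
    std::size_t currentPos;
    std::size_t endPos;

    // Advance one byte, but never past endPos (assumed behaviour).
    void Forward() {
        if (currentPos < endPos) {
            ++currentPos;
        }
    }

    // Loop shaped like the one in the report: keep calling Forward()
    // until the target is reached. Returns false if the step cap is hit,
    // i.e. the uncapped loop would never terminate.
    bool ForwardBytesUnguarded(std::size_t nb, std::size_t maxSteps) {
        const std::size_t target = currentPos + nb;
        std::size_t steps = 0;
        while (target > currentPos) {
            if (steps++ == maxSteps) {
                return false;
            }
            Forward();
        }
        return true;
    }

    // One possible guard: give up as soon as Forward() makes no progress.
    void ForwardBytesGuarded(std::size_t nb) {
        const std::size_t target = currentPos + nb;
        while (target > currentPos) {
            const std::size_t before = currentPos;
            Forward();
            if (currentPos == before) {
                return;  // stuck at endPos; stop instead of spinning
            }
        }
    }
};
```

With the reported positions (currentPos=1385, endPos=1387, a request of 3 bytes), the unguarded loop stalls at 1387 and never reaches 1388, while the guarded version stops at endPos. Whether stopping there has side effects on callers is exactly the open question raised above.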
Related
Commit: [658206]