Tracker: Patches

6 fix perl edit mode bug (bug #2808363) - ID: 2827234
Last Update: Settings changed ( vanza )

The problem:

Trying to load the following perfectly legal, working Perl program will freeze jEdit. I estimate that, with the regular expression jEdit currently uses, the lookingAt method would need several CPU years to parse the transliteration line of this simple test program:

test.pl:
#!/usr/bin/perl
$a = "banana";
print "$a\n";
$a =~ tr{a}{b}; # I like the letter a but not b so translate every a to b
print "$a\n";

If you look at the bug #2808363 thread, you can see that many people have run into the same problem (my Perl example is simplified from the real-world statement, found in the field, that exposed the bug).

Anyhow, the problem is that the regular expression that handles transliteration in jEdit's perl.xml file is missing a question mark: it uses a greedy quantifier instead of a reluctant one. Since java.util.regex uses a backtracking, Nondeterministic Finite Automaton (NFA) pattern matching engine, the result is a geometric progression: every new letter added to, say, a comment at the end of this statement doubles the processing time of the parse. As I said before, I estimate that certain lines (like the one above) would take several CPU years to complete.
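To make the cost difference between the two quantifier types concrete, here is a minimal sketch that counts how many characters the engine reads during lookingAt. The two patterns are simplified, hypothetical stand-ins, not the expression from perl.xml, and the class and method names are made up for illustration. With these toy patterns the extra work is only linear; the doubling described above depends on the structure of the real expression, which is not reproduced here.

```java
import java.util.regex.Pattern;

final class BacktrackCost {
    /** CharSequence wrapper that counts how often the regex engine reads a character. */
    static final class CountingSequence implements CharSequence {
        private final String text;
        long reads = 0;

        CountingSequence(String text) { this.text = text; }

        @Override public int length() { return text.length(); }
        @Override public char charAt(int index) { reads++; return text.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) { return text.substring(start, end); }
        @Override public String toString() { return text; }
    }

    /** Runs lookingAt with the given (hypothetical) regex and returns the number of charAt calls. */
    static long readsFor(String regex, String line) {
        CountingSequence seq = new CountingSequence(line);
        Pattern.compile(regex).matcher(seq).lookingAt();
        return seq.reads;
    }

    public static void main(String[] args) {
        String line = "tr{a}{b}; # I like the letter a but not b so translate every a to b";
        // Greedy: .* first runs to the end of the line, then backs up to the last '}'.
        System.out.println("greedy reads:    " + readsFor("tr\\{.*\\}", line));
        // Reluctant: .*? grows one character at a time and stops at the first '}'.
        System.out.println("reluctant reads: " + readsFor("tr\\{.*?\\}", line));
    }
}
```

The greedy variant has to look at the whole trailing comment before it can give characters back, so its count grows with the length of the comment, while the reluctant variant never looks past the first closing brace.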

My patch simply adds one character to one file in jEdit: a missing question mark. It goes into the regular expression that handles transliteration in the perl.xml file, and it changes the greedy quantifier to a reluctant quantifier. This has absolutely no effect on the boolean value that the Matcher class's lookingAt method returns. It simply gets rid of the pointless greedy quantifier, so that lookingAt no longer hits the geometric progression with a scale factor of 2 when dealing with clean Perl code.
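As a small illustration of the point about the boolean result, the sketch below runs lookingAt with a greedy and a reluctant version of a toy pattern. Both patterns are hypothetical stand-ins chosen for readability, not the expression from perl.xml, and the class name and test line are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierDemo {
    // Hypothetical stand-in patterns; NOT the expression from perl.xml.
    static final Pattern GREEDY = Pattern.compile("tr\\{.*\\}");
    static final Pattern RELUCTANT = Pattern.compile("tr\\{.*?\\}");

    /** Returns the end offset of the lookingAt match, or -1 if lookingAt returns false. */
    static int matchEnd(Pattern p, String line) {
        Matcher m = p.matcher(line);
        return m.lookingAt() ? m.end() : -1;
    }

    public static void main(String[] args) {
        String line = "tr{a}{b}; # braces {like these} in a comment";
        // lookingAt succeeds for both patterns; only the matched span differs.
        System.out.println(line.substring(0, matchEnd(GREEDY, line)));
        System.out.println(line.substring(0, matchEnd(RELUCTANT, line)));
    }
}
```

For this line both calls to lookingAt return true. The greedy toy pattern matches through the last closing brace in the comment, while the reluctant one stops at `tr{a}`. That difference in span is a property of these simplified patterns only; the claim above is about the boolean result of the real expression.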

The bug #2808363 thread illuminates some of this, as does the Sun documentation for the Pattern and Matcher classes. This JavaWorld article is also pretty helpful for dealing with funky regular expressions in the Java regex library: http://www.javaworld.com/javaworld/jw-09-2007/jw-09-optimizingregex.html?page=1

I should note that this patch does not solve every problem related to the geometric progression of CPU time for this regular expression. It only solves it when the Perl code is good, as in my example above or in the file attached to bug #2808363 that hit this bug in the field. My one-character patch solves the problem for these two scenarios. However, if the Perl code being edited is bad, with improper backslashes in the second curly-bracket field of tr (which causes lookingAt to return false), then the geometric progression is still hit. With the existing jEdit code, jEdit can hang for several CPU years whether lookingAt returns true or false. This patch cancels the geometric progression when lookingAt returns true (good Perl code, no improper backslashes in the second curly-bracket transliteration field) but not when it returns false (broken Perl code, bad backslashes in the second curly-bracket transliteration field).


Dennis Sheil ( dennis_sheil ) - 2009-07-25 19:44:52 PDT

6

Closed

Accepted

Marcelo Vanzin

texteditor

None

Public


Comments ( 4 )

Date: 2009-08-26 22:19:59 PDT
Sender: vanza

Checked in rev #16080.


Date: 2009-08-23 10:31:00 PDT
Sender: ezust

Assigning to Marcelo, who has done work on the perl mode before.



Date: 2009-08-22 21:33:24 PDT
Sender: dennis_sheil

As I mentioned on the bug page, this bug affects programs that use the y
operator as well as the tr operator. I updated the patch (against the latest
svn version) so that the problem behavior is patched not only for the
reported tr operator, but also for the y operator.


Date: 2009-07-25 19:50:53 PDT
Sender: dennis_sheil

SourceForge apparently wraps lines before 74 characters, so you can
use the program below to test if you want (I just shortened the
comment on line 4).

test.pl:
#!/usr/bin/perl
$a = "banana";
print "$a\n";
$a =~ tr{a}{b}; # I like the letter a but not b so translate every a
print "$a\n";


Attached File ( 1 )

Filename     Description
perl.patch   Patches reported bug when in Perl mode

Changes ( 8 )

Field           Old Value                Date                      By
status_id       Open                     2009-08-26 22:20:00 PDT   vanza
close_date      -                        2009-08-26 22:20:00 PDT   vanza
resolution_id   None                     2009-08-26 22:20:00 PDT   vanza
assigned_to     nobody                   2009-08-23 10:31:01 PDT   ezust
priority        5                        2009-08-22 21:33:25 PDT   dennis_sheil
File Deleted    336614:                  2009-08-22 21:31:11 PDT   dennis_sheil
File Added      340224: perl.patch       2009-08-22 21:30:57 PDT   dennis_sheil
File Added      336614: perl.xml.patch   2009-07-25 19:44:53 PDT   dennis_sheil