#271 fix perl edit mode bug (bug #2808363)

Status: closed-accepted
Priority: 6
Updated: 2009-08-27
Created: 2009-07-26
Creator: Dennis Sheil
Private: No

The problem:

Loading the following perfectly legal, working Perl program freezes jEdit. I estimate that, with the regular expression jEdit currently uses, the lookingAt method would need several CPU years to parse the transliteration line of this simple test program:

test.pl:
#!/usr/bin/perl
$a = "banana";
print "$a\n";
$a =~ tr{a}{b}; # I like the letter a but not b so translate every a to b
print "$a\n";

If you look at the bug #2808363 thread, you can see that many people have run into the same problem (my Perl example is a simplified version of the real-world statement that exposed the bug in the field).

The problem is that the regular expression handling transliteration in jEdit's perl.xml file is missing a question mark, so it uses a greedy quantifier where a reluctant one is needed. Because java.util.regex is a backtracking matcher based on a nondeterministic finite automaton (NFA), the number of paths it tries grows as a geometric progression: every character added after the statement, for example in a trailing comment, doubles the time spent parsing the line. As I said above, I estimate that certain lines (like the one in the example) would take several CPU years to complete.

My patch adds a single character to a single file in jEdit: the missing question mark, in the regular expression in perl.xml that handles this transliteration. The question mark changes the greedy quantifier into a reluctant one. This has no effect on the boolean that the Matcher class's lookingAt method returns. It simply removes the pointless greedy quantifier, so that lookingAt no longer hits the geometric progression with a common ratio of 2 when dealing with clean Perl code.
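
To illustrate the claim about lookingAt, here is a minimal Java sketch. The two patterns are hypothetical stand-ins shaped like a tr{...}{...} rule, not the expressions actually shipped in perl.xml, and the class and method names are made up for the example. The comment on the test line is kept short on purpose so that the greedy variant finishes quickly.

```java
import java.util.regex.Pattern;

final class QuantifierDemo {
    // Hypothetical patterns (assumption): NOT the expressions from jEdit's perl.xml.
    // They differ only in the quantifier on the group: "*" (greedy) vs "*?" (reluctant).
    static final Pattern GREEDY =
            Pattern.compile("tr\\{.*?\\}\\{(?:.*?[^\\\\])*\\}");
    static final Pattern RELUCTANT =
            Pattern.compile("tr\\{.*?\\}\\{(?:.*?[^\\\\])*?\\}");

    /** Returns true if the pattern matches a prefix of the line. */
    static boolean looksAt(Pattern pattern, String line) {
        return pattern.matcher(line).lookingAt();
    }

    public static void main(String[] args) {
        String[] lines = {
            "tr{a}{b}; # ok", // well-formed transliteration
            "tr{a}{b\\}"      // second field ends in a backslash, so no valid close
        };
        for (String line : lines) {
            System.out.println(line
                    + "  greedy=" + looksAt(GREEDY, line)
                    + "  reluctant=" + looksAt(RELUCTANT, line));
        }
    }
}
```

Both variants report true for the well-formed line and false for the line whose second field ends in a backslash. The quantifier only changes the order in which the matcher tries the alternatives.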

The bug #2808363 thread sheds some light on this, as does the Sun documentation for the Pattern and Matcher classes. This JavaWorld article is also helpful for dealing with problematic regular expressions in java.util.regex: http://www.javaworld.com/javaworld/jw-09-2007/jw-09-optimizingregex.html?page=1

I should note that this patch does not solve every case of geometric growth in CPU time for this regular expression. It only solves it when the Perl code is good, as in my example above or in the file attached to bug #2808363 that hit this bug in the field. However, if the Perl code being edited is bad, with improper backslashes in the second curly-bracket field of tr so that lookingAt returns false, the geometric growth is still hit. With the existing jEdit code, jEdit can hang for several CPU years whether lookingAt returns true or false. This patch removes the geometric growth when lookingAt returns true (good Perl code, no improper backslashes in the second curly-bracket transliteration field) but not when it returns false (broken Perl code, bad backslashes in that field).
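
The asymmetry between the two cases can be sketched with a toy backtracking matcher that counts its own steps. This is only a model under stated assumptions: it is not how java.util.regex is implemented, it matches a made-up rule "(one or more characters)* followed by }" rather than the perl.xml expression, and its failing input is simply a line with no closing bracket, standing in for the bad-backslash case. All names are hypothetical.

```java
final class BacktrackModel {
    private static long steps;

    /**
     * Toy matcher for "(SEG)*}" anchored at pos, where SEG is one or more
     * arbitrary characters. The greedy flag decides whether the loop tries
     * to continue first (greedy) or to stop first (reluctant).
     */
    private static boolean match(String s, int pos, boolean greedy) {
        steps++;
        boolean closesHere = pos < s.length() && s.charAt(pos) == '}';
        // Reluctant: try to leave the loop before consuming another segment.
        if (!greedy && closesHere) {
            return true;
        }
        // Consume one more segment, shortest first, then recurse.
        for (int end = pos + 1; end <= s.length(); end++) {
            if (match(s, end, greedy)) {
                return true;
            }
        }
        // Greedy: leave the loop only after every longer attempt has failed.
        return greedy && closesHere;
    }

    /** Number of match attempts the toy matcher makes on the given input. */
    static long countSteps(String s, boolean greedy) {
        steps = 0;
        match(s, 0, greedy);
        return steps;
    }

    public static void main(String[] args) {
        StringBuilder tail = new StringBuilder("xxxxxxxx"); // 8 trailing characters
        for (int n = 8; n <= 10; n++) {
            String good = "b}" + tail; // closing bracket present: match succeeds
            String bad = "b" + tail;   // no closing bracket: match fails
            System.out.println("n=" + n
                    + " good/greedy=" + countSteps(good, true)
                    + " good/reluctant=" + countSteps(good, false)
                    + " bad/greedy=" + countSteps(bad, true)
                    + " bad/reluctant=" + countSteps(bad, false));
            tail.append('x');
        }
    }
}
```

In this model, with 10 trailing characters the greedy loop needs 2049 steps on the matching input while the reluctant loop needs 2, and each extra character roughly doubles the greedy count. On the non-matching input both variants need 2048 steps, because a backtracking matcher has to exhaust every split before it can report failure.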

Discussion

  • Dennis Sheil
    2009-07-26

    SourceForge apparently wraps lines before 74 characters, so if you want to test, you can use the program below instead (I only shortened the comment on line 4 so that it fits).

    test.pl:
    #!/usr/bin/perl
    $a = "banana";
    print "$a\n";
    $a =~ tr{a}{b}; # I like the letter a but not b so translate every a
    print "$a\n";

     
  • Dennis Sheil
    2009-08-23

    Patches reported bug when in Perl mode

     
    Attachments
  • Dennis Sheil
    2009-08-23

    As I mentioned on the bug page, this bug affects programs that use the y operator as well as the tr operator. I updated the patch (against the latest svn version) so that it fixes the problem for the y operator as well as the reported tr operator.

     
  • Dennis Sheil
    2009-08-23

    • priority: 5 --> 6
     
  • Alan Ezust
    2009-08-23

    Assigning to Marcelo, who has done work on the perl mode before.

     
  • Alan Ezust
    2009-08-23

    • assigned_to: nobody --> vanza
     
  • Marcelo Vanzin
    2009-08-27

    Checked in rev #16080.

     
  • Marcelo Vanzin
    2009-08-27

    • status: open --> closed-accepted