#295 bug: infinite loop on search-and-replace

v3.8
closed-fixed
5
2015-03-07
2012-12-10
Anonymous
No

How to reproduce:

0. Start joe with an empty buffer.
1. Type abracadabra
2. Move the cursor to the beginning of the buffer.
3. Search-and replace the first 3 occurrences of a with X
4. Press Ctrl-<C> to abort the replace at the 4th occurrence of a.
5. Manually move the cursor, delete the last X and type an a instead.
6. Move the cursor to the beginning of the buffer.
7. Repeat the last seach-and-and-replace (Ctrl-<L>).
8. At this point joe falls to an infinite loop. The only way out is to kill it.

Here is a quick workaround patch which fixes the bug (also attached):

--- joe/usearch.c.orig 2012-12-10 22:50:28.957437060 +0100
+++ joe/usearch.c 2012-12-10 22:51:00.813781081 +0100
@@ -238,48 +238,36 @@
{
unsigned char *pattern;
P *start;
P *end;
int x;

pattern = srch->pattern;
start = pdup(p, USTR "searchf");
end = pdup(p, USTR "searchf");

- try_again:
-
for (x = 0; x != sLEN(pattern) && pattern[x] != '\\' && (pattern[x]<128 || !p->b->o.charmap->type); ++x)
if (srch->ignore)
pattern[x] = joe_tolower(p->b->o.charmap,pattern[x]);
wrapped:
while (srch->ignore ? pifind(start, pattern, x) : pfind(start, pattern, x)) {
pset(end, start);
pfwrd(end, (long) x);
if (srch->wrap_flag && start->byte>=srch->wrap_p->byte)
break;
if (pmatch(srch->pieces, pattern + x, sLEN(pattern) - x, end, 0, srch->ignore)) {
- if (end->byte == srch->last_repl) {
- /* Stuck? */
- pattern = srch->pattern;
- pset(start, p);
- if (pgetc(start) == NO_MORE_DATA)
- break;
- pset(end, start);
- goto try_again;
- } else {
srch->entire = vstrunc(srch->entire, (int) (end->byte - start->byte));
brmem(start, srch->entire, (int) (end->byte - start->byte));
pset(p, end);
prm(start);
prm(end);
return p;
- }
}
if (pgetc(start) == NO_MORE_DATA)
break;
}
if (srch->allow_wrap && !srch->wrap_flag && srch->wrap_p) {
msgnw(bw->parent, joe_gettext(_("Wrapped")));
srch->wrap_flag = 1;
p_goto_bof(start);
goto wrapped;
}
@@ -312,44 +300,33 @@

start = pdup(p, USTR "searchb");
end = pdup(p, USTR "searchb");

try_again:

for (x = 0; x != sLEN(pattern) && pattern[x] != '\\' && (pattern[x]<128 || !p->b->o.charmap->type); ++x)
if (srch->ignore)
pattern[x] = joe_tolower(p->b->o.charmap,pattern[x]);

- wrapped:
while (pbkwd(start, 1L)
&& (srch->ignore ? prifind(start, pattern, x) : prfind(start, pattern, x))) {
pset(end, start);
pfwrd(end, (long) x);
if (srch->wrap_flag && start->byte<srch->wrap_p->byte)
break;
if (pmatch(srch->pieces, pattern + x, sLEN(pattern) - x, end, 0, srch->ignore)) {
- if (start->byte == srch->last_repl) {
- /* Stuck? */
- pattern = srch->pattern;
- pset(start, p);
- if (prgetc(start) == NO_MORE_DATA)
- break;
- pset(end, start);
- goto try_again;
- } else {
srch->entire = vstrunc(srch->entire, (int) (end->byte - start->byte));
brmem(start, srch->entire, (int) (end->byte - start->byte));
pset(p, start);
prm(start);
prm(end);
return p;
- }
}
}

if (srch->allow_wrap && !srch->wrap_flag && srch->wrap_p) {
msgnw(bw->parent, joe_gettext(_("Wrapped")));
srch->wrap_flag = 1;
p_goto_eof(start);
goto wrapped;
}
srch->last_repl = -1;

Maybe there is a more complicated issue here. This patch disables the infinite loop prevention feature provided by srch->last_repl. I tried to understand the code, but I couldn't figure out an example (I even tried replacing x with xxxyyy, and then replacing x with yyyxxx in wrap-around mode) which would benefit from this feature -- maybe it doesn't prevent any infinite loops at all, and it's just dead code?

Discussion

  • Comment has been marked as spam. 
    Undo

    You can see all pending comments posted by this user  here

    Anonymous - 2012-12-10

    quick workaround patch that fixes the bug

     
  • Comment has been marked as spam. 
    Undo

    You can see all pending comments posted by this user  here

    Anonymous - 2012-12-10

    FYI This affects version the latest revision in Mercurial, version 3.7, 3.6, 3.5 and 3.4. Versions 3.3 and below are not affected, because they don't contain srch->last_repl.

     
  • Comment has been marked as spam. 
    Undo

    You can see all pending comments posted by this user  here

    Anonymous - 2012-12-10

    FYI This bug is present in the latest revision in Mercurial, versions 3.7, 3.6, 3.5 and 3.4. Versions 3.3 and below are not affected because they don't contain srch->last_repl.

     
  • Egmont Koblinger

    joe-3.7 doesn't compile with the attached patch because you accidentally removed the "wrapped:" goto-label. I attach a fixed version of the patch, the only difference is that it doesn't remove that label.

     
  • John J. Jordan

    John J. Jordan - 2015-02-21

    Might be useful to understand why this change was introduced before we go ahead and rip it out. For future reference (note to self), this was introduced by the March 5 '06 change "fix search bugs".

     
  • Egmont Koblinger

    FYI: I've been a heavy user of pts's branch for the last ~2-3 years and I've never experienced any bug with search-n-replace. I'd faced this infinite loop quite often before with mainstream joe.

    I think it's a safe bet to go ahead and apply pts's patch, even if we fail to understand the story behind.

     
  • John J. Jordan

    John J. Jordan - 2015-02-28

    The issue this originally addressed is one involving JOE's regexes and replace-all. It's been right under our nose the whole time, in hints-src.html:

    What's going on if I search for '\$' or '\+ \$' (trying to delete white space from ends of lines)?

    Remember that '\$' is a predicate- it matches a position, but does not take up any space. Also remember that '\+' matches zero or more of the character which follows. Thus '\+ \$' will find white space or nothing at the end of the line of the first line. If you try to search and replace this with an empty string, you will get an infinite loop. In JOE 3.4, this is detected, and a message is printed, but older versions of JOE would lock up.

    goto try_again;
    

    works fine if p is already at the match location (since it advances start), but not so much if starting from the beginning of the file.

    I propose this patch instead, which ought to address both problems. It's not without its problems, though. In particular, in the case of the aforementioned search/replace, if you type "n" at the replace prompt then you will keep matching the same location. That's less than desirable, but it looks like it would require a more extensive change to fix.

    EDIT: Fix escaping.

     
    Last edit: John J. Jordan 2015-02-28
  • John J. Jordan

    John J. Jordan - 2015-02-28

    I should have mentioned that the case above (answering 'n' matches the same text) occurs without the patch as well.

     
  • Egmont Koblinger

    I didn't know joe can do regexp-like stuff like \+ and \$ :) I always used \n for newline.

    Your patch introduces the regression that even the 'y' answer doesn't advance to the next occurrence.

    EDIT: Fix escaping :)

     
    Last edit: Egmont Koblinger 2015-02-28
  • John J. Jordan

    John J. Jordan - 2015-02-28

    Arg, you're right. Here's the one I originally wanted anyway :-). Basically, keep advancing p (q).

    EDIT: ehh, so this one is not so efficient, let me sleep on it... or maybe someone else gets to it.

     
    Last edit: John J. Jordan 2015-02-28
  • Egmont Koblinger

    Guess I should dig into the code too :) Well, definitely not today.

    With this patch, the original abracadabra test case skips over the 'a' that you manually reverted.

     
  • Joe Allen

    Joe Allen - 2015-03-02

    Note that the code copies p to start and advances one character. It does not write back to p except on a successful match. So you should only ever execute the "Stuck?" block one time. This means it's easy to fix with a flag:

    --- a/joe/usearch.c     Sun Mar 01 12:36:10 2015 -0500
    +++ b/joe/usearch.c     Mon Mar 02 15:57:24 2015 -0500
    @@ -240,6 +240,7 @@
            P *start;
            P *end;
            int x;
    +       int count = 0;
    
            pattern = srch->pattern;
            start = pdup(p, USTR "searchf");
    @@ -257,13 +258,14 @@
                    if (srch->wrap_flag && start->byte>=srch->wrap_p->byte)
                            break;
                    if (pmatch(srch->pieces, pattern + x, sLEN(pattern) - x, end, 0, srch->ignore)) {
    -                       if (end->byte == srch->last_repl) {
    +                       if (end->byte == srch->last_repl && !count) {
                                    /* Stuck? */
                                    pattern = srch->pattern;
                                    pset(start, p);
                                    if (pgetc(start) == NO_MORE_DATA)
                                            break;
                                    pset(end, start);
    +                               ++count;
                                    goto try_again;
                            } else {
                                    srch->entire = vstrunc(srch->entire, (int) (end->byte - start->byte));
    

    This does fix the test case and searches for "+ \$" still work properly... but I need to think on this some more. I think advancing one character is always enough to fix the zero-size regular expression problem (that's all the existing code could do..), but want to think about and test this some more to be sure.

     
  • John J. Jordan

    John J. Jordan - 2015-03-02

    Speculation here... I did some thinking on this over the weekend but didn't get to sit down and try it out (busy weekend).

    I don't know if it should be a separate bug, but fixing the replace prompt (yes, no, rest?) in the case of zero-length matches and replacement strings is definitely related to this. The check for srch->last_repl is responsible for part of its behavior as well. Somewhere there is logic that does something like this:

    if replace=yes then perform replace, advance pointer by length of replace string.
    if replace=no then advance pointer by length of match.

    And if either of those is zero, some strange things happen. In the replace=yes case, the last_repl check keeps it from misbehaving. But, it should probably be something like this:

    if replace=yes then:
    * Perform replace
    * if lengthof(replace string) = 0 and lengthof(match string = 0) then advance 1 character
    * else advance length of replace string
    if replace=no then advance pointer by max(1, length of match)

    I'm not sure whether last_repl is still needed if this logic is implemented. But I think doing the last_repl && srch->rest, in addition to the above might yield good results.

     
    Last edit: John J. Jordan 2015-03-02
  • Joe Allen

    Joe Allen - 2015-03-03

    I don't think it's good for the user to see the cursor advance one character for these null-replacements: it's weird (they then can't rely on the cursor landing at the end of the matched string).

    The "hidden variables" approach (last_repl) feels better, as ugly as it is in the code.

    Even so, I think it's broken in other ways: a search could be skipped if you switch windows, continue a replace but where the cursor happens to be at the same offset as the replacement from the different window. The buffer should be part of the last_repl...

    Also I'm thinking it should be used for searches too (cursor is stuck if you repeatedly hit Ctrl-L searching for \$).

     
  • Joe Allen

    Joe Allen - 2015-03-03

    Well I checked in a fix for this [ed8ba6]. It includes setting last_repl on plain searches so that the cursor doesn't get stuck with Ctrl-L. It still needs a more complete idea for a last search (it should include the buffer the search was in, not just the offset).

    I think this whole last_repl business is not needed for backwards search since it already advances by one character. Even so, I'm 100% sure so I left it in.

     

    Related

    Commit: [ed8ba6]

  • John J. Jordan

    John J. Jordan - 2015-03-07
    • status: open --> closed-fixed
    • assigned_to: Joe Allen
    • Group: --> v3.8
     

Log in to post a comment.

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:





No, thanks