Bug and suggestions about S/R dialog !

2013-06-16
2013-07-17
  • THEVENOT Guy

    THEVENOT Guy - 2013-06-16

    Hi Don, Dave, and all,

    For two weeks, I tested François's latest regex code, in his new SciLexer.dll.

    I noticed just two minor bugs, which I described in two posts to François, at the addresses below :

    https://sourceforge.net/p/notepad-plus/discussion/331753/thread/9f4742f6/#d4f1

    https://sourceforge.net/p/notepad-plus/discussion/331753/thread/cd39bf33/#999f


    But, to sum up, this new code seems to correctly respect all the advanced PCRE features :

    • Assertions, lookarounds, conditional blocks, recursive patterns, internal modifiers,... in SEARCH

    • Pre-defined internal named groups, lexical groups, conditional blocks, case modifiers,... in REPLACEMENT

    It, also, correctly handles Unicode characters, the NULL character and NON UTF-8 characters.

    Many thanks for that great achievement !


    Concerning the Search/Replacement interface, I would like to point out a bug, when the Regex search is invalid.

    Maybe this bug won't exist any more in the next version of N++ !!

    Suppose we're searching, for example, for the invalid regex  \x30}  ( instead of the correct regex \x{30} for the digit '0' ) :

    • If you press the "Find Next" or the "Replace" button, you get the normal message box " Invalid regular expression".

    • But, if you press any other button of the Find/Replace window, in any tab ( Count, Mark, ... ), you get the incorrect message " 0 matches. " instead, without any warning message displayed !
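A minimal Python sketch of the behaviour one would expect from a button such as Count: an invalid pattern is reported as invalid and never as a zero count. It uses Python's `re` module, not the Boost engine of N++, so the syntax rules differ and an unbalanced parenthesis stands in for the invalid regex. The function name `count_matches` and the message texts are illustrative assumptions, not N++ code.

```python
import re

def count_matches(pattern_text, subject):
    """Return a status message; an invalid pattern is never reported as '0 matches.'."""
    try:
        pattern = re.compile(pattern_text)
    except re.error as exc:
        # Report the compilation failure instead of silently counting nothing.
        return f"Invalid regular expression: {exc}"
    count = sum(1 for _ in pattern.finditer(subject))
    return f"{count} matches."

print(count_matches("(", "abc"))     # invalid pattern: error message
print(count_matches(r"\d", "a1b2"))  # valid pattern: match count
```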


      Could you also consider these four small improvements below :


    A)

    During my tests, I very often got 0 results because of leading and/or trailing spaces, typed by mistake in my search regex.

    Since the width of a space is quite small, would it be possible to spot them, by highlighting or anything else ?!

    B)

    When you choose the Regex search mode, the direction of search is automatically set to Down. It's, indeed, a wise solution. However, you can still use the SHIFT + F3 shortcut to perform a backward search, in Regex mode.

    With some simple regex patterns, the backward search is quite correct, but, with complicated patterns, especially when using lookbehinds, results may be completely wrong !

    So, when you begin a backward Regex search with the SHIFT + F3 shortcut, it would be interesting to display a warning message like, for example :

    Backward Regex search may result in unexpected results !

    C)

    Would it be possible to add the two default shortcuts below ?

    • A shortcut to immediately display the Mark tab of the Search/Replacement dialog, like the three other shortcuts CTRL + F, CTRL + H and CTRL + SHIFT + F.

    • A second one to clear all the red Mark find style, as after a mouse click on the Clear all marks button.

    D)

    When the Search/Replacement dialog is open, in the Find tab, a mouse click or a hit on the Find Next button goes to the next search.

    Would it be possible to go to the previous search, if the SHIFT key is pressed, while clicking on the Find Next button or while pressing the ENTER key ?

    Best Regards,

    guy038

     
    Last edit: THEVENOT Guy 2013-06-16
  • François-R Boyer

    Hi,

    The recursive expressions crashing N++ (in all versions of N++ using Boost Regex) is due to incorrect handling of stack overflow exceptions. Current N++ exception code does not put back the marked page to check for stack overflow, and Scintilla is compiled with /EHs instead of /EHa, thus hardware exceptions in Scintilla are not handled by current N++ code. We could also change to the non-recursive version of Boost Regex, which should not overflow stack, but it is said to be a bit slower (I have not done any testing on that).

    The problem of incorrect results in some recursive expressions seems to be in Boost Regex, not in my code. This will have to be reported to Boost.

    A) A better way to edit search/replace expressions would be to have a Scintilla editor, with "show all characters".

    B) Do you have an example where the backward search is incorrect? People might have to be warned that backward search is slow, but it should not be incorrect as it is really an iterative forward search until current position.

    D) I also think that the find next/previous shortcuts (usually F3 and shift-F3) should work when the focus is in the find window, and should find the new expression not the previous one.

    François

     
  • THEVENOT Guy

    THEVENOT Guy - 2013-06-16

    Hi, François,

    I'm really sorry to have made you doubt your excellent regex code :)

    I tried some regexes, even complicated ones, and most of them work fine, in backward regex search.

    However, in some cases, SHIFT + F3 ( backward regex search ) finds, in addition, some sub-strings of the original occurrences found in normal forward search, with the F3 key.

    This behaviour is not a real issue and, most of the time, it does match the search regex pattern !


    Just try the three examples, below, using both shortcuts F3 and SHIFT + F3, to see differences between search in the two directions !

    1) Search for  (?<=a)ba*  on the subject string below :

    aaaabaaababbbaabbabb


    2) Search for <([^<>]|(?R))*> on the subject string below :

    ---<<54<6>4>---<<123<>78>904>----<>>----<12345>----

    Try, also, the regex  <([^<>]|(?R))+>  on the same subject string


    3) Search for (abc|def|ghi){3,4} on the subject string below :

    abcabcabcabcabcabcdefdefdefdefdefdefghighighighighighi

    And, also, the regex (abc){3,4}|(def){3,4}|(ghi){3,4} on the same subject string
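For readers who cannot test in N++, here is a small Python sketch that lists the forward matches of examples 1 and 3. It relies on Python's `re` module, which is an assumption of this sketch and not the Boost engine used by N++; example 2 is left out because `re` has no `(?R)` recursion. The helper name `forward_spans` is hypothetical.

```python
import re

def forward_spans(regex, subject):
    """Return the (start, end) spans of the successive non-overlapping forward matches."""
    return [m.span() for m in re.finditer(regex, subject)]

subject_1 = "aaaabaaababbbaabbabb"
subject_3 = "abcabcabcabcabcabcdefdefdefdefdefdefghighighighighighi"

example_1 = forward_spans(r"(?<=a)ba*", subject_1)
example_3a = forward_spans(r"(abc|def|ghi){3,4}", subject_3)
example_3b = forward_spans(r"(abc){3,4}|(def){3,4}|(ghi){3,4}", subject_3)

print(example_1)   # [(4, 8), (8, 10), (10, 11), (15, 16), (18, 19)]
print(example_3a)  # [(0, 12), (12, 24), (24, 36), (36, 48)]
print(example_3b)  # [(0, 12), (18, 30), (36, 48)]
```

With the first regex of example 3, a forward match may mix the three alternatives, which gives four matches; with the second regex each match is made of a single repeated group, which gives only three.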


    As you can see, it's easier to understand while testing than while explaining !

    So set your mind at ease ! For the time being, I haven't found any real wrong behaviour of your regex code, in backward regex search.

    Of course, if I see any real search error, concerning backward search, I'll post you how to reproduce it.

    Best Regards,

    guy038

     
    Last edit: THEVENOT Guy 2013-06-18
  • François-R Boyer

    Hi Guy,

    The fact that backward search does not give the same results as forward search is by design, and my test code verifies that the differences are as intended. The problem is in N++ interface not calling the search code with the correct position. Take as example a search in "normal" mode, for "abab" in text "ababababab". According to what I think is correct, searching forward should find two matches "[abab][abab]ab", and searching backward should also find two matches "ab[abab][abab]". In forward mode, this is what N++ gives, but in backward mode it gives four matches, because search start position is placed one character left of cursor, which is at the right of previous match, while it should search from the left of previous match if we want a behavior similar to forward search.

    So, to have what I think is correct behavior, place the cursor one character right of the left of previous match before pressing shift+F3 again. In your example 3, a forward search gives the four matches "[abcabcabcabc][abcabcdefdef][defdefdefdef][ghighighighi]ghighi" and a backward search the four matches "abcabc[abcabcabcabc][defdefdefdef][defdefghighi][ghighighighi]".
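The two behaviours described above can be sketched in Python. This is not the actual SciLexer code: the selection rule in `find_backward` (keep the non-empty match ending closest to the limit, the longest one on a tie) is an assumption chosen because it reproduces the results quoted in this post, and all function names are hypothetical. Python's `re` stands in for Boost Regex.

```python
import re

def find_backward(pattern, text, limit):
    """Return the span of the non-empty match that ends closest to `limit`
    without passing it, preferring the longest one on a tie (assumed rule)."""
    best = None
    for start in range(limit + 1):
        # Anchored attempt at `start`; the match may not read past `limit`.
        m = pattern.match(text, start, limit)
        if m and m.end() > m.start() and (best is None or m.end() > best[1]):
            best = m.span()
    return best

def backward_spans(regex, text, next_limit):
    """Repeat backward searches from the end of `text`; `next_limit` maps the
    previous match span to the limit of the next search."""
    pattern = re.compile(regex)
    spans, limit = [], len(text)
    while True:
        span = find_backward(pattern, text, limit)
        if span is None:
            return spans
        spans.append(span)
        limit = next_limit(span)

def from_left_of_match(span):
    # Behaviour described as correct: resume from the left of the previous match.
    return span[0]

def from_left_of_cursor(span):
    # Behaviour attributed to the interface: one character left of the cursor,
    # the cursor being at the right of the previous match.
    return span[1] - 1

print(backward_spans("abab", "ababababab", from_left_of_match))   # [(6, 10), (2, 6)]
print(backward_spans("abab", "ababababab", from_left_of_cursor))  # [(6, 10), (4, 8), (2, 6), (0, 4)]
```

On example 3, `backward_spans(r"(abc|def|ghi){3,4}", subject, from_left_of_match)` gives the spans (42, 54), (30, 42), (18, 30) and (6, 18), that is, the four backward matches listed above.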

    François

     
  • THEVENOT Guy

    THEVENOT Guy - 2013-06-26

    Hello François,

    I understood what you meant, concerning the right behaviour of the backward search, in regex mode. And I agree with you.

    So, to perform a correct backward search, in regex mode, we must follow the manual steps below :

    • Open the S/R dialogue

    • Search, in regex mode, the regex pattern, ONLY ONCE ( in forward direction, necessarily ! )

    • Close the S/R dialogue

    • A)

    • Hit the left direction key ( selection disappears and cursor is at the start of the occurrence matched )

    • Hit the right direction key ( then, cursor is moved at the right of the start of the previous selection )

    • Perform a backward search with the short-cut SHIFT + F3

    • Go back to the point A) for another correct backward search, and so on...

    With that trick, backward search should always be OK, even with complicated regex patterns.

    François, I was wondering why it shouldn't be possible to include this automatic right move of the cursor, in case of backward regex search ?

    In natural language, a simple algorithm would be, for example :

    if Regex mode set AND hit on SHIFT + F3    // Regex backward search
    then
        if exist previous selection    // due to a previous SHIFT + F3 search or anything else !
        then
            move cursor to beginning of the selection
            Suppress the selection
        endif
        if cursor is not at the very end of the file
        then
            Cursor position = cursor position + 1
        endif
        start a backward regex search
    endif
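The algorithm above could be sketched in Python as follows. The function name `backward_search_cursor` and the representation of the selection as a `(start, end)` tuple are assumptions made for illustration; positions are plain character offsets and nothing here is N++ code.

```python
def backward_search_cursor(cursor, selection, text_length):
    """Return the cursor position from which the backward regex search is launched.

    `selection` is a (start, end) tuple, or None when nothing is selected.
    """
    if selection is not None:
        # Move the cursor to the beginning of the selection, which is then dropped.
        cursor = selection[0]
    if cursor < text_length:
        # Not at the very end of the file: move one character to the right.
        cursor += 1
    return cursor

# Previous match selected at (6, 10) in a 10-character text.
print(backward_search_cursor(10, (6, 10), 10))  # 7
# No selection and cursor already at the very end of the file.
print(backward_search_cursor(10, None, 10))     # 10
```

In the first call the result is 7; since, as François explained, the interface starts searching one character left of the cursor, the search effectively resumes at 6, the left edge of the previous match.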

    One special case will remain : when the cursor is at the very end of the current file and the selection of the first backward search also ends at the VERY END of the current file. I think that we can't do anything about it !

    For example, if the last line of a file is
    "abcabcabcabcabcabcdefdefdefdefdefdefghighighighighighi", without an EOL character, and if the cursor is at the very end, then the first SHIFT + F3 search of the regex (abc|def|ghi){3,4} would find :

    abcabcabcabcabcabcdefdefdefdefdefdefghighighighighighi

    instead of : abcabcabcabcabcabcdefdefdefdefdefdefghighighighighighi, which is the right behaviour !

    Cheers,

    Guy038

     
    Last edit: THEVENOT Guy 2013-06-26
  • Byzod

    Byzod - 2013-06-28

    Search back with perl regex is a bad idea, I suppose. Note that direction up for regex search is disabled in search dialogue. Shift + F3 should be disabled for regex mode too.

     
  • François-R Boyer

    I do not know why backward search has been disabled in dialog somewhere between N++ 6.2 and 6.3. And I do not know why it would be a "bad idea" to search backward. Of course it is much less obvious to correctly do a backward search, but all current tests done on my code are working also backward. It is the interface that is not correctly calling my code, calling it with the wrong starting position. Of course, backward search is much slower, as it is implemented as an iterative forward search, but I think it might be useful to keep it, and have the interface modified to call it correctly.

     
  • THEVENOT Guy

    THEVENOT Guy - 2013-06-29

    Hello François, Byzod,

    @ Byzod,

    I read your post and I was about to think like you ! But François answered soon after your post and, now, I think that François's arguments are correct.

    In fact, the backward search issue doesn't concern regex patterns ONLY. It also occurs in normal or extended search. This behaviour proves, without any doubt, that François's regex code is quite good and that the issue occurs at the interface level.

    @ François,

    As I said above to Byzod, let's suppose, for example, the subject string AAAAAAAA ( 8 letters 'A' ), inserted as a line, in a text.

    Then a simple backward normal search of the string AA, with a hit on SHIFT + F3 or by setting search direction to Up, is OK ( as you said ) ONLY IF, after each occurrence is found, the cursor is moved just to the right of the start position of the previous match ( with the couple : Left arrow key - Right arrow key )

    If we're searching for the string \d065\d065 in extended mode, we must use this trick, again, to obtain the good behaviour.

    Cheers,

    guy038

    P.S. :

    By the way, in ANY search mode, when you do a search AND a replacement at the same time, it's better to avoid the step by step replacement , with the Replace button. Just use the Replace All button, instead !

    For example, the search of AA and the replacement with AAA, either in forward or backward direction, changes the 4 occurrences of AA of the subject string above, giving the 12-byte string AAAAAAAAAAAA ( 4 * 3 Chars ) when you click on the Replace All button.

    If you use the Replace button ONLY, either in forward or backward direction, this specific S/R would never end !
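A short Python sketch of the difference. The post does not say why the step by step replacement would not end; one possible reading, assumed here, is that the search wraps around to the top of the text once the end is reached, so it keeps finding AA in the already replaced text. The function names and the `max_steps` safety cap are hypothetical.

```python
def replace_all(text, old, new):
    """Single pass: each original occurrence is replaced exactly once."""
    return text.replace(old, new), text.count(old)

def replace_stepwise(text, old, new, max_steps):
    """Replace one occurrence per step, wrapping around at the end of the text
    (assumption); stop only when nothing matches or after `max_steps` steps."""
    pos = steps = 0
    while steps < max_steps:
        hit = text.find(old, pos)
        if hit == -1:
            hit = text.find(old)  # wrap around to the top of the text
            if hit == -1:
                break
        text = text[:hit] + new + text[hit + len(old):]
        pos = hit + len(new)      # resume after the replacement
        steps += 1
    return text, steps

print(replace_all("AAAAAAAA", "AA", "AAA"))           # ('AAAAAAAAAAAA', 4)
print(replace_stepwise("AAAAAAAA", "AA", "AAA", 10))  # ('AAAAAAAAAAAAAAAAAA', 10)
```

Under that assumption, the stepwise version stops only because of the cap: every step adds one more A, so a match always remains.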

     
    Last edit: THEVENOT Guy 2013-06-29
  • THEVENOT Guy

    THEVENOT Guy - 2013-07-16

    Hello François,

    Because of personal and professional matters, I haven't posted anything since the end of June, except for some questions to Don, relative to new versions of N++.

    I just noticed that the recent versions of N++ ( 6.4, 6.4.1 and 6.4.2 ) don't use your regex code yet :(

    Luckily, as the file SciLexer.dll hasn't changed since the 6.3.0 version of N++, I just had to replace it with YOUR version, for better regex searches !

    So, I wondered if you are still improving your code, especially concerning regex backward search. By the way, on this matter, did you see my last two posts to you, at the addresses :

    http://sourceforge.net/p/notepad-plus/discussion/331753/thread/328af373/#a1a0

    http://sourceforge.net/p/notepad-plus/discussion/331753/thread/328af373/#7c57

    Good coding !

    Cheers,

    Guy038

     
    Last edit: THEVENOT Guy 2013-07-16
    • Don HO

      Don HO - 2013-07-17

      I just noticed that the recent versions of N++ ( 6.4, 6.4.1 and 6.4.2 ) don't use your regex code yet :(

      If any patch regarding the enhancement of PCRE is submitted, I'll integrate it in the next version. So far I haven't received any patch of yours, either by mail or in the Patch tracker, or am I missing something?

      Don

       
  • THEVENOT Guy

    THEVENOT Guy - 2013-07-17

    Hello Don,

    Concerning a possible patch of François-R Boyer, relative to regex search, I think that you didn't miss anything.

    The new regex code of François-R Boyer, which fixes some issues and adds some new features, is available at the address below :

    https://sourceforge.net/projects/npppythonplugsq/files/Beta%20N%2B%2B%20regex%20code/

    I suppose that François is still working on the problems with backward regex search. So, I think that he prefers to do some more investigations and corrections, before proposing his patch ! It's a wise decision. So, I just asked him for some news !

    By the way, the backward search issue occurs, also, in extended and normal search mode ! See my post at the address below :

    http://sourceforge.net/p/notepad-plus/discussion/331753/thread/328af373/#7c57

    Cheers,

    guy038

     
    Last edit: THEVENOT Guy 2013-07-20
