Hello, François,
Oops, I must have had a problem when creating this new topic!
First of all, please have a look at my last two posts, written after I downloaded your NEW SciLexer.dll, at these addresses:
https://sourceforge.net/p/notepad-plus/discussion/331753/thread/9f4742f6/#d8e5
https://sourceforge.net/p/notepad-plus/discussion/331753/thread/9f4742f6/#d4f1
The latter post describes a small problem about highlighting with the Find Mark style (not an important issue!).
So, I have finished all the tests of the regex search/replacement described in my last post, and I'm glad to tell you that, overall, everything seems OK :)
In addition to the small bug described in my last post, I just noticed another issue, concerning recursive patterns.
IMPORTANT: this issue occurs both in the current version of N++ (and certainly in earlier ones!) and in your new code!
Let us consider the subject string below, in a new file:
---<54<6>4>---<<123<>78>904>----<>----<12345>----
Searching for the regex
<([^<>]|(?R))*>
finds the longest sequence <.....>, possibly multi-line and/or EMPTY, that contains ONLY WELL-nested other sequences <...>, themselves possibly multi-line and/or EMPTY.
Thus, four strings are found, both with N++ and with the RegEx Helper plug-in: <54<6>4>, <<123<>78>904>, <> and <12345>
Now, consider the regex
<([^<>]|(?R))+>
(The only change is the replacement of the star symbol * with the plus sign +, before the last symbol >.)
Normally, this regex should find the longest NON-EMPTY sequence <.....>, possibly multi-line, that contains ONLY WELL-nested other NON-EMPTY sequences <...>, themselves possibly multi-line.
However, with N++, three strings are found: <54<6>4>, <<123<>78>904> and <12345>
The second string, <<123<>78>904>, should not have been found, because it contains the empty sequence <>! It seems that the + quantifier is honoured only outside the recursion phase!
But with the RegEx Helper plug-in, only two strings are found: <54<6>4> and <12345>
That is the correct behaviour!
What do you think?
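To double-check the expected results without relying on either engine, here is a minimal Python sketch that emulates the two recursive patterns by hand. Python's standard `re` module has no `(?R)`, so this is a small recursive-descent matcher, not the N++ or RegEx Helper engine; the names `match_at` and `find_all` are hypothetical, written only for this illustration. The parameter `min_items` is 0 for the `*` form and 1 for the `+` form.

```python
def match_at(text, pos, min_items):
    """Emulate <([^<>]|(?R))*> (min_items=0) or <([^<>]|(?R))+> (min_items=1).

    Return the index just past the match starting at pos, or -1 if none.
    """
    if pos >= len(text) or text[pos] != "<":
        return -1
    i = pos + 1
    items = 0
    while i < len(text):
        ch = text[i]
        if ch == "<":
            # Nested sequence: recurse with the same rule, like (?R)
            end = match_at(text, i, min_items)
            if end == -1:
                break
            i = end
        elif ch == ">":
            break
        else:
            # Any character other than < and >, like [^<>]
            i += 1
        items += 1
    if items >= min_items and i < len(text) and text[i] == ">":
        return i + 1
    return -1


def find_all(text, min_items):
    """Scan left to right and collect non-overlapping matches."""
    results = []
    pos = 0
    while pos < len(text):
        end = match_at(text, pos, min_items)
        if end == -1:
            pos += 1
        else:
            results.append(text[pos:end])
            pos = end
    return results


subject = "---<54<6>4>---<<123<>78>904>----<>----<12345>----"
star_matches = find_all(subject, 0)
plus_matches = find_all(subject, 1)
print(star_matches)  # ['<54<6>4>', '<<123<>78>904>', '<>', '<12345>']
print(plus_matches)  # ['<54<6>4>', '<12345>']
```

With the + form, the inner <> fails at every nesting level, so the enclosing <<123<>78>904> cannot match either, which agrees with the two strings reported by RegEx Helper.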
Many thanks, again, for the corrections and improvements to the regex search/replacement engine!
I created a new topic about a specific bug and some improvements concerning the Search/Replacement interface, at the address:
https://sourceforge.net/p/notepad-plus/discussion/331753/thread/328af373/#e087
Best Regards,
guy038
P.S.
By the way, I also tested your new character class [[:inval:]]. It works fine!
I just wrote an accented character, such as é, in a dummy file. In UTF-8, it is normally encoded with the two bytes \xc3 and \xa9.
Then, with another small search/replace editor, I replaced the first byte with, for example, the byte \xc1, which is always a forbidden value in a UTF-8 file.
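The corruption step can be reproduced with a short Python sketch, assuming Python's own strict UTF-8 decoder as the judge of validity (it is not necessarily identical to what [[:inval:]] tests). The helper name `invalid_utf8_bytes` is hypothetical; it relies on the standard `surrogateescape` error handler, which maps each undecodable byte 0xNN to the code point U+DCNN.

```python
def invalid_utf8_bytes(data):
    """Return the byte values that Python's UTF-8 decoder rejects in data."""
    decoded = data.decode("utf-8", errors="surrogateescape")
    # Undecodable bytes 0x80..0xFF come back as U+DC80..U+DCFF
    return [ord(ch) - 0xDC00 for ch in decoded if 0xDC80 <= ord(ch) <= 0xDCFF]


valid = "é".encode("utf-8")        # the two bytes \xc3 \xa9
corrupted = b"\xc1" + valid[1:]    # first byte replaced by \xc1

print(valid.hex(), invalid_utf8_bytes(valid))          # c3a9 []
print(corrupted.hex(), invalid_utf8_bytes(corrupted))  # c1a9 [193, 169]
```

Both bytes of the corrupted pair are reported: \xc1 can never start a sequence, and \xa9 is then a continuation byte with no valid lead byte before it.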
Thus, in N++, this file was displayed with the symbols xC1 and xA9, and these bytes were correctly found with your [[:inval:]] form.
François, the tiny search/replace editor, mtr.exe, that I mentioned above may interest you, especially for huge batch search/replacements involving hundreds of files and/or hundreds of simultaneous searches!
This very powerful tool, called "Minitrue" v2.0.6, combines a text viewer, a "grep" utility, a "less" pager utility and a fast search/replacement program, with support for regular expressions!
Generally, this program is launched in a DOS session, but all actions can be stored in batch files.
In addition, the list of files to scan and/or the list of strings to search for and, if needed, the list of replacements to make can all be stored in text files.
Although its regex syntax is a bit less powerful than the N++ PCRE syntax, it has some other interesting program options and regex features of its own!
You can download it at the address :
http://adoxa.3eeweb.com/minitrue
However, it's better to place this program directly at the root of a drive, to avoid problems with the total length of the paths to the files! Of course, file names containing spaces must be enclosed in double quotes.
The home page of the (productive!) author, Jason Hoods, is at the address:
http://adoxa.3eeweb.com
After downloading, just have a look at the fourteen examples at the end of his tutorial, shown with the -? help option, to be really convinced :)
Also, open a file and hold down the TAB key => the two views of the file, normal and hexadecimal, seem to be displayed truly simultaneously!!! Very efficient code :)
Last edit: THEVENOT Guy 2013-06-18