I've been editing HTML documents manually for a while now, and I'm very happy to be able to put the repetitive commands into a script. I think I may be using a command incorrectly, though. I am "unwrapping" and then "rewrapping" the text to get each paragraph on its own line. The unwrapping works fine, but when I use the
editor.rereplace(r"</p> ", r"</p>\n")
command, it takes just over 10 MINUTES to execute. It performs the replacement accurately, but takes forever to do it. If I use the built-in replace command with the same parameters on the same document, it takes just under 5 SECONDS to complete. So my assumption is that I'm not using the command correctly. Can someone please point me in the right direction? Thanks!
Using version 0.9.2.0 on Notepad++ 5.9.3, Win7 64-bit.
notepad.runMenuCommand("TextFX Tools", "Unwrap Text")
editor.rereplace(r"</p> ", r"</p>\n")
How is it with editor.replace() instead of editor.rereplace()?
As you're not using any regular expression, editor.replace should be fine. It might be slightly slower than normal Notepad++ (it's got to jump in and out of python for each replacement, so it's likely to take a little bit longer than normal).
I can imagine the regular expression version takes longer, as it's got to parse the expression each time.
If it's still slow, can you give me some details about how many files you're replacing in, and how big they are?
Thanks for the quick reply!
I tried editor.replace before, but it didn't recognize the "\n" as a line feed; rather, it treated it as part of the text and put a bunch of literal "\n" in the text instead of line feeds. I assumed I needed to use rereplace for it to actually insert the line feed.
I am usually only replacing in 1 file at a time, but they can be fairly large. This particular test document is 970,000+ characters and it needed 5800 replacements.
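One likely explanation for the literal "\n" is the r prefix on the replacement string. In Python, r"\n" is a raw string holding two characters (a backslash and an n), while "\n" is a single line feed. Assuming editor.replace treats its replacement as literal text (which matches the behavior described above), dropping the r prefix should insert a real line feed. The sketch below uses plain str.replace as a stand-in for the editor call, so it only illustrates the string literals, not the plugin itself:

```python
# r"..." is a raw string: the backslash is kept, so r"\n" is two characters.
raw_replacement = r"</p>\n"
# Without the prefix, "\n" is a single line feed character.
plain_replacement = "</p>\n"

sample = "<p>one</p> <p>two</p>"

# str.replace is a literal (non-regex) replace, standing in for editor.replace.
with_raw = sample.replace("</p> ", raw_replacement)
with_plain = sample.replace("</p> ", plain_replacement)

print(repr(with_raw))    # shows a backslash followed by n between the paragraphs
print(repr(with_plain))  # shows a real line feed between the paragraphs
```

A regular expression replace interprets the backslash sequence itself, which would explain why the raw string behaves as expected with rereplace.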
I just completed another test run with the editor.replace command, and it took the same 10 minutes. A little clarification on my first post: when I said "built-in" command, I meant the replace command accessed from the "Search" menu in N++.
I was just playing around with some different commands and used:
editor.pyreplace(r"</p> ", r"</p>\n")
That worked and was actually FASTER than the built-in command. It completed the 5800 replacements before I had a chance to start/stop the watch!!
I wonder if that means that there is a bug within the editor.replace and/or editor.rereplace commands?
It seems that (per the manual) pyreplace does its replace one line at a time. If I run the pyreplace command AFTER I have unwrapped the document (so everything is on a single line), it replaces all the items in that single line at once (very fast).
If I run the same command when the document is split into multiple lines (1 line per paragraph) then it sequences through line by line and it takes significantly longer.
I hope this is helping to find a reason for this issue…and I'm not just barking up the wrong tree! :)
I suspect that SCI_FINDTEXT (the command that is used in editor.replace and editor.rereplace) moves the "gap" in Scintilla. The recommendation is to do all the "finds" in one go, and then do all the replacements, which editor.replace and editor.rereplace don't do. I suspect editor.pymlreplace will be much faster in your case, as that takes a copy of the whole buffer, does everything Python-side, and then sets the text at the end. How quickly Python itself can replace the text, I'm not sure.
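As a rough illustration of the "copy the whole buffer, do the work Python-side, set the text once" idea, here is a minimal sketch. The helper name rewrap_paragraphs is hypothetical, and the editor.getText() / editor.setText() usage shown in the comment is an assumption about how it would be wired up inside Notepad++; it has not been checked against this plugin version:

```python
def rewrap_paragraphs(text):
    """Put each paragraph on its own line by turning '</p> ' into '</p>' plus a line feed."""
    # A single literal replace over the whole string; no regular expression needed.
    return text.replace("</p> ", "</p>\n")


# Inside Notepad++ the idea would be roughly (editor is supplied by the plugin):
#     editor.setText(rewrap_paragraphs(editor.getText()))
# Outside Notepad++, the helper can be tried on an ordinary string:
unwrapped = "<p>first</p> <p>second</p> <p>third</p>"
rewrapped = rewrap_paragraphs(unwrapped)
print(rewrapped)
```

The document is read once and written once, so the editor is not asked to search and modify the buffer 5800 separate times.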
When I get some time to work on Python Script, I'll try and add some support for faster replacing.
editor.pyreplace does do things line by line, so if it's faster when it's all on one line, then I should imagine that pymlreplace would be just as fast when it's on multiple lines, as effectively that's the same thing.
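The reasoning above can be checked in plain Python. Because the pattern never spans a line break, a per-line pass and a whole-buffer pass should produce the same text; only the number of separate calls differs. This is an illustration using the standard re module with hypothetical function names, not the plugin's actual implementation:

```python
import re

PATTERN = re.compile(r"</p> ")  # compiled once, reused for every call


def replace_whole(text):
    # One substitution pass over the entire buffer.
    return PATTERN.sub("</p>\n", text)


def replace_per_line(text):
    # One substitution pass per line; splitlines(True) keeps the line endings.
    return "".join(PATTERN.sub("</p>\n", line) for line in text.splitlines(True))


doc = "<p>a</p> <p>b</p>\n<p>c</p> <p>d</p>\n"
whole = replace_whole(doc)
per_line = replace_per_line(doc)
print(whole == per_line)
```

On a document that is already split into thousands of lines, the per-line version makes thousands of small calls where the whole-buffer version makes one.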
Hope that helps!