When doing very long semi-automatic replacements, you may have to kill the bot and start it again, which means answering "no" again to all the unwanted replacements. It is even worse if you're using an XML dump: it can be several weeks old, and it will make you download lots of pages that were ALREADY corrected.
This patch consists of two parts:
1) a patch to replace.py that adds a new parameter, "-exclude", which accepts a path to a file used both for:
-> knowing which articles to exclude from substitution
-> logging pages where a replacement was denied, and pages already known not to need any replacement
2) a patch to pagegenerators.py that adds a generator filter, able to yield only pages not appearing in a given list (a rough sketch follows below)
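To make part 2 concrete, here is a minimal sketch of what such a filter generator could look like; the function name ExcludingPageGenerator and the one-title-per-line UTF-8 file format are illustrative assumptions, not necessarily what the patch actually uses:

    import codecs
    import os

    def ExcludingPageGenerator(generator, filename):
        # Wrap another page generator and yield only pages whose
        # titles are not listed in the exclusion file (assumed
        # format: one title per line, UTF-8).
        excluded = set()
        if os.path.exists(filename):
            f = codecs.open(filename, 'r', 'utf-8')
            try:
                for line in f:
                    title = line.strip()
                    if title:
                        excluded.add(title)
            finally:
                f.close()
        for page in generator:
            if page.title() not in excluded:
                yield page

Written that way, the filter can wrap any existing generator (XML dump, category, whatever), so replace.py doesn't need to know where the pages come from.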
The only doubt I have is: should replace.py log in some other way? XML? The wikipedia module's predefined functions? Logging to a given wiki userpage (so that logs can easily be shared)?
As I've done it, it needs to import the os and codecs modules... I don't know if that's a problem.
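For what it's worth, the codecs dependency is just there for writing non-ASCII titles safely; the logging side could be as simple as this (again only a sketch, with an invented helper name):

    def log_excluded_title(filename, title):
        # Append a denied/unneeded page title to the exclusion file,
        # encoded as UTF-8 so non-ASCII titles survive the round trip.
        f = codecs.open(filename, 'a', 'utf-8')
        try:
            f.write(title + '\n')
        finally:
            f.close()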
Anyway, a patch like this is really needed; I can try to improve it further if necessary.