This patch applies to the 0.7.3 source base. It was written by Eric Seppanen.
The output of "diff -ur bogofilter-0.7.3/bogofilter.c bogofilter-0.7.3eds/bogofilter.c" is attached as a
Comments from Eric Seppanen (the patch author):
Attached is my multiple-wordlist patch. This patch lays out the
infrastructure needed to allow any number of wordlists. It does not
attempt to modify the functionality or algorithms in bogofilter.
The things that can be done with this in place are:
- any number of wordlists
- any list can have its own "weight" (implemented as a multiplier, so
words on a list of weight 2.0 will have the same effect as twice as many
words on a list of weight 1.0)
- "ignore" lists that cause bogofilter to treat some words as having a
neutral probability of 0.5, useful for eliminating header fields, date
strings, or whatever else
- lists can forcefully override other lists. This is used for ignore
lists, but could also be used to create whitelists or blacklists that
override all other lists.
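
To make the interaction of weights, ignore lists, and overrides concrete, here is a minimal sketch in C. It is not taken from the patch: the names entry_t, wordlist_t, override_level, and word_probability are hypothetical, and the final ratio bad / (good + bad) is a simplified stand-in for bogofilter's real probability calculation. It assumes override levels are non-negative and that a higher level wins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only; the patch may differ. */
typedef struct {
    const char *word;
    double good;              /* occurrences seen in non-spam */
    double bad;               /* occurrences seen in spam */
} entry_t;

typedef struct {
    const char *name;
    double weight;            /* multiplier applied to this list's counts */
    int ignore;               /* nonzero: words on this list score 0.5 */
    int override_level;       /* assumed >= 0; higher levels beat lower ones */
    const entry_t *entries;
    size_t n_entries;
} wordlist_t;

/* Linear search keeps the sketch short; a real list would use a database. */
static const entry_t *lookup(const wordlist_t *list, const char *word)
{
    for (size_t i = 0; i < list->n_entries; i++)
        if (strcmp(list->entries[i].word, word) == 0)
            return &list->entries[i];
    return NULL;
}

/* Combine every list that knows the word into one probability. */
double word_probability(const wordlist_t *lists, size_t nlists,
                        const char *word)
{
    double good = 0.0, bad = 0.0;
    int best = -1;            /* highest override level seen so far */
    int ignored = 0;

    assert(lists != NULL && word != NULL);

    for (size_t i = 0; i < nlists; i++) {
        const entry_t *e = lookup(&lists[i], word);
        if (e == NULL || lists[i].override_level < best)
            continue;         /* word absent, or list is outranked */
        if (lists[i].override_level > best) {
            /* This list outranks everything accumulated so far. */
            best = lists[i].override_level;
            good = bad = 0.0;
            ignored = 0;
        }
        if (lists[i].ignore)
            ignored = 1;
        good += lists[i].weight * e->good;
        bad  += lists[i].weight * e->bad;
    }

    if (ignored || good + bad == 0.0)
        return 0.5;           /* ignored or unknown word: neutral */
    return bad / (good + bad);
}
```

With this layout, a whitelist or blacklist is an ordinary list with a high override_level, and an ignore list is the same thing with the ignore flag set. One spam hit on a list of weight 2.0 counts the same as two hits on a list of weight 1.0.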
An obvious companion to this patch would be allowing a "bogofilterrc" file
where a user could specify their own set of wordlists. I haven't written
such a beast yet, because of the likely style issues involved with
choosing a format and parser.
I'm not sure how well this gels with what other people are working on, so
I'm posting it here for comments.