Change 'spamicity' to 'bogosity'
Fast Bayesian spam filter along lines suggested by Paul Graham
With the change of the default spam state descriptors from
Yes/Unsure/No to Spam/Unsure/Ham, filters that do a
simple search for the string 'spam' will move all
messages to the junk bin (or wherever), since 'spam' is in
the word 'spamicity', which is always present. Changing the
word 'spamicity' to 'bogosity' would avoid that potential
data-loss issue.
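To illustrate the problem, here is a minimal Python sketch. The header values are made-up samples that follow the default layout quoted later in this thread, and `naive_is_spam` is a hypothetical stand-in for a simple case-insensitive "contains" rule, not code from bogofilter or any mail client.

```python
# Sample X-Bogosity values in the default layout; scores are invented.
SAMPLE_HEADERS = {
    "spam": "Spam, tests=bogofilter, spamicity=0.999999, version=1.2.3",
    "ham": "Ham, tests=bogofilter, spamicity=0.000001, version=1.2.3",
    "unsure": "Unsure, tests=bogofilter, spamicity=0.500000, version=1.2.3",
}


def naive_is_spam(header_value: str) -> bool:
    """Hypothetical filter: case-insensitive search for the string 'spam'."""
    return "spam" in header_value.lower()


# Every message is flagged, because 'spamicity' itself contains 'spam'.
results = {label: naive_is_spam(value) for label, value in SAMPLE_HEADERS.items()}
print(results)
```

All three sample messages are flagged, which is the data-loss scenario described above.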
Logged In: YES
user_id=30510
David,
As you say, simply checking for "spam" is a bad idea.
Having your filter check for "X-Bogosity: Spam" should do
what you want.
Alternatively, look at the "header_format" option in
bogofilter.cf. You can change the text of the "X-Bogosity:"
line to most anything you want.
Logged In: YES
user_id=2788
I object to changing the defaults again just yet - and the
default is X-Bogosity. I wonder if you have an obsolete
bogofilter.cf file lying on the floor that you stumble across.
Try fewer wildcards, I use /^X-Bogosity: Spam,
tests=bogofilter/ for instance - that's safe enough and
works with the bogofilter defaults.
You can view the defaults with: bogofilter -CQQ
I'm reclassifying this as a feature request, which is what it is.
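For filters that see the raw header lines (as procmail-style tools do), the anchored pattern suggested above can be sketched in Python as follows. The sample messages and the helper name `is_spam` are assumptions for illustration only.

```python
import re

# The pattern from the comment above, anchored at the start of a header line.
BOGOSITY_SPAM = re.compile(r"^X-Bogosity: Spam, tests=bogofilter", re.MULTILINE)


def is_spam(raw_headers: str) -> bool:
    """Return True only if an X-Bogosity line starts with the Spam tag."""
    return BOGOSITY_SPAM.search(raw_headers) is not None


# Invented sample headers; only the X-Bogosity layout follows the thread.
spam_msg = (
    "From: someone@example.com\n"
    "Subject: hello\n"
    "X-Bogosity: Spam, tests=bogofilter, spamicity=0.999999, version=1.2.3\n"
)
ham_msg = (
    "From: someone@example.com\n"
    "Subject: cheap spam offers discussed\n"
    "X-Bogosity: Ham, tests=bogofilter, spamicity=0.000001, version=1.2.3\n"
)

print(is_spam(spam_msg), is_spam(ham_msg))
```

Because the pattern includes the header name and the tag together, neither 'spamicity' nor a stray 'spam' in another header can trigger it.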
Logged In: YES
user_id=617423
Huh? Good grief. I'm using KMail, not procmail. You can't set the
filters to look for "X-Bogosity" since that is the name of the
header. Instead it looks through the contents of the specified
header. Basically, we can no longer have a KMail filter of the type
"header 'X-Bogosity' contains 'spam'". Instead we have to have
one more like "header 'X-Bogosity' matches regexp '\bspam\b'",
which is a heck of a lot less clear to average users. Hence the
request to change "spamicity" to "bogosity" as the first would
become viable again.
See KDE bug 93040 for more on how this came about.
http://bugs.kde.org/show_bug.cgi?id=93040
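The difference between the two rule types can be sketched in Python. This assumes the filter only sees the header's value (not its name) and matches case-insensitively; `contains_rule` and `regexp_rule` are hypothetical helpers, not KMail APIs.

```python
import re


def contains_rule(value: str, needle: str) -> bool:
    """Model of a "header contains" rule, assumed case-insensitive."""
    return needle.lower() in value.lower()


def regexp_rule(value: str, pattern: str) -> bool:
    """Model of a "header matches regexp" rule, assumed case-insensitive."""
    return re.search(pattern, value, re.IGNORECASE) is not None


# Invented sample values in the default layout.
ham_value = "Ham, tests=bogofilter, spamicity=0.000001, version=1.2.3"
spam_value = "Spam, tests=bogofilter, spamicity=0.999999, version=1.2.3"

# "contains 'spam'" misfires on ham because of 'spamicity'.
print(contains_rule(ham_value, "spam"))
# The word-boundary regexp matches only the standalone tag.
print(regexp_rule(ham_value, r"\bspam\b"), regexp_rule(spam_value, r"\bspam\b"))
```

The regexp works because `\b` requires a word boundary after 'spam', and in 'spamicity' the next character is a letter.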
Logged In: YES
user_id=30510
David,
Are KMail's filters case sensitive or able to handle
regexes? "Spam" starts with a capital S, while "spamicity"
uses lower case. Alternatively, check for "^Spam". Using
the config file (or command line option), "Spam/Ham/Unsure"
can be changed to any 3 words you want or "spamicity=..."
can be changed to "score=...".
The flexibility is there for you to use. Please do so.
David
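As a rough illustration of why renaming "spamicity=" to "score=" helps, here is a small Python model of the header value. It does not use bogofilter's actual configuration syntax; `format_bogosity` and `contains_spam` are hypothetical helpers, and the scores are invented.

```python
def format_bogosity(tag: str, score: float, score_label: str = "spamicity",
                    version: str = "1.2.3") -> str:
    """Build a header value in the layout discussed in this thread (a model only)."""
    return f"{tag}, tests=bogofilter, {score_label}={score:.6f}, version={version}"


def contains_spam(value: str) -> bool:
    """Model of a simple case-insensitive "contains 'spam'" rule."""
    return "spam" in value.lower()


default_ham = format_bogosity("Ham", 0.000001)
renamed_ham = format_bogosity("Ham", 0.000001, score_label="score")
renamed_spam = format_bogosity("Spam", 0.999999, score_label="score")

print(contains_spam(default_ham))   # default label: ham is misclassified
print(contains_spam(renamed_ham))   # renamed label: ham is left alone
print(contains_spam(renamed_spam))  # spam is still caught
```

With the label renamed, the only place 'spam' can appear is the classification tag itself, so the simple rule becomes safe again.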
Logged In: YES
user_id=2788
If KMail (which I've never liked BTW because it doesn't allow me to go
without any local account and it insists on messing with my ~/Mail) does not
support Header "X-Bogofilter" _BEGINS WITH_ "..." then there's a task
for you.
Logged In: YES
user_id=617423
Mathias:
Are you intent on destroying goodwill or something? Your first
comment brought up some red herring of a config file and now
you seem to be intent on blasting away at KMail. Why the
resistance to changing the term 'spamicity' to 'bogosity'?
Anyway, to answer David, it's not for me - I can put up with
whatever changes you throw at me. Rather, it's for ease of
creating automated facilities for setting up KMail to use Bogofilter
in a way that is no more difficult than using Mozilla's built-in
filters. Until the recent change this was relatively
straightforward. Now it has just been made more complex for no
apparent gain. Yes, regexps can and will be used but the
simplicity and understandability of a filter that used to only
contain the word 'yes' has now been lost. Either that or a bunch
more commandline options have to be used, but again, for no
apparent gain.
At any rate, since this is clearly going nowhere I might as well
stop. But that doesn't change the inherent unwisdom of using
the string 'spam' in every header line AND using it to denote a
message that has been found to be spam.
Logged In: YES
user_id=30510
David,
There's a good reason to use "Spam/Ham/Unsure" rather than
"Yes/No/Unsure". Tagging a message as "X-Bogosity: Yes"
doesn't provide much of a clue to a (human) reader. Seeing
"Yes", I'm inclined to ask "Yes to what?" Seeing "Spam" or
"Ham" is much clearer.
I don't understand your role wrt kmail? Is it a personal
matter or are you packaging filter rules for others? The
format of the X-Bogosity line is "Spam/Ham/Unsure,
tests=bogofilter, spamicity=0.123455, version=1.2.3". I
gather that kmail's filtering is case insensitive, so that
rules out testing for "Spam" (instead of 'spam'). Since
kmail can't test for "X-Bogosity: Spam", why not have it
test for "Spam, tests=bogofilter", which provides enough
context for a good test? If regexes are allowed, test for
"Spam.*spamicity".
HTH,
David
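Both suggestions can be checked with a short Python sketch. It assumes case-insensitive matching against the header value only; the sample values and the helper names `contains_rule` and `regexp_rule` are illustrative assumptions, not part of kmail or bogofilter.

```python
import re

# Invented sample values in the default layout.
SAMPLES = {
    "Spam": "Spam, tests=bogofilter, spamicity=0.999999, version=1.2.3",
    "Ham": "Ham, tests=bogofilter, spamicity=0.000001, version=1.2.3",
    "Unsure": "Unsure, tests=bogofilter, spamicity=0.500000, version=1.2.3",
}


def contains_rule(value: str) -> bool:
    """Case-insensitive "contains 'Spam, tests=bogofilter'"."""
    return "spam, tests=bogofilter" in value.lower()


def regexp_rule(value: str) -> bool:
    """Case-insensitive regexp 'Spam.*spamicity'."""
    return re.search(r"Spam.*spamicity", value, re.IGNORECASE) is not None


for tag, value in SAMPLES.items():
    print(tag, contains_rule(value), regexp_rule(value))
```

In this model both rules fire only for the Spam sample: the regexp needs 'spam' to occur before 'spamicity', and the contains rule needs the tag immediately before ', tests=bogofilter'.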
Logged In: YES
user_id=2788
David P. James,
your calling the configuration file a red herring is impertinent.
First and foremost, fix your filter rule.
A. KMail supports regexp, so match "^spam"
B. KMail supports contains, match "spam,"
C. Bogofilter supports a configuration file to set the tags and header layout
any way you want if you don't like our defaults.
D. Bogofilter isn't even released as 1.0; name us ONE reason why we
should care about dependent package compatibility.
Now will you stop wasting our time.
Thank you.