[Bogofilter-cvs] bogofilter autodaemon.c,1.1.1.1,1.2 autodaemon.h,1.1.1.1,1.2 bogofilter.c,1.13,1.14
Fast Bayesian spam filter along lines suggested by Paul Graham
Update of /cvsroot/bogofilter/bogofilter
In directory usw-pr-cvs1:/tmp/cvs-serv27172

Modified Files:
	autodaemon.c autodaemon.h bogofilter.c bogofilter.h lexer_l.l
	lock.c lock.h main.c
Log Message:
Unnest comments, and move $ line down by one to prevent CVS from adding nested comments again.

Index: autodaemon.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/autodaemon.c,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -C2 -d -r1.1.1.1 -r1.2
*** autodaemon.c 14 Sep 2002 22:15:20 -0000 1.1.1.1
--- autodaemon.c 23 Sep 2002 11:31:53 -0000 1.2
***************
*** 1,7 ****
  /* $Id$ */
! /* $Log$
! /* Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
! /* 0.7.3 Base Source
! /* */
  
  /*****************************************************************************
--- 1,11 ----
  /* $Id$ */
! /*
!  * $Log$
!  * Revision 1.2 2002/09/23 11:31:53 m-a
!  * Unnest comments, and move $ line down by one to prevent CVS from adding nested comments again.
!  *
!  * Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
!  * 0.7.3 Base Source
!  * */
  
  /*****************************************************************************

Index: autodaemon.h
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/autodaemon.h,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -C2 -d -r1.1.1.1 -r1.2
*** autodaemon.h 14 Sep 2002 22:15:20 -0000 1.1.1.1
--- autodaemon.h 23 Sep 2002 11:31:53 -0000 1.2
***************
*** 1,7 ****
  /* $Id$ */
! /* $Log$
! /* Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
! /* 0.7.3 Base Source
! /* */
  
  // autodaemon.h -- lightweight library for daemon processes
--- 1,11 ----
  /* $Id$ */
! /*
!  * $Log$
!  * Revision 1.2 2002/09/23 11:31:53 m-a
!  * Unnest comments, and move $ line down by one to prevent CVS from adding nested comments again.
!  *
!  * Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
!  * 0.7.3 Base Source
!  * */
  
  // autodaemon.h -- lightweight library for daemon processes

Index: bogofilter.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/bogofilter.c,v
retrieving revision 1.13
retrieving revision 1.14
diff -C2 -d -r1.13 -r1.14
*** bogofilter.c 23 Sep 2002 10:08:49 -0000 1.13
--- bogofilter.c 23 Sep 2002 11:31:53 -0000 1.14
***************
*** 1,61 ****
  /* $Id$ */
- /* $Log$
- /* Revision 1.13 2002/09/23 10:08:49 m-a
- /* Integrate patch by Zeph Hull and Clint Adams to present spamicity in
- /* X-Spam-Status header in bogofilter -p mode.
- /*
- /* Revision 1.12 2002/09/22 21:26:28 relson
- /* Remove the definition and use of strlwr() since get_token() in lexer_l.l already converts the token to lower case.
- /*
- /* Revision 1.11 2002/09/19 03:20:32 relson
- /* Move "msg_prob" assignment to proper function, i.e. from select_indicators() to compute_probability().
- /* Move some local variables from the beginning of the function to the innermost block where they're needed.
- /*
- /* Revision 1.10 2002/09/18 22:41:07 relson
- /* Separated probability calculation out of select_indicators() into new function compute_probability().
- /*
- /* Revision 1.7 2002/09/15 19:22:51 relson
- /* Refactor the main bogofilter() function into three smaller, more coherent pieces:
- /*
- /* void *collect_words(int fd)
- /* - returns a set of tokens in a Judy array
  /*
! /* bogostat_t *select_indicators(void *PArray)
! /* - processes the set of words
! /* - returns an array of spamicity indicators (words & probabilities)
! /*
! /* double compute_spamicity(bogostat_t *stats)
! /* - processes the array of spamicity indicators
! /* - returns the spamicity
! /*
! /* rc_t bogofilter(int fd)
! /* - calls the 3 component functions
! /* - returns RC_SPAM or RC_NONSPAM
! /*
! /* Revision 1.6 2002/09/15 19:07:13 relson
! /* Add an enumerated type for return codes of RC_SPAM and RC_NONSPAM, which values of 0 and 1 as called for by procmail.
! /* Use the new codes and type for bogofilter() and when generating the X-Spam-Status message.
! /*
! /* Revision 1.5 2002/09/15 18:29:04 relson
! /* bogofilter.c:
! /*
! /* Use a Judy array to provide a set of (unique) tokens to speed up the filling of the stat.extrema array.
! /*
! /* Revision 1.4 2002/09/15 17:41:20 relson
! /* The printing of tokens used for computing the spamicity has been changed. They are now printed in increasing order (by probability and alphabet). The cumulative spamicity is also printed.
! /*
! /* The spamicity element of the bogostat_t struct has become a local variable in bogofilter() as it didn't need to be in the struct.
! /*
! /* Revision 1.3 2002/09/15 16:37:27 relson
! /* Implement Eric Seppanen's fix so that bogofilter() properly populates the stats.extrema array.
! /* A new word goes into the first empty slot of the array. If there are no empty slots, it replaces
! /* the word with the spamicity index closest to 0.5.
! /*
! /* Revision 1.2 2002/09/15 16:16:50 relson
! /* Clean up underflow checking for word counts by using max() instead of if...then...
! /*
! /* Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
! /* 0.7.3 Base Source
! /* */
  
  /*****************************************************************************
--- 1,65 ----
  /* $Id$ */
  /*
!  * $Log$
!  * Revision 1.14 2002/09/23 11:31:53 m-a
!  * Unnest comments, and move $ line down by one to prevent CVS from adding nested comments again.
!  *
!  * Revision 1.13 2002/09/23 10:08:49 m-a
!  * Integrate patch by Zeph Hull and Clint Adams to present spamicity in
!  * X-Spam-Status header in bogofilter -p mode.
!  *
!  * Revision 1.12 2002/09/22 21:26:28 relson
!  * Remove the definition and use of strlwr() since get_token() in lexer_l.l already converts the token to lower case.
!  *
!  * Revision 1.11 2002/09/19 03:20:32 relson
!  * Move "msg_prob" assignment to proper function, i.e. from select_indicators() to compute_probability().
!  * Move some local variables from the beginning of the function to the innermost block where they're needed.
!  *
!  * Revision 1.10 2002/09/18 22:41:07 relson
!  * Separated probability calculation out of select_indicators() into new function compute_probability().
!  *
!  * Revision 1.7 2002/09/15 19:22:51 relson
!  * Refactor the main bogofilter() function into three smaller, more coherent pieces:
!  *
!  * void *collect_words(int fd)
!  * - returns a set of tokens in a Judy array
!  *
!  * bogostat_t *select_indicators(void *PArray)
!  * - processes the set of words
!  * - returns an array of spamicity indicators (words & probabilities)
!  *
!  * double compute_spamicity(bogostat_t *stats)
!  * - processes the array of spamicity indicators
!  * - returns the spamicity
!  *
!  * rc_t bogofilter(int fd)
!  * - calls the 3 component functions
!  * - returns RC_SPAM or RC_NONSPAM
!  *
!  * Revision 1.6 2002/09/15 19:07:13 relson
!  * Add an enumerated type for return codes of RC_SPAM and RC_NONSPAM, which values of 0 and 1 as called for by procmail.
!  * Use the new codes and type for bogofilter() and when generating the X-Spam-Status message.
!  *
!  * Revision 1.5 2002/09/15 18:29:04 relson
!  * bogofilter.c:
!  *
!  * Use a Judy array to provide a set of (unique) tokens to speed up the filling of the stat.extrema array.
!  *
!  * Revision 1.4 2002/09/15 17:41:20 relson
!  * The printing of tokens used for computing the spamicity has been changed. They are now printed in increasing order (by probability and alphabet). The cumulative spamicity is also printed.
!  *
!  * The spamicity element of the bogostat_t struct has become a local variable in bogofilter() as it didn't need to be in the struct.
!  *
!  * Revision 1.3 2002/09/15 16:37:27 relson
!  * Implement Eric Seppanen's fix so that bogofilter() properly populates the stats.extrema array.
!  * A new word goes into the first empty slot of the array. If there are no empty slots, it replaces
!  * the word with the spamicity index closest to 0.5.
!  *
!  * Revision 1.2 2002/09/15 16:16:50 relson
!  * Clean up underflow checking for word counts by using max() instead of if...then...
!  *
!  * Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
!  * 0.7.3 Base Source
!  * */
  
  /*****************************************************************************

Index: bogofilter.h
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/bogofilter.h,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** bogofilter.h 23 Sep 2002 10:08:49 -0000 1.4
--- bogofilter.h 23 Sep 2002 11:31:53 -0000 1.5
***************
*** 1,20 ****
  /* $Id$ */
! /* $Log$
! /* Revision 1.4 2002/09/23 10:08:49 m-a
! /* Integrate patch by Zeph Hull and Clint Adams to present spamicity in
! /* X-Spam-Status header in bogofilter -p mode.
! /*
! /* Revision 1.3 2002/09/18 22:30:22 relson
! /* Created lexer.h with the definitions needed by lexer_l.l from bogofilter.h.
! /* This removes the compile-time dependency between the two files.
! /*
! /* Revision 1.2 2002/09/15 19:07:12 relson
! /* Add an enumerated type for return codes of RC_SPAM and RC_NONSPAM, which values of 0 and 1 as called for by procmail.
! /* Use the new codes and type for bogofilter() and when generating the X-Spam-Status message.
! /*
! /* Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
! /* 0.7.3 Base Source
! /* */
! // constants and declarations for bogofilter
  
  #include "lexer.h"
--- 1,24 ----
  /* $Id$ */
! /*
!  * $Log$
!  * Revision 1.5 2002/09/23 11:31:53 m-a
!  * Unnest comments, and move $ line down by one to prevent CVS from adding nested comments again.
!  *
!  * Revision 1.4 2002/09/23 10:08:49 m-a
!  * Integrate patch by Zeph Hull and Clint Adams to present spamicity in
!  * X-Spam-Status header in bogofilter -p mode.
!  *
!  * Revision 1.3 2002/09/18 22:30:22 relson
!  * Created lexer.h with the definitions needed by lexer_l.l from bogofilter.h.
!  * This removes the compile-time dependency between the two files.
!  *
!  * Revision 1.2 2002/09/15 19:07:12 relson
!  * Add an enumerated type for return codes of RC_SPAM and RC_NONSPAM, which values of 0 and 1 as called for by procmail.
!  * Use the new codes and type for bogofilter() and when generating the X-Spam-Status message.
!  *
!  * Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
!  * 0.7.3 Base Source
!  * */
! /* constants and declarations for bogofilter */
  
  #include "lexer.h"

Index: lexer_l.l
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/lexer_l.l,v
retrieving revision 1.6
retrieving revision 1.7
diff -C2 -d -r1.6 -r1.7
*** lexer_l.l 22 Sep 2002 21:24:36 -0000 1.6
--- lexer_l.l 23 Sep 2002 11:31:53 -0000 1.7
***************
*** 1,33 ****
  /* $Id$ */
- /* $Log$
- /* Revision 1.6 2002/09/22 21:24:36 relson
- /* Modify the lexer to allow the full range of alphabetic characters.
- /* Thanks to Clint Adams for the new token matching expression.
- /*
- /* Revision 1.5 2002/09/18 22:30:22 relson
- /* Created lexer.h with the definitions needed by lexer_l.l from bogofilter.h.
- /* This removes the compile-time dependency between the two files.
- /*
- /* Revision 1.4 2002/09/18 20:56:43 m-a
- /* Let automake deal with the lexer.
- /*
- /* Revision 1.3 2002/09/16 18:58:14 m-a
- /* Fix 'last line occasionally emitted twice' bug, cleaning up our yyinput().
- /*
- /* Revision 1.2 2002/09/15 15:52:24 relson
  /*
! /*
! /* Makefile.in:
! /* - fix .l.c rule so that lexer_l.c is correctly generated from lexer_l.l
! /* - added lexer_l.c to target mostlyclean-compile
! /* - removed lexer_l.c from DIST_COMMON. As it can can be generated, it no longer needs to be distributed.
! /* - added target lexertest (from original bogofilter release)
! /*
! /* lexer_l.l:
! /* - defined global variable passthrough so that linking lexertest succeeds.
! /*
! /* Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
! /* 0.7.3 Base Source
! /* */
  %{
  /*
--- 1,37 ----
  /* $Id$ */
  /*
!  * $Log$
!  * Revision 1.7 2002/09/23 11:31:53 m-a
!  * Unnest comments, and move $ line down by one to prevent CVS from adding nested comments again.
!  *
!  * Revision 1.6 2002/09/22 21:24:36 relson
!  * Modify the lexer to allow the full range of alphabetic characters.
!  * Thanks to Clint Adams for the new token matching expression.
!  *
!  * Revision 1.5 2002/09/18 22:30:22 relson
!  * Created lexer.h with the definitions needed by lexer_l.l from bogofilter.h.
!  * This removes the compile-time dependency between the two files.
!  *
!  * Revision 1.4 2002/09/18 20:56:43 m-a
!  * Let automake deal with the lexer.
!  *
!  * Revision 1.3 2002/09/16 18:58:14 m-a
!  * Fix 'last line occasionally emitted twice' bug, cleaning up our yyinput().
!  *
!  * Revision 1.2 2002/09/15 15:52:24 relson
!  *
!  *
!  * Makefile.in:
!  * - fix .l.c rule so that lexer_l.c is correctly generated from lexer_l.l
!  * - added lexer_l.c to target mostlyclean-compile
!  * - removed lexer_l.c from DIST_COMMON. As it can can be generated, it no longer needs to be distributed.
!  * - added target lexertest (from original bogofilter release)
!  *
!  * lexer_l.l:
!  * - defined global variable passthrough so that linking lexertest succeeds.
!  *
!  * Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
!  * 0.7.3 Base Source
!  * */
  %{
  /*

Index: lock.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/lock.c,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** lock.c 17 Sep 2002 06:23:46 -0000 1.2
--- lock.c 23 Sep 2002 11:31:53 -0000 1.3
***************
*** 1,11 ****
  /* $Id$ */
- /* $Log$
- /* Revision 1.2 2002/09/17 06:23:46 adrian_otto
- /* Changed HAVE_FCNTL_H to HAVE_FCNTL and added safeguard chacks to make sure
- /* the lock contants LOCK_EX and LOCK_SH get defined.
  /*
! /* Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
! /* 0.7.3 Base Source
! /* */
  /****************************************************************************/
  /* */
--- 1,15 ----
  /* $Id$ */
  /*
!  * $Log$
!  * Revision 1.3 2002/09/23 11:31:53 m-a
!  * Unnest comments, and move $ line down by one to prevent CVS from adding nested comments again.
!  *
!  * Revision 1.2 2002/09/17 06:23:46 adrian_otto
!  * Changed HAVE_FCNTL_H to HAVE_FCNTL and added safeguard chacks to make sure
!  * the lock contants LOCK_EX and LOCK_SH get defined.
!  *
!  * Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
!  * 0.7.3 Base Source
!  * */
  /****************************************************************************/
  /* */

Index: lock.h
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/lock.h,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** lock.h 17 Sep 2002 06:23:46 -0000 1.2
--- lock.h 23 Sep 2002 11:31:53 -0000 1.3
***************
*** 1,11 ****
  /* $Id$ */
- /* $Log$
- /* Revision 1.2 2002/09/17 06:23:46 adrian_otto
- /* Changed HAVE_FCNTL_H to HAVE_FCNTL and added safeguard chacks to make sure
- /* the lock contants LOCK_EX and LOCK_SH get defined.
  /*
! /* Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
! /* 0.7.3 Base Source
! /* */
  
  /*****************************************************************************
--- 1,15 ----
  /* $Id$ */
  /*
!  * $Log$
!  * Revision 1.3 2002/09/23 11:31:53 m-a
!  * Unnest comments, and move $ line down by one to prevent CVS from adding nested comments again.
!  *
!  * Revision 1.2 2002/09/17 06:23:46 adrian_otto
!  * Changed HAVE_FCNTL_H to HAVE_FCNTL and added safeguard chacks to make sure
!  * the lock contants LOCK_EX and LOCK_SH get defined.
!  *
!  * Revision 1.1.1.1 2002/09/14 22:15:20 adrian_otto
!  * 0.7.3 Base Source
!  * */
  
  /*****************************************************************************

Index: main.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/main.c,v
retrieving revision 1.8
retrieving revision 1.9
diff -C2 -d -r1.8 -r1.9
*** main.c 23 Sep 2002 11:27:25 -0000 1.8
--- main.c 23 Sep 2002 11:31:53 -0000 1.9
***************
*** 1,7 ****
  /* $Id$ */
- /* $Log$
- /* Revision 1.8 2002/09/23 11:27:25 m-a
- /* Drop unused `inheaders' variable, unnest comments.
  /*
   * Revision 1.7 2002/09/23 10:08:49 m-a
   * Integrate patch by Zeph Hull and Clint Adams to present spamicity in
--- 1,11 ----
  /* $Id$ */
  /*
+  * $Log$
+  * Revision 1.9 2002/09/23 11:31:53 m-a
+  * Unnest comments, and move $ line down by one to prevent CVS from adding nested comments again.
+  *
+  * Revision 1.8 2002/09/23 11:27:25 m-a
+  * Drop unused `inheaders' variable, unnest comments.
+  *
   * Revision 1.7 2002/09/23 10:08:49 m-a
   * Integrate patch by Zeph Hull and Clint Adams to present spamicity in
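
Why moving the keyword line matters: as far as I understand CVS keyword expansion, each log line it inserts is prefixed with whatever text precedes the $Log keyword on its own line. The sketch below is a simplified model of that rule, not CVS code; the helper names log_prefix(), expand_log() and count_openers() are made up for illustration, and real CVS also writes the date, author and a separator line.

```c
#include <stdio.h>
#include <string.h>

/* Copy the text that precedes "$Log" on `line` into `prefix`.
 * Returns 0 on success, -1 if the keyword is missing or the buffer is too small. */
int log_prefix(const char *line, char *prefix, size_t size)
{
    const char *kw = strstr(line, "$Log");
    size_t len;

    if (kw == NULL)
        return -1;
    len = (size_t)(kw - line);
    if (len >= size)
        return -1;
    memcpy(prefix, line, len);
    prefix[len] = '\0';
    return 0;
}

/* Simplified model of the expansion: the keyword line is kept, followed by
 * one log line that starts with the same prefix (hypothetical helper). */
int expand_log(const char *keyword_line, const char *entry, char *out, size_t size)
{
    char prefix[64];
    int n;

    if (log_prefix(keyword_line, prefix, sizeof prefix) != 0)
        return -1;
    n = snprintf(out, size, "%s\n%s%s\n", keyword_line, prefix, entry);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Count occurrences of the comment opener in `text`. */
int count_openers(const char *text)
{
    int n = 0;
    const char *p;

    for (p = strstr(text, "/*"); p != NULL; p = strstr(p + 2, "/*"))
        n++;
    return n;
}
```

Under this model, a keyword line written as "/* $Log$" yields one more comment opener for every inserted line, which is the nesting the old revisions show. With the opener on a line of its own and the keyword on the next line as " * $Log$", the inserted lines carry only the " * " prefix.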