From: <jgr...@us...> - 2003-07-09 18:18:42
Update of /cvsroot/popfile/engine
In directory sc8-pr-cvs1:/tmp/cvs-serv2831
Modified Files:
Makefile bayes.pl insert.pl tests.pl
Log Message:
PERFORMANCE CHANGES
Bayes.pm: Added new add_messages_to_bucket API to add multiple messages to a
bucket at the same time with a single read/write of the appropriate
corpus table for speed.
New write_line__ method writes a line to a MSG file and optionally
passes it to the parse_line API of MailParse.pm. For speed, we now
write the file to disk and parse it in the same pass, without
reloading the MSG file from disk.
The MSG file gets a temporary name until the CLS file is written.
This prevents the history from reloading in the middle of a download
and ending up with a message that has a class file error.
classify_file becomes classify and can classify either from a file
or from the preparsed information in the parser
classify_and_modify returns the name of the file where the message
was stored in addition to the classification.
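The batching idea behind add_messages_to_bucket can be sketched as follows. This is only an illustration under stated assumptions: the corpus table is modelled as an in-memory hash, and load_corpus, save_corpus and add_messages_to_bucket_sketch are hypothetical names, not the actual Bayes.pm code or signatures.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for one bucket's corpus table on disk.
my %disk = ( hello => 2 );
my ( $reads, $writes ) = ( 0, 0 );

# Each call counts as one "read" or one "write" of the corpus table.
sub load_corpus { $reads++; return {%disk}; }
sub save_corpus { my ($corpus) = @_; $writes++; %disk = %$corpus; return; }

# Add the word counts of several messages with one load and one save.
# Each message is a hash ref of word => count, as a parser might produce.
sub add_messages_to_bucket_sketch
{
    my (@messages) = @_;

    my $corpus = load_corpus();
    foreach my $words (@messages) {
        foreach my $word ( keys %$words ) {
            $corpus->{$word} += $words->{$word};
        }
    }
    save_corpus($corpus);
    return;
}

add_messages_to_bucket_sketch( { hello => 1, world => 1 }, { world => 2 } );
print "hello=$disk{hello} world=$disk{world} reads=$reads writes=$writes\n";
```

Adding the same two messages one at a time would cost two reads and two writes in this model; the batch form costs one of each regardless of how many messages are passed.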
HTML.pm: Use the add_messages_to_bucket API for reclassification, for speed.
Use the new classify method in Bayes.pm to classify a file after it
has been digested by the parser for colorization and get the word
scores. This means we only load the MSG file once (used to be
twice) and hence double the speed of viewing a colorized message.
New methods load_disk_cache__ and save_disk_cache__ are used to
keep a copy of the history cache on disk between sessions so that
session start up is as fast as possible. There will be no need
to parse messages for header information on start up if the
history_cache file is present.
Removed the boundary feature because it is incompatible with the
concept of a "download", since we now send new history file messages
asynchronously through the MQ.
Load the history cache progressively as files are written. The proxies
send the message NEWFL and the method new_history_file__ adds the
file to the history. This is done so that when the user hits the
History tab button after a mail download the history cache is
already loaded and there should be no delay in displaying the
history page.
MailParse.pm: Renamed parse_stream to parse_file since that's a better name
New start_parse, stop_parse and parse_line APIs so that a file can
be parsed line by line.
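A rough sketch of how a line-by-line parsing API of this shape might be driven. SketchParser is a hypothetical stand-in, not the real MailParse.pm; only the method names start_parse, parse_line, stop_parse and parse_file and the words__ hash come from this commit, and their real signatures are assumed here.

```perl
use strict;
use warnings;

# Hypothetical stand-in parser; it only counts lowercase words.
package SketchParser;

sub new
{
    my ($class) = @_;
    my $self = { words__ => {}, active => 0 };
    return bless $self, $class;
}

sub start_parse
{
    my ($self) = @_;
    $self->{words__} = {};
    $self->{active}  = 1;
    return;
}

sub parse_line
{
    my ( $self, $line ) = @_;
    return unless $self->{active};
    while ( $line =~ /(\w+)/g ) {
        $self->{words__}{ lc($1) }++;
    }
    return;
}

sub stop_parse
{
    my ($self) = @_;
    $self->{active} = 0;
    return;
}

# parse_file can then be expressed in terms of the line API.
sub parse_file
{
    my ( $self, $file ) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    $self->start_parse();
    while ( my $line = <$fh> ) {
        $self->parse_line($line);
    }
    $self->stop_parse();
    close $fh;
    return;
}

package main;

# A caller that already has each line in hand (for example while writing
# it to a MSG file) can feed the parser without re-reading the file.
my $parser = SketchParser->new();
$parser->start_parse();
$parser->parse_line($_) foreach ( "Subject: Hello world\n", "\n", "hello again\n" );
$parser->stop_parse();

print "$_=$parser->{words__}{$_}\n" foreach sort keys %{ $parser->{words__} };
```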
MQ.pm: Defined a new message type NEWFL which is used to indicate that
a file has been added to the history cache. NEWFL's message
is the name of the file (the MSG file) that was added.
POP3.pm: Send the NEWFL message through the pipe to the parent so that
the history is aware of new messages.
SMTP.pm:
NNTP.pm: Send CLASS and NEWFL messages through the pipe to the parent.
insert.pl: Updated to use new parse_file API
bayes.pl: Updated to use the new classify API instead of classify_file.
TEST SUITE CHANGES
tests.pl: New test_assert_regexp function for doing fuzzy matching of
test results.
Exits with status 0 if all tests run successfully, and 1 if there
are any errors.
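A small sketch of how the harness supplies $file and $line to test_assert_regexp and derives its exit status. The test_assert_regexp body and the substitution are taken from the tests.pl diff in this commit; test_report here is a simplified stand-in, and the sample test line and values are made up for illustration.

```perl
use strict;
use warnings;

my ( $test_count, $test_failures ) = ( 0, 0 );

# Simplified stand-in for the harness's test_report.
sub test_report
{
    my ( $ok, $explanation, $file, $line, $context ) = @_;
    $test_count++;
    return if $ok;
    $test_failures++;
    print "$file:$line: $explanation", ( defined $context ? " ($context)" : '' ), "\n";
    return;
}

sub test_assert_regexp
{
    my ( $file, $line, $test, $expected, $context ) = @_;
    my $result = ( $test =~ /$expected/ );
    test_report( $result, "expecting to match $expected and got $test", $file, $line, $context );
    return;
}

# The harness injects the file name and line number into each call,
# which is why test authors do not pass them.
my ( $test, $ln ) = ( 'TestLogger.tst', 12 );
my $source = "test_assert_regexp( 'log: started', '^log: ' );";
( my $rewritten = $source ) =~ s/(test_assert_regexp\()/$1 '$test', $ln,/g;
print "$rewritten\n";

eval $rewritten;    # this assertion passes
die $@ if $@;

# A deliberately failing assertion, with optional context.
test_assert_regexp( 'TestLogger.tst', 13, 'log: stopped', 'start', 'demo failure' );

# A real runner would finish with "exit $code".
my $code = $test_failures ? 1 : 0;
print "$test_count tests, $test_failures failed, exit code $code\n";
```

A non-zero exit status lets a calling make or shell script stop as soon as the suite fails.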
TestLogger.tst: New file for testing POPFile::Logger functionality.
Makefile: The test target takes a TESTARGS variable that can be set to the
specific module (or modules, using glob patterns) to run.
For example: gmake test TESTARGS='TestLogger'
There's a new coverage target to run the test suite and output
code coverage information for the modules used.
TestCoverage.pm: New module that provides line coverage information for
the test suite. It runs as a Perl debugger via the -d
switch and outputs code coverage information for all
POPFile files tested.
Index: Makefile
===================================================================
RCS file: /cvsroot/popfile/engine/Makefile,v
retrieving revision 1.11
retrieving revision 1.12
diff -C2 -d -r1.11 -r1.12
*** Makefile 26 Jun 2003 17:41:37 -0000 1.11
--- Makefile 9 Jul 2003 18:18:09 -0000 1.12
***************
*** 11,25 ****
error:
! @echo Must specify one of test, package or windows
@echo
! @echo "test - Run POPFile test suite"
! @echo "windows - Build Windows installer"
! @echo "package - Build Windows installer, and create"
! @echo " ZIP files for Windows and cross-platform"
! @echo " version"
# test runs the POPFile unit test suite
! test: ; @perl tests.pl
# windows builds the Windows installer
--- 11,40 ----
error:
! @echo Must specify one of coverage, test, package or windows
@echo
! @echo "coverage - Run POPFile test suite with coverage information"
! @echo "test - Run POPFile test suite"
! @echo " (test and coverage pass the TESTARGS variable to the test suite)"
! @echo
! @echo "windows - Build Windows installer"
! @echo "package - Build Windows installer, and create"
! @echo " ZIP files for Windows and cross-platform"
! @echo " version"
# test runs the POPFile unit test suite
! coverage:
! @echo Running test suite with code coverage
! ifdef TESTARGS
! @echo with arguments '$(TESTARGS)'
! endif
! @perl -d:TestCoverage tests.pl $(TESTARGS)
!
! test:
! @echo Running test suite
! ifdef TESTARGS
! @echo with arguments '$(TESTARGS)'
! endif
! @perl tests.pl $(TESTARGS)
# windows builds the Windows installer
Index: bayes.pl
===================================================================
RCS file: /cvsroot/popfile/engine/bayes.pl,v
retrieving revision 1.18
retrieving revision 1.19
diff -C2 -d -r1.18 -r1.19
*** bayes.pl 6 Jul 2003 01:11:47 -0000 1.18
--- bayes.pl 9 Jul 2003 18:18:09 -0000 1.19
***************
*** 50,54 ****
foreach my $file (@files)
{
! print "$file is '" . $b->classify_file($file) . "'\n";
}
--- 50,54 ----
foreach my $file (@files)
{
! print "$file is '" . $b->classify($file) . "'\n";
}
Index: insert.pl
===================================================================
RCS file: /cvsroot/popfile/engine/insert.pl,v
retrieving revision 1.21
retrieving revision 1.22
diff -C2 -d -r1.21 -r1.22
*** insert.pl 14 Jun 2003 21:10:12 -0000 1.21
--- insert.pl 9 Jul 2003 18:18:09 -0000 1.22
***************
*** 103,107 ****
print "Parsing message '$message'...\n";
! $parser->parse_stream($message);
foreach $word (keys %{$parser->{words__}}) {
--- 103,107 ----
print "Parsing message '$message'...\n";
! $parser->parse_file($message);
foreach $word (keys %{$parser->{words__}}) {
Index: tests.pl
===================================================================
RCS file: /cvsroot/popfile/engine/tests.pl,v
retrieving revision 1.16
retrieving revision 1.17
diff -C2 -d -r1.16 -r1.17
*** tests.pl 23 Jun 2003 22:34:10 -0000 1.16
--- tests.pl 9 Jul 2003 18:18:09 -0000 1.17
***************
*** 106,109 ****
--- 106,134 ----
}
+ # ---------------------------------------------------------------------------------------------
+ #
+ # test_assert_regexp - Perform a test and assert that its result matches a regexp
+ #
+ # $file The name of the file invoking the test
+ # $line The line in the $file where the test can be found
+ # $test The result of the test that was just run
+ # $expected The expected result in the form of a regexp
+ # $context (Optional) String containing extra context information
+ #
+ # Example: test_assert_regexp( function(parameter), '^result' )
+ # Example: test_assert_regexp( function(parameter), 3, 'Banana.+subsystem' )
+ #
+ # YOU DO NOT NEED TO GIVE THE $file and $line parameters as this script supplies them
+ # automatically
+ # ---------------------------------------------------------------------------------------------
+
+ sub test_assert_regexp
+ {
+ my ( $file, $line, $test, $expected, $context ) = @_;
+ my $result = ( $test =~ /$expected/ );
+
+ test_report( $result, "expecting to match $expected and got $test", $file, $line, $context );
+ }
+
# MAIN
***************
*** 118,121 ****
--- 143,148 ----
$pattern = "$ARGV[0].*" if ( $#ARGV == 0 );
+ my $code = 0;
+
foreach my $test (@tests) {
***************
*** 138,141 ****
--- 165,169 ----
my $line = $_;
$ln += 1;
+ $line =~ s/(test_assert_regexp\()/$1 '$test', $ln,/g;
$line =~ s/(test_assert_equal\()/$1 '$test', $ln,/g;
$line =~ s/(test_assert\()/$1 '$test', $ln,/g;
***************
*** 148,151 ****
--- 176,180 ----
print "failed (" . ( $test_count - $current_test_count ) . " ok, " . ( $test_failures - $current_error_count ) . " failed)\n";
print $fail_messages . "\n";
+ $code = 1;
} else {
print "ok (" . ( $test_count - $current_test_count ) . " ok)";
***************
*** 155,156 ****
--- 184,186 ----
print "\n\n$test_count tests, " . ( $test_count - $test_failures ) . " ok, $test_failures failed\n\n";
+ exit $code;