From: <jgr...@us...> - 2003-07-09 18:18:54
Update of /cvsroot/popfile/engine/Proxy
In directory sc8-pr-cvs1:/tmp/cvs-serv2831/Proxy
Modified Files:
NNTP.pm POP3.pm Proxy.pm SMTP.pm
Log Message:
PERFORMANCE CHANGES
Bayes.pm: Added new add_messages_to_bucket API to add multiple messages to a
bucket at the same time with a single read/write of the appropriate
corpus table for speed.
New write_line__ method to write a line to a MSG file and optionally
to the parse_line API of MailParse.pm. Now we write a file to disk
and parse it without reloading the MSG file from disk for speed.
The MSG gets a temporary name until the CLS file is written, so that
the history cannot reload in the middle of a download and end up with
a message with a class file error.
classify_file becomes classify and can classify either from a file
or from the preparsed information in the parser.
classify_and_modify returns the name of the file where the message
was stored in addition to the classification.
HTML.pm: Use the add_messages_to_bucket API for reclassification, for speed.
Use the new classify method in Bayes.pm to classify a file after it
has been digested by the parser for colorization and get the word
scores. This means we only load the MSG file once (used to be
twice) and hence double the speed of viewing a colorized message.
New methods load_disk_cache__ and save_disk_cache__ are used to
keep a copy of the history cache on disk between sessions so that
session start up is as fast as possible. There will be no need
to parse messages for header information on start up if the
history_cache file is present.
Removed the boundary feature because it is incompatible with the
concept of a "download" since we now send new history file messages
async. through the MQ.
Load the history cache progressively as files are written. The proxies
send the message NEWFL and the method new_history_file__ adds the
file to the history. This is done so that when the user hits the
History tab button after a mail download the history cache is
already loaded and there should be no delay in displaying the
history page.
MailParse.pm: Renamed parse_stream to parse_file since that's a better name.
New start_parse, stop_parse and parse_line APIs so that a file can
be parsed line by line.
MQ.pm: Defined a new message type NEWFL which is used to indicate that
a file has been added to the history cache. NEWFL's message
is the name of the file (the MSG file) that was added.
POP3.pm: Send the NEWFL message through the pipe to the parent so that
the history is aware of new messages.
SMTP.pm:
NNTP.pm: Send CLASS and NEWFL messages through the pipe to the parent.
insert.pl: Updated to use the new parse_file API.
bayes.pl: Updated to use the new classify API instead of classify_file.
TEST SUITE CHANGES
tests.pl: New test_assert_regexp function for doing fuzzy matching of
test results.
Returns 0 if all tests run successfully, and 1 if there are
any errors.
TestLogger.tst: New file for testing POPFile::Logger functionality.
Makefile: The test target has a variable TESTARGS that can be set to the
specific module (or modules using glob patterns) to run.
For example: gmake test TESTARGS='TestLogger'
There's a new coverage target to run the test suite and output
code coverage information for the modules used.
TestCoverage.pm: New module that provides line coverage information for
the test suite. Executed as a Perl debugger using the -d
switch and outputs code coverage information for all
POPFile files tested.
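
The write_line__ change described in the log message above (write each line to the MSG file and hand it to the parser in the same pass) can be illustrated roughly as follows. This is only a sketch: the LineParser package, its word counting, and the write_line helper are hypothetical stand-ins, not the actual Bayes.pm or MailParse.pm code, and the real method signatures may differ.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical line-oriented parser standing in for the start_parse,
# parse_line and stop_parse APIs named in the log message. It just
# counts lower-cased words as lines arrive.
package LineParser;

sub new {
    my ( $class ) = @_;
    return bless { words => {}, in_parse => 0 }, $class;
}

sub start_parse {
    my ( $self ) = @_;
    $self->{words}    = {};
    $self->{in_parse} = 1;
}

sub parse_line {
    my ( $self, $line ) = @_;
    $self->{words}{ lc $1 }++ while $line =~ /(\w+)/g;
}

sub stop_parse {
    my ( $self ) = @_;
    $self->{in_parse} = 0;
}

package main;

# Hypothetical helper: write one line to the message file and, if a
# parser was supplied, feed it the same line, so the file never has to
# be read back from disk to be parsed.
sub write_line {
    my ( $fh, $line, $parser ) = @_;
    print {$fh} $line;
    $parser->parse_line( $line ) if defined $parser;
}

my ( $fh, $filename ) = tempfile( UNLINK => 1 );
my $parser = LineParser->new;

$parser->start_parse;
write_line( $fh, $_, $parser )
    for ( "Subject: cheap offer\n", "\n", "Cheap pills, cheap loans\n" );
$parser->stop_parse;
close $fh;

print "distinct words parsed: ", scalar( keys %{ $parser->{words} } ), "\n";
```

The point of the design is that the disk write and the parse share one loop over the incoming lines; the parser argument is optional so the same helper can be used when only the file copy is wanted.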
Index: NNTP.pm
===================================================================
RCS file: /cvsroot/popfile/engine/Proxy/NNTP.pm,v
retrieving revision 1.11
retrieving revision 1.12
diff -C2 -d -r1.11 -r1.12
*** NNTP.pm 14 Jun 2003 21:10:12 -0000 1.11
--- NNTP.pm 9 Jul 2003 18:18:21 -0000 1.12
***************
*** 223,231 ****
$count += 1;
! my $class = $self->{classifier__}->classify_and_modify( $news, $client, $download_count, $count, 0, '' );
# Tell the parent that we just handled a mail
! print $pipe "$class$eol";
}
--- 223,232 ----
$count += 1;
! my ( $class, $history_file ) = $self->{classifier__}->classify_and_modify( $news, $client, $download_count, $count, 0, '' );
# Tell the parent that we just handled a mail
! print $pipe "CLASS:$class$eol";
! print $pipe "NEWFL:$history_file$eol";
}
Index: POP3.pm
===================================================================
RCS file: /cvsroot/popfile/engine/Proxy/POP3.pm,v
retrieving revision 1.60
retrieving revision 1.61
diff -C2 -d -r1.60 -r1.61
*** POP3.pm 27 Jun 2003 06:52:23 -0000 1.60
--- POP3.pm 9 Jul 2003 18:18:21 -0000 1.61
***************
*** 285,289 ****
# Classify without echoing to client, saving file for later RETR's
! my $class = $self->{classifier__}->classify_and_modify( $mail, $client, $download_count, $count, 0, '', 0 );
$downloaded{$count} = 1;
--- 285,289 ----
# Classify without echoing to client, saving file for later RETR's
! my ( $class, $history_file ) = $self->{classifier__}->classify_and_modify( $mail, $client, $download_count, $count, 0, '', 0 );
$downloaded{$count} = 1;
***************
*** 297,300 ****
--- 297,301 ----
# Tell the parent that we just handled a mail
print $pipe "CLASS:$class$eol";
+ print $pipe "NEWFL:$history_file$eol";
}
}
***************
*** 397,404 ****
# we echo each line of the message until we hit the . at the end
if ( $self->echo_response_($mail, $client, $command ) ) {
! $class = $self->{classifier__}->classify_and_modify( $mail, $client, $download_count, $count, 0, '' );
# Tell the parent that we just handled a mail
print $pipe "CLASS:$class$eol";
# Note locally that file has been retrieved
--- 398,407 ----
# we echo each line of the message until we hit the . at the end
if ( $self->echo_response_($mail, $client, $command ) ) {
! my $history_file;
! ( $class, $history_file ) = $self->{classifier__}->classify_and_modify( $mail, $client, $download_count, $count, 0, '' );
# Tell the parent that we just handled a mail
print $pipe "CLASS:$class$eol";
+ print $pipe "NEWFL:$history_file$eol";
# Note locally that file has been retrieved
Index: Proxy.pm
===================================================================
RCS file: /cvsroot/popfile/engine/Proxy/Proxy.pm,v
retrieving revision 1.17
retrieving revision 1.18
diff -C2 -d -r1.17 -r1.18
*** Proxy.pm 14 Jun 2003 21:10:12 -0000 1.17
--- Proxy.pm 9 Jul 2003 18:18:21 -0000 1.18
***************
*** 204,207 ****
--- 204,211 ----
}
+ if ( $message =~ /NEWFL:(.*)/ ) {
+ $self->mq_post_( 'NEWFL', $1, '' );
+ }
+
if ( $message =~ /LOGIN:(.*)/ ) {
$self->mq_post_( 'LOGIN', $1, '' );
***************
*** 347,351 ****
last if ( $self->{alive_} == 0 );
!
if (!defined($suppress) || !( $_ =~ $suppress )) {
if (!$verbose) {
--- 351,355 ----
last if ( $self->{alive_} == 0 );
!
if (!defined($suppress) || !( $_ =~ $suppress )) {
if (!$verbose) {
***************
*** 353,362 ****
} else {
# This creates log output
!
$self->tee_($client, $_);
}
! } else {
$self->log_("Suppressed: $_");
! }
last if ( $_ =~ $regexp );
--- 357,366 ----
} else {
# This creates log output
!
$self->tee_($client, $_);
}
! } else {
$self->log_("Suppressed: $_");
! }
last if ( $_ =~ $regexp );
Index: SMTP.pm
===================================================================
RCS file: /cvsroot/popfile/engine/Proxy/SMTP.pm,v
retrieving revision 1.13
retrieving revision 1.14
diff -C2 -d -r1.13 -r1.14
*** SMTP.pm 14 Jun 2003 21:10:12 -0000 1.13
--- SMTP.pm 9 Jul 2003 18:18:21 -0000 1.14
***************
*** 187,191 ****
next;
}
-
if ( ( $command =~ /MAIL FROM:/i ) ||
--- 187,190 ----
***************
*** 207,214 ****
$count += 1;
! my $class = $self->{classifier__}->classify_and_modify( $client, $mail, $download_count, $count, 0, '' );
# Tell the parent that we just handled a mail
! print $pipe "$class$eol";
my $response = <$mail>;
--- 206,214 ----
$count += 1;
! my ( $class, $history_file ) = $self->{classifier__}->classify_and_modify( $client, $mail, $download_count, $count, 0, '' );
# Tell the parent that we just handled a mail
! print $pipe "CLASS:$class$eol";
! print $pipe "NEWFL:$history_file$eol";
my $response = <$mail>;
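
The diffs above have each proxy child print tagged lines such as CLASS:... and NEWFL:... to its pipe, and Proxy.pm turn them into message queue posts. A minimal, self-contained sketch of that dispatch is below. The mq_post and handle_pipe_line functions, the @posted list, and the example file name are hypothetical; the real code calls $self->mq_post_ and uses unanchored patterns such as /NEWFL:(.*)/.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parent's message queue: records each
# posted (type, message) pair so the dispatch can be inspected.
my @posted;

sub mq_post {
    my ( $type, $message ) = @_;
    push @posted, [ $type, $message ];
}

# Dispatch one line read from a child's pipe. The TYPE:payload prefixes
# mirror the CLASS:, NEWFL: and LOGIN: lines seen in the diffs above.
# Returns 1 if the line was recognised, 0 otherwise.
sub handle_pipe_line {
    my ( $line ) = @_;
    $line =~ s/[\r\n]+$//;    # strip the end-of-line sequence

    if ( $line =~ /^(CLASS|NEWFL|LOGIN):(.*)$/ ) {
        mq_post( $1, $2 );
        return 1;
    }

    return 0;
}

# Simulate what a child writes after handling one message. The bucket
# name and file name here are made up for illustration.
my $eol = "\015\012";
handle_pipe_line( "CLASS:spam$eol" );
handle_pipe_line( "NEWFL:popfile0=1.msg$eol" );

print "$_->[0] => $_->[1]\n" for @posted;
```

Anchoring the pattern at the start of the line is a choice made for this sketch only; it avoids matching a tag that happens to appear inside another message's payload.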