From: <jgr...@us...> - 2003-07-09 18:18:21
Update of /cvsroot/popfile/engine/Devel
In directory sc8-pr-cvs1:/tmp/cvs-serv2831/Devel
Added Files:
TestCoverage.pm
Log Message:
PERFORMANCE CHANGES
Bayes.pm: Added new add_messages_to_bucket API to add multiple messages to a
bucket at the same time with a single read/write of the appropriate
corpus table for speed.
New write_line__ method to write a line to a MSG file and optionally
to the parse_line API of MailParse.pm. Now we write a file to disk
and parse it without reloading the MSG file from disk for speed.
The MSG file gets a temporary name until the CLS file is written.
This prevents the history from reloading in the middle of a download
and ending up with a message that has a class file error.
classify_file becomes classify and can classify either from a file
or from the preparsed information in the parser
classify_and_modify returns the name of the file where the message
was stored in addition to the classification.
HTML.pm: Use the add_messages_to_bucket API for reclassification, for speed.
Use the new classify method in Bayes.pm to classify a file after it
has been digested by the parser for colorization and get the word
scores. This means we only load the MSG file once (used to be
twice) and hence double the speed of viewing a colorized message.
New methods load_disk_cache__ and save_disk_cache__ are used to
keep a copy of the history cache on disk between sessions so that
session start up is as fast as possible. There will be no need
to parse messages for header information on start up if the
history_cache file is present.
Removed the boundary feature because it is incompatible with the
concept of a "download" since we now send new history file messages
async. through the MQ.
Load the history cache progressively as files are written. The proxies
send the message NEWFL and the method new_history_file__ adds the
file to the history. This is done so that when the user hits the
History tab button after a mail download the history cache is
already loaded and there should be no delay in displaying the
history page.
MailParse.pm: Renamed parse_stream to parse_file since that's a better name
New start_parse, stop_parse and parse_line APIs so that a file can
be parsed line by line.
MQ.pm: Defined a new message type NEWFL which is used to indicate that
a file has been added to the history cache. NEWFL's message
is the name of the file (the MSG file) that was added.
POP3.pm: Send the NEWFL message through the pipe to the parent so that
the history is aware of new messages.
SMTP.pm:
NNTP.pm: Send CLASS and NEWFL messages through the pipe to the parent.
insert.pl: Updated to use new parse_file API
bayes.pl: Updated to use the new classify API instead of classify_file.
TEST SUITE CHANGES
tests.pl: New test_assert_regexp function for doing fuzzy matching of
test results.
Returns 0 if all tests run successfully, and 1 if there are
any errors
TestLogger.tst: New file for testing POPFile::Logger functionality.
Makefile: The test target has a variable, TESTARGS, that can be set to the
specific module (or modules using glob patterns) to run.
For example: gmake test TESTARGS='TestLogger'
There's a new coverage target to run the test suite and output
code coverage information for the modules used.
TestCoverage.pm: New module that provides line coverage information for
the test suite. Executed as a Perl debugger using the -d
switch and outputs code coverage information for all
POPFile files tested.
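The fuzzy-matching assertion and the 0/1 exit status described for tests.pl above can be pictured with the minimal Perl sketch below. It is not the code from tests.pl: the argument order of test_assert_regexp, the counter variables, and the exit_status helper are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical counters; the real tests.pl may track results differently
my ( $test_count, $fail_count ) = ( 0, 0 );

# Assumed signature: the value under test, a regular expression it should
# match, and a description used when reporting a failure.  Returns 1 when
# the value matches the pattern and 0 otherwise
sub test_assert_regexp
{
    my ( $value, $pattern, $description ) = @_;

    $test_count += 1;

    if ( defined( $value ) && ( $value =~ /$pattern/ ) ) {
        return 1;
    }

    $fail_count += 1;
    print "FAILED: $description\n";
    return 0;
}

# Hypothetical helper: 0 when every assertion passed, 1 otherwise, which
# is the value a test driver could hand to exit()
sub exit_status
{
    return ( $fail_count == 0 ) ? 0 : 1;
}

# Fuzzy match: only the shape of the timestamp is checked, not its value
test_assert_regexp( '2003/7/9 18:18:21 logger started',
                    '^\d+/\d+/\d+ \d+:\d+:\d+ ',
                    'log line starts with a timestamp' );

print "Ran $test_count test(s), exit status would be ", exit_status(), "\n";
```

A regular expression assertion of this kind is useful when part of the output, such as a timestamp, changes on every run and an exact string comparison would always fail.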
--- NEW FILE: TestCoverage.pm ---
# ---------------------------------------------------------------------------------------------
#
# Devel::TestCoverage - Module to measure code coverage in the test suite
#
# Copyright (c) 2001-2003 John Graham-Cumming
#
# ---------------------------------------------------------------------------------------------
package Devel::TestCoverage;
package DB;
# This hash will store a count of the number of times each line is executed
# in each file, it is in fact a hash of hashes used as
# $count{filename}{linenumber}
my %count;
# This is called when we begin the code coverage (or debugging) session
BEGIN
{
# We want to look inside subroutines so tell the debugger to trace into
# them
$DB::trace = 1;
}
# Perl will call this function for every line of code it executes. We keep
# a count for each time a line is executed
sub DB
{
# The caller function will tell us what line of code, in which file and
# package called us
my ($package, $file, $line) = caller;
# A specific line in a specific file just got executed; we skip
# references to eval code that we won't have traced into
$count{$file}{$line} += 1 if ( ( $file =~ /\(eval/ ) == 0 );
}
END
{
# This hash will map file names of POPFile modules to coverage
my %files;
# Print out information for each file
for my $file (keys %count)
{
if ( ( $file =~ /^[^\/]/ ) && ( $file ne 'tests.pl' ) ) {
my $current_line = 0;
open SOURCE_FILE, "<$file" or next;
# Read in each line of the source file and keep track of whether
# it was executed or not using three extra keys in the
# %count hash for each file: total_lines, total_executable_lines
# and total_executed
while (<SOURCE_FILE>)
{
# Keep count of the total number of lines in this file
$current_line += 1;
$count{$file}{total_lines} += 1;
# We do not count lines that are blank or exclusively
# comments or just have braces on them or
# just an else or just a subroutine definition
if ( ( /^\s*\#/ == 0 ) && ( /^\s*$/ == 0 ) && ( /^\s*(\{|\}|else)\s*$/ == 0 ) && ( /^\s*sub \w+( \{)?\s*$/ == 0 ) )
{
$count{$file}{total_executable_lines} += 1;
# If this line was executed then keep count of
# that fact
if ( $count{$file}{$current_line} > 0 ) {
$count{$file}{total_executed} += 1;
}
}
}
$files{$file} = int(100 * $count{$file}{total_executed} / $count{$file}{total_executable_lines}) unless ( $count{$file}{total_executable_lines} == 0 );
close SOURCE_FILE;
}
}
foreach my $file (sort {$files{$b} <=> $files{$a}} keys %files) {
print sprintf( "Coverage of %-32s %d%%\n", "$file...", $files{$file});
}
}
1;
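The percentage calculation performed in the END block can be tried in isolation with the sketch below, which needs neither the debugger nor any files on disk. The four regular expressions are copied from the module; the helper names is_executable_line and coverage_percent, and the sample source lines and hit counts, are assumptions introduced only for this illustration.

```perl
use strict;
use warnings;

# Returns 1 when a source line should count as executable, using the same
# four exclusions as the END block: comments, blank lines, lines with only
# a brace or "else", and bare subroutine definitions
sub is_executable_line
{
    my ( $text ) = @_;

    return 0 if ( $text =~ /^\s*\#/ );
    return 0 if ( $text =~ /^\s*$/ );
    return 0 if ( $text =~ /^\s*(\{|\}|else)\s*$/ );
    return 0 if ( $text =~ /^\s*sub \w+( \{)?\s*$/ );

    return 1;
}

# Hypothetical helper: $lines is a reference to an array of source lines
# and $hits maps 1-based line numbers to execution counts.  Returns the
# integer percentage of executable lines that were executed, or undef when
# the file has no executable lines
sub coverage_percent
{
    my ( $lines, $hits ) = @_;
    my ( $executable, $executed ) = ( 0, 0 );

    for my $i ( 0 .. $#{$lines} ) {
        next unless is_executable_line( $lines->[$i] );

        $executable += 1;
        $executed   += 1 if ( ( $hits->{ $i + 1 } || 0 ) > 0 );
    }

    return undef if ( $executable == 0 );
    return int( 100 * $executed / $executable );
}

# Made-up sample: lines 4, 5 and 8 are executable, lines 4 and 8 were run
my @source = (
    '# a comment',
    'sub example',
    '{',
    '    my $x = 1;',
    '    return $x;',
    '}',
    '',
    'example();',
);
my %hits = ( 4 => 1, 8 => 1 );

print coverage_percent( \@source, \%hits ), "%\n";
```

With this sample two of the three executable lines were hit, so the sketch prints 66%, the same int() truncation the module applies before printing its per-file report.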