Update of /cvsroot/popfile/engine/POPFile
In directory sc8-pr-cvs1:/tmp/cvs-serv27765/POPFile
Modified Files:
Loader.pm Module.pm
Log Message:
PORT TO STORE CORPUS IN BERKELEYDB DATABASES
Bayes.pm:
The $self->{matrix__} hash is now a collection of tied hashes
to BerkeleyDB databases named table.db in each of the corpus
bucket subdirectories. The set_value_ and get_value_ accessors
have been modified to access the database. load_word_matrix_
and load_bucket_ now load the bucket information from the database
in concurrent mode.
prefork, forked and postfork handle closing and reopening database
connections around forks to ensure that there are no threading
problems with the database.
close_database__ can be called to clean up the connection to the
database at any time.
Many API functions have been modified internally to use the new
structure. The external APIs have not changed. get_bucket_word_list
is currently not implemented.
load_bucket_ automatically upgrades a corpus in the old flat file
format to the database.
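A minimal sketch of the tied-hash layout described above, not the
actual Bayes.pm code. As an assumption made so that the example runs
without extra modules, it uses the core SDBM_File module as a stand-in
for BerkeleyDB; the function names (open_bucket, sketch_set_value,
sketch_get_value, sketch_close_database) and the temporary corpus
directory are hypothetical.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # core DBM, standing in for BerkeleyDB
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical corpus root; one subdirectory per bucket
my $corpus = tempdir( CLEANUP => 1 );

# Plays the role of $self->{matrix__}: bucket name => tied hash ref
my %matrix;

# Tie a hash to the "table" database inside the bucket subdirectory
sub open_bucket
{
    my ( $bucket ) = @_;

    my $dir = File::Spec->catdir( $corpus, $bucket );
    mkdir $dir unless -d $dir;

    my %table;
    tie %table, 'SDBM_File', File::Spec->catfile( $dir, 'table' ),
        O_RDWR | O_CREAT, 0666
        or die "Cannot open database for bucket $bucket: $!";

    $matrix{$bucket} = \%table;
}

# Accessors go straight through the tied hash to the database
sub sketch_set_value
{
    my ( $bucket, $word, $count ) = @_;
    $matrix{$bucket}{$word} = $count;
}

sub sketch_get_value
{
    my ( $bucket, $word ) = @_;
    return $matrix{$bucket}{$word} // 0;
}

# Untie every bucket so that no database handle stays open
sub sketch_close_database
{
    foreach my $bucket ( keys %matrix ) {
        untie %{ $matrix{$bucket} };
        delete $matrix{$bucket};
    }
}

open_bucket( 'spam' );
sketch_set_value( 'spam', 'offer', 3 );
sketch_close_database();

# Reopening shows that the count was stored on disk, not in memory
open_bucket( 'spam' );
print "offer in spam: ", sketch_get_value( 'spam', 'offer' ), "\n";
```

Because callers only ever go through the accessors, the storage
behind them can change without touching the external API, which is
the property the log message relies on.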
HTML.pm:
Added a note that since get_bucket_word_list isn't working it is
not possible to view the words in a bucket.
Module.pm:
Added description and base implementation of the new postfork()
method, which is called on all modules in the parent process after
a fork has occurred. This is the parent equivalent of forked().
Loader.pm:
The forker is modified to call postfork() in the parent process
after a successful fork.
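
The hook sequence described above can be sketched as follows. This is
an illustration under stated assumptions, not the Loader.pm code: the
Sketch::Module package, the %components table, each_component and
fork_with_hooks are hypothetical, and the db_open flag merely stands
in for a real database connection. Only the hook names prefork,
forked and postfork come from the log message.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a POPFile module
package Sketch::Module;

sub new
{
    my ( $class ) = @_;
    return bless { db_open => 1, log => [] }, $class;
}

# Before the fork: close the connection so it is not shared
sub prefork
{
    my ( $self ) = @_;
    $self->{db_open} = 0;
    push @{ $self->{log} }, 'prefork';
}

# In the child after the fork: open a fresh connection
sub forked
{
    my ( $self ) = @_;
    $self->{db_open} = 1;
    push @{ $self->{log} }, 'forked';
}

# In the parent after the fork: reopen the connection
sub postfork
{
    my ( $self ) = @_;
    $self->{db_open} = 1;
    push @{ $self->{log} }, 'postfork';
}

package main;

my %components = ( classifier => { bayes => Sketch::Module->new() } );

# Call one hook on every registered component
sub each_component
{
    my ( $method ) = @_;
    foreach my $type ( keys %components ) {
        foreach my $name ( keys %{ $components{$type} } ) {
            $components{$type}{$name}->$method();
        }
    }
}

sub fork_with_hooks
{
    each_component( 'prefork' );

    pipe( my $reader, my $writer ) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ( $pid == 0 ) {
        # Child: run forked(), report the hook order, then exit
        close $reader;
        each_component( 'forked' );
        print $writer join( ',', @{ $components{classifier}{bayes}{log} } ), "\n";
        close $writer;
        exit 0;
    }

    # Parent: run postfork() on all modules after a successful fork
    each_component( 'postfork' );
    close $writer;
    return ( $pid, $reader );
}

my ( $pid, $reader ) = fork_with_hooks();
chomp( my $child_log = <$reader> );
close $reader;
waitpid( $pid, 0 );

my $parent_log = join( ',', @{ $components{classifier}{bayes}{log} } );
print "child:  $child_log\n";
print "parent: $parent_log\n";
```

Each process ends up with its own connection: the child reopens in
forked() and the parent reopens in postfork(), so nothing opened
before the fork is used on both sides.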
Index: Loader.pm
===================================================================
RCS file: /cvsroot/popfile/engine/POPFile/Loader.pm,v
retrieving revision 1.5
retrieving revision 1.6
diff -C2 -d -r1.5 -r1.6
*** Loader.pm 31 Jul 2003 16:32:21 -0000 1.5
--- Loader.pm 10 Sep 2003 03:54:15 -0000 1.6
***************
*** 257,260 ****
--- 257,266 ----
# process
+ foreach my $type (keys %{$self->{components__}}) {
+ foreach my $name (keys %{$self->{components__}{$type}}) {
+ $self->{components__}{$type}{$name}->postfork();
+ }
+ }
+
close $writer;
return ($pid, $reader);
Index: Module.pm
===================================================================
RCS file: /cvsroot/popfile/engine/POPFile/Module.pm,v
retrieving revision 1.11
retrieving revision 1.12
diff -C2 -d -r1.11 -r1.12
*** Module.pm 31 Jul 2003 16:32:21 -0000 1.11
--- Module.pm 10 Sep 2003 03:54:15 -0000 1.12
***************
*** 50,53 ****
--- 50,56 ----
# process and should be used to clean up
#
+ # postfork() - called in the parent process to tell it that the fork has occurred. This is
+ # like forked but in the parent
+ #
# reaper() - called when a process has terminated to give a module a chance to do
# whatever clean up is needed
***************
*** 256,259 ****
--- 259,277 ----
# ---------------------------------------------------------------------------------------------
sub forked
+ {
+ my ( $self ) = @_;
+ }
+
+ # ---------------------------------------------------------------------------------------------
+ #
+ # postfork
+ #
+ # This is called when some module has just forked POPFile. It is called in the parent
+ # process.
+ #
+ # There is no return value from this method
+ #
+ # ---------------------------------------------------------------------------------------------
+ sub postfork
{
my ( $self ) = @_;