[popfile-commit] engine/UI HTML.pm,1.214,1.215

Update of /cvsroot/popfile/engine/UI
In directory sc8-pr-cvs1:/tmp/cvs-serv28571/UI

Modified Files:
	HTML.pm 
Log Message:
FINAL PREPARATIONS FOR V0.20.0

Bring test suite to as close to 100% coverage as possible
Clean up Japanese/Korean code for better maintainability
Fix problem with random test suite crashes

Proxy/POP3.pm:

Add code to upgrade the welcome_string if the user has not
changed it from the default in a previous version of POPFile.

Classifier/Bayes.pm:

Remove a lot of code that was duplicated because of
the addition of Japanese and Korean support.  To do
so, I created two helper methods: add_words_to_bucket__
and magnet_match__.  There's still some duplicated code
(e.g. calls to these functions) but I can't find a good
way to deal with 'no locale' other than this.

Add hints for the code coverage about blocks of code.

Change the word scores code so that the scores are not
calculated if not required.

UI/HTML.pm:

Handle the display and non-display of the word matrix, and
remove the stickiness of the word table format, since
POPFile has no other 'sticky' values at this point.
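
A minimal sketch of the non-sticky behaviour described above, assuming a
hypothetical helper named effective_format that is not part of POPFile. A
format requested with the page applies to that view only; otherwise the
configured wordtable_format is used, and a blank value means the word
table is not shown.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of POPFile: pick the word table format for
# one page view without writing anything back to the configuration.
sub effective_format {
    my ( $form_format, $configured_format ) = @_;

    # A format supplied with the request wins; otherwise fall back to the
    # configured default.  An empty string means "do not show the table".
    return defined( $form_format ) ? $form_format : $configured_format;
}

# Only build the format links when there is a table to show
my $format = effective_format( undef, '' );
print "word table hidden\n" if ( $format eq '' );
```

Because nothing is stored, the next view falls back to the configured
default again.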

Remove duplicated code introduced by Japanese and
Korean support.
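
The consolidation can be sketched roughly as below. The helper name
first_char and the simplified EUC-JP pattern are assumptions made for
illustration; the real $euc_jp pattern in POPFile may differ, and the
'no locale' handling is left out here.

```perl
use strict;
use warnings;

# Simplified EUC-JP character pattern, assumed for illustration only; the
# real $euc_jp pattern in POPFile may differ.
my $euc_jp = qr/[\xA1-\xFE][\xA1-\xFE]|\x8E[\xA1-\xDF]|\x8F[\xA1-\xFE][\xA1-\xFE]|[\x00-\x7F]/;

# Hypothetical helper: return the leading character used to group stop words.
sub first_char {
    my ( $word, $language ) = @_;

    # Only Nihongo needs the multi-byte pattern; every other language,
    # including Korean, takes the first single character.
    my $pattern = ( $language eq 'Nihongo' ) ? qr/^($euc_jp)/ : qr/^(.)/;

    return ( $word =~ $pattern ) ? $1 : '';
}

# Group a sorted word list by leading character
my %groups;
for my $word ( sort( 'and', 'are', 'but' ) ) {
    push @{ $groups{ first_char( $word, 'English' ) } }, $word;
}
print join( ', ', @{ $groups{a} } ), "\n";
```

With the character extraction isolated like this, the row-building code
only has to exist once, whatever the language.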

tests.pl:

Accept multiple patterns on the command line so that
multiple tests can be specified at once.  Patterns are
separated by commas.  e.g. to run the HTTP and MailParse
test suites do:

gmake test TESTARGS=HTTP,MailParse
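
As a rough illustration of the comma-separated selection, here is a
sketch with a hypothetical select_tests helper. It assumes each pattern
is matched against the test file name as a regular expression; the
actual matching in tests.pl may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the test files whose names match any of the
# comma-separated patterns.  An empty pattern list selects everything.
sub select_tests {
    my ( $patterns, @files ) = @_;

    my @wanted = grep { $_ ne '' } split( /,/, defined( $patterns ) ? $patterns : '' );
    return @files if ( !@wanted );

    # Inner grep runs in scalar context and yields the number of matches
    return grep { my $file = $_; grep { $file =~ /$_/ } @wanted } @files;
}

my @files = ( 'tests/TestHTTP.tst', 'tests/TestMailParse.tst', 'tests/TestPOP3.tst' );
print join( "\n", select_tests( 'HTTP,MailParse', @files ) ), "\n";
```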

license:

Incorporate information about the BerkeleyDB license.

tests/TestWordMangle.tst:

Add tests for Japanese stop word support.

tests/TestModule.tst:

Add calls to dummy parent methods that do nothing, done
to get coverage to 100%.

tests/TestMailParse.tst
tests/TestPOP3.tst
tests/TestHTML.tst:

Make sure to stop the Bayes module to close the database.

tests/TestHTML.script:

Add tests for the new Single Message View where the word
matrix is not expanded by default.  Test with and without
the word matrix and test the different views.

Add tests for Japanese and Korean stop words.

tests/TestProxy.tst:

Add tests for echo_response_'s handling of timeouts.

tests/languages/Korean.msg
tests/languages/Nihongo.msg:

Add the Korean and Japanese language files.

tests/TestMailParse022.cam
tests/TestMailParse022.wrd
tests/TestMailParse022.msg:

Split the From and Subject line to check the new
long header support.
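
The long header handling these files exercise can be sketched as
follows. The helper name read_from_and_subject is hypothetical, and this
is a simplified version of the header loop in UI/HTML.pm that works on a
list of lines rather than a file handle.

```perl
use strict;
use warnings;

# Hypothetical helper: collect From and Subject from raw header lines,
# appending encoded-word continuation lines to the header they follow.
sub read_from_and_subject {
    my ( @lines ) = @_;
    my ( $from, $subject, $long_header ) = ( '', '', '' );

    for my $line ( @lines ) {
        last if ( $line =~ /^(\r\n|\r|\n)/ );    # blank line ends the headers

        # Continuation line that starts with whitespace and an encoded word
        if ( $line =~ /^[\t ]+(=\?[\w-]+\?[BQ]\?.*\?=.*)/ ) {
            $from    .= $1 if ( $long_header eq 'from' );
            $subject .= $1 if ( $long_header eq 'subject' );
            next;
        }

        if ( $line =~ /^From: *(.*)/i ) {
            ( $long_header, $from ) = ( 'from', $1 );
        } elsif ( $line =~ /^Subject: *(.*)/i ) {
            ( $long_header, $subject ) = ( 'subject', $1 );
        } else {
            $long_header = '';
        }
    }

    return ( $from, $subject );
}
```

Tracking the most recent header in $long_header is what lets a
continuation line be attached to the right field.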

tests/TestMailParse019.clr
tests/TestMailParse015.clr:

The information in these files appeared to be wrong, so updated
them to the latest output.

tests/TestMailParse.tst:

Add TestMailParse019 to the colorization tests.

Index: HTML.pm
===================================================================
RCS file: /cvsroot/popfile/engine/UI/HTML.pm,v
retrieving revision 1.214
retrieving revision 1.215
diff -C2 -d -r1.214 -r1.215
*** HTML.pm	10 Oct 2003 20:03:19 -0000	1.214
--- HTML.pm	13 Oct 2003 20:23:40 -0000	1.215
***************
*** 220,225 ****

      # This setting defines what is displayed in the word matrix: 'freq' for frequencies,
!     # 'prob' for probabilities, 'score' for logarithmic scores.
!     $self->config_( 'wordtable_format', 'prob' );

      # Load skins
--- 220,232 ----

      # This setting defines what is displayed in the word matrix: 'freq' for frequencies,
!     # 'prob' for probabilities, 'score' for logarithmic scores, if blank then the word
!     # table is not shown
! 
!     $self->config_( 'wordtable_format', '' );
! 
!     # This setting determines whether when viewing an individual message we show the word
!     # table or not
! 
!     $self->config_( 'show_wordtable', 0 );

      # Load skins
***************
*** 283,289 ****
      # Set the classifier option wmformat__ according to our wordtable_format
      # option.
!   
      $self->{classifier__}->wmformat( $self->config_( 'wordtable_format' ) );
!   
      return $self->SUPER::start();
  }
--- 290,296 ----
      # Set the classifier option wmformat__ according to our wordtable_format
      # option.
! 
      $self->{classifier__}->wmformat( $self->config_( 'wordtable_format' ) );
! 
      return $self->SUPER::start();
  }
***************
*** 1355,1467 ****
      my @words = $self->{classifier__}->get_stopword_list();

!     # In Japanese mode, disable locale.
!     # Sorting Japanese with "use locale" is memory and time consuming,
!     # and may cause perl crash.
! 
!     if ( $self->config_( 'language' ) eq 'Nihongo' ) {
!         no locale;
!         for my $word (sort @words) {
! 
!             # First character of stop word is EUC-JP in Japanese mode
! 
!             $word =~ /^($euc_jp)/;
! 
!             if ( $1 ne $last )  {
!                 if ( !$firstRow ) {
!                     $body .= "</td></tr>\n";
!                 } else {
!                     $firstRow = 0;
!                 }
!                 $body .= "<tr><th scope=\"row\" class=\"advancedAlphabet";
!                 if ( $groupCounter == $groupSize ) {
!                     $body .= "GroupSpacing";
!                 }
!                 $body .= "\"><b>$1</b></th>\n";
!                 $body .= "<td class=\"advancedWords";
!                 if ( $groupCounter == $groupSize ) {
!                     $body .= "GroupSpacing";
!                     $groupCounter = 0;
!                 }
!                 $body .= "\">";
!                 $last = $1;
!                 $need_comma = 0;
!                 $groupCounter += 1;
!             }
!             if ( $need_comma == 1 ) {
!                 $body .= ", $word";
!             } else {
!                 $body .= $word;
!                 $need_comma = 1;
              }
          }
-     } else {
-         if ( $self->config_( 'language' ) eq 'Korean' ) {
-     	    # don't use locale in Korean mode. Every other code is same
-     	    no locale;
-             for my $word (sort @words) {
-                 $word =~ /^(.)/;
- 
-                 if ( $1 ne $last )  {
-                     if (! $firstRow) {
-                         $body .= "</td></tr>\n";
-                     } else {
-                         $firstRow = 0;
-                     }
-                     $body .= "<tr><th scope=\"row\" class=\"advancedAlphabet";
-                     if ($groupCounter == $groupSize) {
-                         $body .= "GroupSpacing";
-                     }
-                     $body .= "\"><b>$1</b></th>\n";
-                     $body .= "<td class=\"advancedWords";
-                     if ($groupCounter == $groupSize) {
-                         $body .= "GroupSpacing";
-                         $groupCounter = 0;
-                     }
-                     $body .= "\">";

!                     $last = $1;
!                     $need_comma = 0;
!                     $groupCounter += 1;
!                 }
!                 if ( $need_comma == 1 ) {
!                     $body .= ", $word";
!                 } else {
!                     $body .= $word;
!                     $need_comma = 1;
!                 }
              }
!         } else {
!             for my $word (sort @words) {
!                 $word =~ /^(.)/;
! 
!                 if ( $1 ne $last )  {
!                     if (! $firstRow) {
!                         $body .= "</td></tr>\n";
!                     } else {
!                         $firstRow = 0;
!                     }
!                     $body .= "<tr><th scope=\"row\" class=\"advancedAlphabet";
!                     if ($groupCounter == $groupSize) {
!                         $body .= "GroupSpacing";
!                     }
!                     $body .= "\"><b>$1</b></th>\n";
!                     $body .= "<td class=\"advancedWords";
!                     if ($groupCounter == $groupSize) {
!                         $body .= "GroupSpacing";
!                         $groupCounter = 0;
!                     }
!                     $body .= "\">";
! 
!                     $last = $1;
!                     $need_comma = 0;
!                     $groupCounter += 1;
!                 }
!                 if ( $need_comma == 1 ) {
!                     $body .= ", $word";
!                 } else {
!                     $body .= $word;
!                     $need_comma = 1;
!                 }
              }
          }
      }
--- 1362,1408 ----
      my @words = $self->{classifier__}->get_stopword_list();

!     for my $word (sort @words) {
!         my $c;
!         if ( $self->config_( 'language' ) =~ /^Korean$/ ) {
!             no locale;
!             $word =~ /^(.)/;
!             $c = $1;
! 	} else {
!     	    if ( $self->config_( 'language' ) =~ /^Nihongo$/ ) {
!                no locale;
!                $word =~ /^($euc_jp)/;
!                $c = $1;
! 	    } else {
!                $word =~ /^(.)/;
!                $c = $1;
              }
          }

!         if ( $c ne $last ) {
!             if ( !$firstRow ) {
!                 $body .= "</td></tr>\n";
!             } else {
!                 $firstRow = 0;
              }
!             $body .= "<tr><th scope=\"row\" class=\"advancedAlphabet";
!             if ( $groupCounter == $groupSize ) {
!                 $body .= "GroupSpacing";
!             }
!             $body .= "\"><b>$c</b></th>\n";
!             $body .= "<td class=\"advancedWords";
!             if ( $groupCounter == $groupSize ) {
!                 $body .= "GroupSpacing";
!                 $groupCounter = 0;
              }
+             $body .= "\">";
+             $last = $c;
+             $need_comma = 0;
+             $groupCounter += 1;
+         }
+         if ( $need_comma == 1 ) {
+             $body .= ", $word";
+         } else {
+             $body .= $word;
+             $need_comma = 1;
          }
      }
***************
*** 2675,2700 ****
      if ( open MAIL, '<'. $self->global_config_( 'msgdir' ) . $file ) {
          while ( <MAIL> )  {
!             last          if ( /^(\r\n|\r|\n)/ );

              # Support long header that has more than 2 lines

!             if(/^[\t ]+(=\?[\w-]+\?[BQ]\?.*\?=.*)/){
!                 if($long_header eq 'from'){
                      $from .= $1;
                      next;
                  }
!                 if($long_header eq 'subject'){
                      $subject .= $1;
                      next;
                  }
!             }else{
!                 if(/^From: *(.*)/i){
                      $long_header = 'from';
                      $from = $1;
                      next;
!                 }elsif (/^Subject: *(.*)/i){
!                     $long_header = 'subject';
!                     $subject = $1;
!                     next;
                  }
                  $long_header = '';
--- 2616,2644 ----
      if ( open MAIL, '<'. $self->global_config_( 'msgdir' ) . $file ) {
          while ( <MAIL> )  {
!             last if ( /^(\r\n|\r|\n)/ );

              # Support long header that has more than 2 lines

!             if ( /^[\t ]+(=\?[\w-]+\?[BQ]\?.*\?=.*)/ ) {
!                 if ( $long_header eq 'from' ) {
                      $from .= $1;
                      next;
                  }
! 
!                 if ( $long_header eq 'subject' ) {
                      $subject .= $1;
                      next;
                  }
!             } else {
!                 if ( /^From: *(.*)/i ) {
                      $long_header = 'from';
                      $from = $1;
                      next;
!                 } else {
!                     if ( /^Subject: *(.*)/i ) {
!                         $long_header = 'subject';
!                         $subject = $1;
!                         next;
! 		    }
                  }
                  $long_header = '';
***************
*** 2731,2735 ****
          # Do not truncate at 39 if the last char is the first byte of DBCS char(pair of two bytes).
          # Truncate it 1 byte shorter.
!         if (( $self->config_( 'language' ) eq 'Korean' ) || ( $self->config_( 'language' ) eq 'Nihongo' )) {
              $short_subject = $1;
              $short_subject =~ s/(([\x80-\xff].)*)[\x80-\xff]?$/$1/;
--- 2675,2679 ----
          # Do not truncate at 39 if the last char is the first byte of DBCS char(pair of two bytes).
          # Truncate it 1 byte shorter.
!         if ( $self->config_( 'language' ) =~ /^Korean|Nihongo$/ ) {
              $short_subject = $1;
              $short_subject =~ s/(([\x80-\xff].)*)[\x80-\xff]?$/$1/;
***************
*** 3445,3461 ****
      $self->{form_}{search} = '' if ( !defined( $self->{form_}{search} ) );
      $self->{form_}{filter} = '' if ( !defined( $self->{form_}{filter} ) );
!     $self->{form_}{format} = '' if ( !defined( $self->{form_}{format} ) );
!   
      # If a format change was requested for the word matrix, record it in the
      # configuration and in the classifier options.
!   
!     if ( $self->{form_}{format} ne '' ) {
!         $self->config_( 'wordtable_format', $self->{form_}{format} );
!         $self->{classifier__}->wmformat( $self->{form_}{format} );
!     }

      my $index = -1;

!    foreach my $i ( 0 .. $self->history_size()-1 ) {
          if ( $self->{history_keys__}[$i] eq $mail_file ) {
              use integer;
--- 3389,3402 ----
      $self->{form_}{search} = '' if ( !defined( $self->{form_}{search} ) );
      $self->{form_}{filter} = '' if ( !defined( $self->{form_}{filter} ) );
!     $self->{form_}{format} = $self->config_( 'wordtable_format' ) if ( !defined( $self->{form_}{format} ) );
! 
      # If a format change was requested for the word matrix, record it in the
      # configuration and in the classifier options.
! 
!     $self->{classifier__}->wmformat( $self->{form_}{format} );

      my $index = -1;

!     foreach my $i ( 0 .. $self->history_size()-1 ) {
          if ( $self->{history_keys__}[$i] eq $mail_file ) {
              use integer;
***************
*** 3553,3565 ****

          my $view = $self->{language__}{View_WordProbabilities};
!         if ( $self->config_( 'wordtable_format' ) eq 'freq' ) {
              $view = $self->{language__}{View_WordFrequencies};
  	}
!         if ( $self->config_( 'wordtable_format' ) eq 'score' ) {
              $view = $self->{language__}{View_WordScores};
  	}

!         $fmtlinks = "<table width=\"100%\">\n<td class=\"top20\" align=\"left\"><b>$self->{language__}{View_WordMatrix} ($view)</b></td>\n<td class=\"historyNavigatorTop\">\n";
!         if ($self->config_( 'wordtable_format' ) ne 'freq' ) {
              $fmtlinks .= "<a href=\"/view?view=" . $self->{history_keys__}[ $index ];
              $fmtlinks .= "&start_message=". ((( $index ) >= $start_message )?$start_message:($start_message - $self->config_( 'page_size' )));
--- 3494,3508 ----

          my $view = $self->{language__}{View_WordProbabilities};
!         if ( $self->{form_}{format} eq 'freq' ) {
              $view = $self->{language__}{View_WordFrequencies};
  	}
!         if ( $self->{form_}{format} eq 'score' ) {
              $view = $self->{language__}{View_WordScores};
  	}

!         if ( $self->{form_}{format} ne '' ) {
!             $fmtlinks = "<table width=\"100%\">\n<td class=\"top20\" align=\"left\"><b>$self->{language__}{View_WordMatrix} ($view)</b></td>\n<td class=\"historyNavigatorTop\">\n";
! 	}
!         if ($self->{form_}{format} ne 'freq' ) {
              $fmtlinks .= "<a href=\"/view?view=" . $self->{history_keys__}[ $index ];
              $fmtlinks .= "&start_message=". ((( $index ) >= $start_message )?$start_message:($start_message - $self->config_( 'page_size' )));
***************
*** 3568,3572 ****
              $fmtlinks .= "</a> &nbsp;\n";
          }
!         if ($self->config_( 'wordtable_format' ) ne 'prob' ) {
              $fmtlinks .= "<a href=\"/view?view=" . $self->{history_keys__}[ $index ];
              $fmtlinks .= "&start_message=". ((( $index ) >= $start_message )?$start_message:($start_message - $self->config_( 'page_size' )));
--- 3511,3515 ----
              $fmtlinks .= "</a> &nbsp;\n";
          }
!         if ($self->{form_}{format} ne 'prob' ) {
              $fmtlinks .= "<a href=\"/view?view=" . $self->{history_keys__}[ $index ];
              $fmtlinks .= "&start_message=". ((( $index ) >= $start_message )?$start_message:($start_message - $self->config_( 'page_size' )));
***************
*** 3575,3579 ****
              $fmtlinks .= "</a> &nbsp;\n";
          }
!         if ($self->config_( 'wordtable_format' ) ne 'score' ) {
              $fmtlinks .= "<a href=\"/view?view=" . $self->{history_keys__}[ $index ];
              $fmtlinks .= "&start_message=". ((( $index ) >= $start_message )?$start_message:($start_message - $self->config_( 'page_size' )));
--- 3518,3522 ----
              $fmtlinks .= "</a> &nbsp;\n";
          }
!         if ($self->{form_}{format} ne 'score' ) {
              $fmtlinks .= "<a href=\"/view?view=" . $self->{history_keys__}[ $index ];
              $fmtlinks .= "&start_message=". ((( $index ) >= $start_message )?$start_message:($start_message - $self->config_( 'page_size' )));
***************
*** 3582,3587 ****
              $fmtlinks .= "</a> \n";
          }
!         $fmtlinks .= "</a></td></table>";
!   
          # Enable saving of word-scores

--- 3525,3532 ----
              $fmtlinks .= "</a> \n";
          }
!         if ( $self->{form_}{format} ne '' ) {
!             $fmtlinks .= "</a></td></table>";
! 	}
! 
          # Enable saving of word-scores

***************
*** 3640,3652 ****

      if ($self->{history__}{$mail_file}{magnet} eq '') {
!         my $score_text = $self->{classifier__}->scores();
!         $score_text =~ s/\<\!--format--\>/$fmtlinks/;
!         $body .= $score_text;
!         $self->{classifier__}->scores('');
      } else {
!         $body .= sprintf(   $self->{language__}{History_MagnetBecause},                                # PROFILE BLOCK START
!                             $color, $bucket,
!                             Classifier::MailParse::splitline($self->{history__}{$mail_file}{magnet},0)
!                             );                                                                         # PROFILE BLOCK STOP
      }

--- 3585,3597 ----

      if ($self->{history__}{$mail_file}{magnet} eq '') {
!          my $score_text = $self->{classifier__}->scores();
!          $score_text =~ s/\<\!--format--\>/$fmtlinks/;
!          $body .= $score_text;
!          $self->{classifier__}->scores('');
      } else {
!         $body .= sprintf( $self->{language__}{History_MagnetBecause},                                # PROFILE BLOCK START
!                           $color, $bucket,
!                           Classifier::MailParse::splitline($self->{history__}{$mail_file}{magnet},0)
!                           );                                                                         # PROFILE BLOCK STOP
      }