From: John Graham-C. <jg...@jg...> - 2003-06-11 15:10:44
INTRODUCTION
For the first time in POPFile history we are releasing two versions at the
same time. Please take a moment to decide which version is most appropriate
for you.
v0.18.2 is a minor bug fix update to v0.18.1 (released in February) and
contains no new features. It is intended for conservative users who are
happy with POPFile as it stands and would just like bug fixes and nothing
else.
v0.19.0 is a major update to v0.18.1 and includes everything that is fixed
in v0.18.2 plus much more. Full details are below.
A special thank you goes out to Sam Schinke, who has contributed a great
deal to this release, particularly with his work fixing outstanding bugs and
porting the fixes from v0.19.0 to v0.18.2. Nice work Sam.
Brian Smith also needs your applause for major updates to the Windows
installer, making it easier than ever to install and configure POPFile on
Windows.
I recommend v0.19.0 because of the many improvements, but v0.18.2 is the
STABLE release for those who want rock solid reliability and want others to
do the bug hunting on the new v0.19.x line :-)
(Of course, if you are really conservative you might like to download and
use v0.18.1, which has been in use by many people since February 27.)
ESSENTIAL READING IF YOU ARE UPGRADING TO v0.19.0
1. BACK UP YOUR OLD INSTALLATION: POPFile makes this really easy: just copy
the entire POPFile directory somewhere. You can then safely install
POPFile v0.19.0 on top of your current installation; I just think a backup
is a sensible precaution.
2. ACCURACY MIGHT DROP FOR A SHORT WHILE: because of some changes made in
the mail parser it is possible that you might see accuracy drop initially
and you may find yourself reclassifying a few messages that used to be
classified correctly. This is unfortunate but necessary: v0.19.0
incorporates changes that make POPFile's classification accuracy better
than before; however, old corpuses might need a little retraining.
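The backup in step 1 above can also be scripted. The following is a minimal Python sketch, not part of POPFile itself; the function name and the timestamped folder naming are assumptions made for illustration.

```python
import shutil
import time
from pathlib import Path


def backup_popfile(install_dir, backup_root):
    """Copy the entire POPFile directory into a new timestamped folder.

    Both paths are supplied by the caller; nothing here is specific to
    where POPFile is actually installed on your machine.
    """
    src = Path(install_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-backup-{stamp}"
    # copytree refuses to overwrite an existing destination, so an
    # earlier backup is never clobbered by accident.
    shutil.copytree(src, dest)
    return dest
```

Run it before installing v0.19.0 on top of the old version; restoring is just copying the folder back.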
DOWNLOADING
You can obtain the latest releases of POPFile by visiting
http://sourceforge.net/project/showfiles.php?group_id=63137
UPGRADING
Just install POPFile on top of the currently installed version. But did you
read the ESSENTIAL READING above first if you are upgrading from a
pre-v0.19.0 version?
FAQ
zonk3r has spent a great deal of time on a POPFile FAQ. Please check it out
as it covers many questions that you might have:
http://sourceforge.net/docman/display_doc.php?docid=14421&group_id=63137
THE GORY DETAILS FOR v0.18.2
1. Fixed bug where -toptoo (now -pop3_toptoo) would cause duplicates in
the history.
Reports: 701390 () (fixed by sschinke)
2. Fixed a bug where -toptoo resulted in message overwriting and the wrong
message classification.
Reports: 705448, 694002 () (fixed by sschinke)
3. Fix warning on unclassified messages
Reports: 697278 (jdeifik)
4. Remove lines in headers consisting of only whitespace (Eudora has
trouble with them).
Patch: esniper
Reports: 701981 (esniper)
5. Fix colorized display of quoted-printable HTML
Reports: 699098 (mfichtner)
6. Recognize protocol-less href's
Reports: biljir
7. Re-work of header handling to handle multi-line headers. This will make
our handling of MIME messages much more robust.
Reports: 695565 (biljir), 702215 (gdvieira), 702316 (spf)
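To illustrate what items 4 and 7 are about, here is a small Python sketch of header unfolding. It is not POPFile's Perl code; it only follows the usual mail convention that a header line starting with a space or tab continues the previous header, and the function name is made up for this example.

```python
def unfold_headers(raw_lines):
    """Return one string per header, joining folded continuation lines.

    Reading stops at the first empty line (end of the header block).
    Lines that contain only whitespace are dropped, since some mail
    clients have trouble with them.
    """
    headers = []
    for line in raw_lines:
        line = line.rstrip("\r\n")
        if line == "":
            break  # blank line: the message body starts here
        if line.strip() == "":
            continue  # whitespace-only line inside the headers
        if line[0] in " \t" and headers:
            # Continuation of the previous header
            headers[-1] += " " + line.strip()
        else:
            headers.append(line)
    return headers
```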
8. Fixed bug preventing decoding when headers are in an unexpected order
Reports: 729551 (fibrizo)
9. Allow "+" to be submitted encoded in a form
Reports: 719989 (thedonga)
10. Fixed unrequested History deletion and history not going back far enough
Reports: 703364 (besonen), 708387 (beej) (and others)
11. Fixed XPL problems with filtered history view
Reports: 697046 (helphand), 692673 (nobody)
12. Restored main history form to POST
Reports: 690451 (nobody)
13. Fixed reporting of remaining child pid's
Reports: none
14. Make hostname decoding case-insensitive
Reports: none
15. Fix odd return value from classify_file for some messages: "unclassified"
Reports: none
THE GORY DETAILS FOR v0.19.0
All of the above plus... (sschinke had a hand in lots of this too)
1. Large overhaul of the magnets system: there are now Cc magnets, magnets
are editable from the Magnets page, and a new feature called
QuickMagnets allows you to create magnets from received emails
by selecting parts of the From, To, Cc and Subject lines.
Requests: 676341 (reason1000)
2. When an email is clicked on to see the colored version a new page
appears with complete coloring (including for the pseudowords, which
now have popups on the colored areas) and a new full dump of the
probabilities used in the calculation of the message classification.
Requests: lost track
3. On the Buckets page there are new statistics counting the number of
errors made by bucket. For each bucket we have the number of false
positives (i.e. mails that went in the bucket that should not have)
and the number of false negatives (i.e. mails that should have been
in that bucket but were not).
Requests: 692600 (nobody)
In addition when clicking on an individual bucket to look at the words
in it a new page appears with a clickable index for speed and words
are sorted by frequency so that you can see the most important words
in each bucket.
Requests: 691386 (philiplaw)
From the bucket page it is possible to remove all the words in a bucket
so that you can start retraining a bucket without having to delete
and reinsert it.
Requests: 675983 (nobody)
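The false positive and false negative counts in item 3 can be pictured with a short Python sketch. This is only an illustration of the definitions given above, not how POPFile stores its statistics; the input format (pairs of assigned and correct bucket) is an assumption.

```python
from collections import Counter


def bucket_errors(results):
    """Count per-bucket errors from (assigned_bucket, correct_bucket) pairs.

    A misclassified mail is a false positive for the bucket it went
    into and a false negative for the bucket it should have gone into.
    """
    false_positives = Counter()
    false_negatives = Counter()
    for assigned, correct in results:
        if assigned != correct:
            false_positives[assigned] += 1
            false_negatives[correct] += 1
    return false_positives, false_negatives
```

For example, a work mail that lands in spam adds one false positive to spam and one false negative to work.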
4. The Configuration page is now pluggable so that as new modules are
written their elements appear in the configuration page automatically.
Requests: none
There is a link to the current log file so that you can access it
from within the browser.
Requests:
5. The Security page is now pluggable so that as new modules are
written their elements appear in the Security page automatically.
Requests: none
6. The Advanced tab no longer uses a hard coded set of ignored words,
they are now stored in a file and can be edited through the UI.
Requests: lost track
7. POPFile's internal structure has undergone a large change to make
use of Perl's object oriented features and there is now a common
base class for all POPFile modules (called POPFile::Module) and as
a result many of the POPFile modules have been simplified greatly.
An offshoot of this is that there are a number of experimental modules
that you can obtain only if you get the CVS version:
SMTP.pm - SMTP proxy that does SMTP mail classification
NNTP.pm - NNTP proxy that does NNTP Usenet news post classification
XMLRPC.pm - Full access to POPFile's API (see Classifier::Bayes) through
XML-RPC
The available XML-RPC methods are (documentation in Classifier::Bayes):
classify_file
classify_and_modify
get_buckets
get_bucket_word_count
get_bucket_word_list
get_word_count
get_bucket_unique_count
get_bucket_color
set_bucket_color
get_bucket_parameter
set_bucket_parameter
get_html_colored_message
create_bucket
delete_bucket
rename_bucket
add_message_to_bucket
remove_message_from_bucket
get_buckets_with_magnets
get_magnet_types_in_bucket
clear_bucket
clear_magnets
get_magnets
create_magnet
get_magnet_types
delete_magnet
get_stop_word_list
add_stopword
remove_stopword
Requests: none
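For readers who have not used XML-RPC before, the following Python sketch shows what a request for one of the methods in item 7 looks like on the wire. It only builds and re-parses the request body with Python's standard xmlrpc.client module and does not contact POPFile. The single "spam" argument is an assumption for illustration; the real argument lists are documented in Classifier::Bayes, and the address of the experimental XMLRPC.pm module is not given in this announcement.

```python
import xmlrpc.client


def build_call(method, *params):
    """Return the XML body of an XML-RPC request for the given method."""
    return xmlrpc.client.dumps(params, methodname=method)


# Hypothetical call: ask for the word count of a bucket named "spam".
payload = build_call("get_bucket_word_count", "spam")

# Parsing the body again recovers the method name and its parameters.
params, method = xmlrpc.client.loads(payload)
```

To actually send such a request you would point xmlrpc.client.ServerProxy at wherever your XMLRPC.pm module is listening.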
8. Numbers are now legal in bucket names.
Requests: 720681 (deaper)
9. The characters used around the classification in subject modification
can now be reconfigured with options -bayes_subject_mod_left and
-bayes_subject_mod_right and they default to [ and ]
Requests: 722837 (stanleyspanner)
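As a rough picture of item 9, here is a Python sketch of subject modification with configurable delimiters. Placing the marker in front of the subject, followed by a space, is an assumption made for this example; only the default [ and ] characters come from the text above.

```python
def modify_subject(subject, bucket, left="[", right="]"):
    """Prefix the subject with the bucket name wrapped in the delimiters."""
    return f"{left}{bucket}{right} {subject}"


print(modify_subject("Meeting at 3", "work"))            # [work] Meeting at 3
print(modify_subject("Meeting at 3", "work", "{", "}"))  # {work} Meeting at 3
```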
10. All command line parameters have changed name. The old names all work
correctly and are upgraded automatically:
Old Parameter             New Parameter
corpus                    bayes_corpus
unclassified_probability  bayes_unclassified_probability
piddir                    config_piddir
debug                     GLOBAL_debug
ecount                    GLOBAL_ecount
mcount                    GLOBAL_mcount
msgdir                    GLOBAL_msgdir
subject                   GLOBAL_subject
timeout                   GLOBAL_timeout
xpl                       GLOBAL_xpl
xtc                       GLOBAL_xtc
download_count            GLOBAL_download_count
logdir                    logger_logdir
localpop                  pop3_local
port                      pop3_port
sport                     pop3_secure_port
server                    pop3_secure_server
separator                 pop3_separator
toptoo                    pop3_toptoo
archive                   html_archive
archive_classes           html_archive_classes
archive_dir               html_archive_dir
history_days              html_history_days
language                  html_language
last_reset                html_last_reset
last_update_check         html_last_update_check
localui                   html_local
page_size                 html_page_size
password                  html_password
send_stats                html_send_stats
skin                      html_skin
test_language             html_test_language
update_check              html_update_check
Requests: none
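If you keep old parameter names in your own startup scripts, the automatic upgrade in item 10 amounts to a simple name lookup. The Python sketch below shows the idea with a handful of rows from the table; it is not POPFile's own upgrade code, and the function name is invented for this example.

```python
# A few rows from the table in item 10 (old name -> new name).
RENAMED = {
    "corpus": "bayes_corpus",
    "piddir": "config_piddir",
    "debug": "GLOBAL_debug",
    "logdir": "logger_logdir",
    "port": "pop3_port",
    "toptoo": "pop3_toptoo",
    "localui": "html_local",
    "skin": "html_skin",
}


def upgrade_args(args):
    """Replace old -name options with their new names; leave the rest alone."""
    upgraded = []
    for arg in args:
        if arg.startswith("-") and arg[1:] in RENAMED:
            upgraded.append("-" + RENAMED[arg[1:]])
        else:
            upgraded.append(arg)
    return upgraded
```

Arguments that already use a new name, and option values, pass through unchanged.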
11. There is now a Reclassify button at the top of the History page
as well as at the bottom.
Requests: 727834 (garowetz), 687138 (johnmccurdy)
12. If you do a search that returns no results there is now a search box
for you to search again.
Requests: 685327 (tomvoss)
13. The Windows installer now guides you in the creation of an initial set
of buckets.
Requests: 691348 (pkarlin)
14. Each time the history cache is reloaded we insert a marker line in the
UI. These markers delimit the times the user reloaded the history with new
messages and provide a useful reference point when using POPFile throughout
the day. Everything before the last marker is newly received.
Requests: none
15. POPFile now has 19 UI translations available in the following languages:
Bulgarian
Chinese (simplified)
Chinese (traditional)
Danish
Dutch
English
English (UK)
Finnish
French
German
Hungarian
Korean
Norwegian
Portuguese (Brazilian)
Russian
Slovak
Spanish
Swedish
Ukrainian (http://zope.net.ua/POPFile)
Requests: none
16. To help in the fight against spam a number of new pseudowords have been
added to track spams that load images from across the web and web bugs.
The complete list of pseudowords is now:
encoding:<mime encoding type>
header:<email header>
html:numericentity
html:td
html:imgremotesrc
html:imgwidth<pixels>
html:imgheight<pixels>
html:fontsize<size>
html:encodedurl
html:comment
html:authorization
trick:spaceout
trick:dottedwords
trick:invisibleink
Requests: none
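To give a feel for how the image pseudowords in item 16 could arise, here is a Python sketch that scans HTML for img tags and emits html:imgremotesrc, html:imgwidth<pixels> and html:imgheight<pixels>. This is not POPFile's parser (which is written in Perl); the exact conditions under which POPFile emits each pseudoword are assumptions here, and the class name is made up.

```python
from html.parser import HTMLParser


class ImgPseudowords(HTMLParser):
    """Collect image-related pseudowords from the img tags in a message."""

    def __init__(self):
        super().__init__()
        self.pseudowords = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        # Assumption: an image is "remote" if its source is a web URL.
        if src.lower().startswith(("http://", "https://")):
            self.pseudowords.append("html:imgremotesrc")
        for name in ("width", "height"):
            value = attributes.get(name)
            if value:
                self.pseudowords.append(f"html:img{name}{value}")


parser = ImgPseudowords()
parser.feed('<p>Hi</p><img src="http://example.com/bug.gif" width="1" height="1">')
print(parser.pseudowords)
# ['html:imgremotesrc', 'html:imgwidth1', 'html:imgheight1']
```

A 1x1 remote image like the one in the example is the classic web bug that these pseudowords are meant to expose to the classifier.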
17. Numerous modifications by Sam and me to the mail parser, MIME handling,
HTML parser, colorizer, and Base64 and Quoted-Printable decoders to improve
accuracy.
Requests: lost track
18. Windows users will discover that POPFile adds an icon to the system tray
with a popup menu (right click) that offers to take you to the POPFile UI
or to shut down POPFile. A double click on the icon navigates to the
POPFile UI.
Requests: lost track
19. Sorting of columns in the History can be set to ascending or descending
by clicking on the column header. There's a new indicator that tells you
which column is sorted and in what order.
Requests: none
20. Fixed security problem with the X-POPFile-Link functionality and
passwords.
Requests: none
21. Added new view of the last 10 items in the log on the configuration
page.
Requests: none
22. Added code to detect problem entries in the corpus when loading and
remove them from the in-memory copy. Currently only detects entries with
multiple spaces.
Requests: none
DONATIONS
Thank you to everyone who has clicked the Donate! button and donated their
hard-earned cash to me in support of POPFile. Thank you also to the people
who have contributed patches, feature requests, bug reports and
translations.
http://sourceforge.net/forum/forum.php?forum_id=213876
CONCLUSION
Keep the ideas and bug reports coming.
John.