Hack to bulk add Windows Address Book entries

Extensions
Wes Cherry
2003-04-18
2003-11-30
  • Wes Cherry

    Wes Cherry - 2003-04-18

    Hey Popfilees

    I wrote up a quick and dirty C++ program to mass-add Windows Address Book entries as magnets in POPFile. Is anyone else interested in it? It's thrown together and doesn't have much error checking. If there's more interest I can clean it up for a general release.

    -Wes
    wesc <AT> technosis (DOT) com

     
    • William Jacoby

      William Jacoby - 2003-04-18

      You wouldn't happen to know Perl, would you? It would definitely be better integrated, and could possibly become part of POPFile by default, if it were in Perl.

       
      • Mark Craig

        Mark Craig - 2003-04-19

        No, I don't think it would, because IMHO you're both missing the point of POPFile again: if you use magnets for known-good messages, you risk denying the "good" corpus of words that might be needed to help properly classify other unknown-good messages.  My understanding is that magnets should primarily be used only for peculiar classes of messages (such as mailing lists and newsletters?) that might otherwise muddy the corpora.

        What wesc is trying to do is use magnets to add a whitelisting feature to POPFile, which is probably contrary to its proper functioning.  If you want a whitelisting service, subscribe to Mailblocks.com!

         
        • Wes Cherry

          Wes Cherry - 2003-04-19

          Hmm, well, for me, I want to special-case mail from my friends; they are the highest priority for me to read (I filter them to a bucket named wesc). I hate it when POPFile misclassifies a friend's email as spam/lopri or one of the other list classifications I use. Could POPFile be modified to still add magnetized emails to the corpus? Or, even better, why not add whitelisting as a POPFile feature?
          The code to traverse a Windows WAB is pretty simple (at least in C++; I only know a smidgen of Perl).

           
        • Jeff Russell

          Jeff Russell - 2003-05-09

          Maybe I'm wrong, but it seems that POPFile could analyze the words from magnetized emails either way. If the user /knows/ an email is good, then it should be up to POPFile to allow such a whitelist without an impact on performance.

           
        • Patrick Müller

          Patrick Müller - 2003-10-31

          >if you use magnets for known-good messages, you risk
          >denying the "good" corpus of words that might be
          >needed to help properly classify other unknown-good
          >messages.

          Yep, exactly.
          And that's why I think there should be a little button on each magnet saying that matching emails should still be examined and their words added to the corpus.

          What do you think?
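
The idea raised several times in this thread (a magnet decides the bucket, but the message's words still go into that bucket's corpus) can be sketched as a toy model. This is not POPFile's code: `ToyFilter`, `handle`, and the `learn_from_magnets` switch are hypothetical names, the word splitting is deliberately naive, and the statistical classifier that would handle non-magnet mail is left out.

```python
from collections import Counter, defaultdict


class ToyFilter:
    """Toy stand-in for a mail filter with magnets (hypothetical, not POPFile)."""

    def __init__(self):
        self.magnets = {}                    # sender address -> bucket name
        self.corpus = defaultdict(Counter)   # bucket name -> word counts

    def train(self, bucket, text):
        # Naive tokenizer: lower-case and split on whitespace.
        self.corpus[bucket].update(text.lower().split())

    def handle(self, sender, text, learn_from_magnets=True):
        """Return the magnet bucket for this sender, or None if no magnet matches.

        With learn_from_magnets=True, a magnet hit still feeds the message's
        words into the bucket's corpus, which is the behaviour proposed above.
        """
        bucket = self.magnets.get(sender.lower())
        if bucket is not None and learn_from_magnets:
            self.train(bucket, text)
        return bucket
```

With the switch on, a friend's mail is routed by the magnet and the "good" corpus still grows; with it off, the magnet routes the mail and the corpus is left untouched.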

           
          • Pedro Santelmo

            Pedro Santelmo - 2003-11-30

            Well, it has happened to me that I received spam emails with my own email address as the sender.
            If I had put my own address, or any other, on a whitelist with a magnet, that spam would not have been detected. That would make it all work wrong.
            But there is an alternative that you might look at here: http://www.geocities.com/helphand1/popfile.htm . It is called Clean Corpus and might slightly affect the words that decide that something is spam.
            It is a long way from a whitelist, so it looks like an alternative that would have to be adapted to your needs. But it has to work like this to keep POPFile working the way it was designed.
            There are other spam-filtering alternatives that do whitelisting, but by the sender's IP address. The email address is very easy to fake.

             
    • John Graham-Cumming

      POPFile does have a whitelist feature: that's one of the uses of magnets (as you have discovered).

      John.

       
    • Anonymous

      Anonymous - 2003-06-23

      Importing address books is a great idea.  What is really needed is a tool for analyzing old mailboxes to write the whitelist.

      Data mining the addresses in email headers can show the people that you have an email dialog with.  These first-degree people, and all the other people that they also email at the same time via the CC, BCC, and TO headers, would be whitelisted as first- and second-degree people.  These email addresses can be used to quickly identify real emails.  All people that you have sent email to would also be whitelisted.

      With a separate file of known junk email, distributed with POPFile, and these identified real people's emails, one could train POPFile's Bayesian filter.

      Email addresses that you only receive emails from would have to be evaluated as spam or real emails.  Many relationships that you want or need email from can be whitelisted as well.

      It would be nice to populate these people into a whitelist, and into one's address book.
      I have been dabbling with perl code to contemplate such a process.
      How do I output POPFile magnet rules from such a datamining tool?

      -Nathaniel
      npendleton (at mark here} pendleton press {dotthingy] com.
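
The header-mining idea in this post can be sketched roughly as follows, using only Python's standard `email` package. The function names and the exact first-degree/second-degree rules are one reading of the description above, not an existing tool: first degree means anyone you have sent mail to, second degree means anyone a first-degree sender addressed alongside you, and senders you have only ever received mail from are returned separately for manual review.

```python
from email import message_from_string
from email.utils import getaddresses


def addresses(msg, *headers):
    """Collect the lower-cased addresses found in the named headers."""
    values = []
    for header in headers:
        values.extend(msg.get_all(header, []))
    return {addr.lower() for _name, addr in getaddresses(values) if addr}


def mine_whitelist(raw_messages, me):
    """Return (first_degree, second_degree, unknown) sets of addresses."""
    me = me.lower()
    first = set()      # people I have sent mail to
    received = []      # (sender, other recipients) for mail sent by others
    for raw in raw_messages:
        msg = message_from_string(raw)
        senders = addresses(msg, "From")
        recipients = addresses(msg, "To", "Cc", "Bcc") - {me}
        if me in senders:
            first |= recipients
        else:
            received.extend((sender, recipients) for sender in senders)

    second, unknown = set(), set()
    for sender, others in received:
        if sender in first:
            second |= others       # co-recipients of a first-degree contact
        else:
            unknown.add(sender)    # receive-only: needs evaluating
    second -= first
    unknown -= second
    return first, second, unknown
```

As noted elsewhere in the thread, a From address is easy to forge, so a list built this way is only as trustworthy as the headers it was mined from.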

       
      • Scott Leighton

        Scott Leighton - 2003-06-23

        >I have been dabbling with perl code to contemplate such a process.
        >How do I output POPFile magnet rules from such a datamining tool?

          The API has a create_magnet function in Classifier::Bayes. You pass it the bucket, type of magnet, and text of magnet to create the magnet. Should be a piece of cake with a little Perl ;)

          Scott
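
As a rough illustration of the three arguments described above (bucket, magnet type, magnet text), here is a small sketch that turns a list of addresses into argument triples. It only prepares the data; the real call would be made from Perl against Classifier::Bayes. The helper name `magnet_requests` is hypothetical, and the default magnet type `"from"` is an assumption about POPFile's magnet types.

```python
def magnet_requests(addresses, bucket, magnet_type="from"):
    """Build (bucket, magnet type, magnet text) triples, one per unique address.

    Addresses are trimmed and lower-cased, blanks are skipped, and the
    first-seen order is kept so the output is stable.
    """
    seen = set()
    requests = []
    for address in addresses:
        address = address.strip().lower()
        if address and address not in seen:
            seen.add(address)
            requests.append((bucket, magnet_type, address))
    return requests
```

Each triple corresponds to one magnet to create; the list could just as well be written to a file and read back by a short Perl script.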

         
