
Minimizing whitelist growth

Help
Ward Clark
2007-02-28
2013-03-22
  • Ward Clark

    Ward Clark - 2007-02-28

    I've been running JunkMatcher 1.6.1 since August 2006, and I finally hit the 200-line limit on the whitelist, which caused Python to consume all available CPU cycles.

    As recommended, I manually trimmed my whitelist back to about 140 lines and added numerous email addresses to my Address Book by hand.  While I was performing this tedious task, it occurred to me that I might benefit from changing my Junk folder scanning process.

    Until now, I've been scanning my Junk folder, dragging legitimate messages into my Inbox, and reading them there.  I delete a good percentage of these messages after reading them.  As I understand it, dragging to my Inbox adds to JunkMatcher's whitelist.

    My hope is to find a technique that minimizes additions to the whitelist.  For example, while I'm scanning my Junk folder, I could open a legitimate message, add "sender" to my Address Book, and delete the message.  My hope is that this would (1) not touch the whitelist, and (2) cause future messages from "sender" to show up in my Inbox.

    I realize I could experiment to find out whether this technique works.  However, most of my falsely classified messages are incidental ones from senders who won't send another message for weeks or months, so an experiment would take a long time to show results.

    Insight from experienced JunkMatcher users will be appreciated.

     
    • WitLi

      WitLi - 2007-08-15

      are you aware that you can use brackets () and pipes | to get an "or" in a regex?

      so, if your white list is full, you COULD compact it tenfold or more by putting addresses in brackets:

      before:
      Tina@Tina.com
      Petra@Petra.com
      Peter@Peter.com

      now:
      (Tina@Tina.com|Petra@Petra.com|Peter@Peter.com)

      (yeah, some of those characters still have to be escaped, e.g. each dot written as \. so it only matches a literal dot)

      have fun!

      WitLi
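
A minimal Python sketch of the compaction described above, in case it helps to generate the combined line rather than type it by hand. The function name `compact_whitelist` is hypothetical and not part of JunkMatcher; the sketch assumes the whitelist accepts Python-style regular expressions and that a leading `(?i)` for case-insensitive matching is acceptable there. `re.escape` takes care of the dots.

```python
import re

def compact_whitelist(addresses):
    """Join several addresses into one case-insensitive alternation pattern.

    Hypothetical helper: JunkMatcher does not ship this function.
    """
    # Escape regex metacharacters (notably '.') in every address.
    escaped = [re.escape(address) for address in addresses]
    # A non-capturing group keeps the alternation in one whitelist line.
    return "(?i)(?:" + "|".join(escaped) + ")"

pattern = compact_whitelist(["Tina@Tina.com", "Petra@Petra.com", "Peter@Peter.com"])
print(pattern)
```

Paste the printed pattern into the whitelist as a single entry. How many addresses fit on one line depends on JunkMatcher's maximum expression length, which is not stated in this thread.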

       
    • Ross Barkman

      Ross Barkman - 2008-02-22

      I've set up groups, like so:

      A.com: (?i).+@(?:replies\.admiral|airberlin|airberlinmail|allume|apani|asiarooms)\.com
      B.com: (?i).+@(?:noreply\.bebo|contact\.britishairways|comms\.bt)\.com
      C.com: (?i).+@(?:cclondon|cd\-wow|computerweekly|cvent|cw)\.com

      That keeps the expression length shorter - there is a maximum length (can't remember what it is), so not repeating ".com" helps.
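
The grouping above could be generated along these lines. This is only a sketch under assumptions: `group_by_suffix` is a hypothetical helper, the input is a plain list of sender domains, and grouping is done on the last dot-separated label (so "com", "org", and so on) rather than alphabetically as in the A/B/C groups shown. It does not check the result against JunkMatcher's maximum expression length.

```python
import re
from collections import defaultdict

def group_by_suffix(domains):
    """Build one pattern per top-level suffix, so ".com" is written only once.

    Hypothetical helper; each domain is expected to contain at least one dot.
    """
    groups = defaultdict(list)
    for domain in domains:
        # "replies.admiral.com" -> name "replies.admiral", suffix "com"
        name, _, suffix = domain.rpartition(".")
        groups[suffix].append(re.escape(name))
    return {
        suffix: r"(?i).+@(?:" + "|".join(sorted(names)) + r")\." + re.escape(suffix)
        for suffix, names in groups.items()
    }

patterns = group_by_suffix(
    ["replies.admiral.com", "airberlin.com", "cd-wow.com", "example.org"]
)
for suffix, pattern in sorted(patterns.items()):
    print(suffix, pattern)
```

If a single suffix still produces a line that is too long, the name list can be split into several lines by first letter, as in the A.com / B.com / C.com entries.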

       
