#273 Problems moving messages between pst and Exchange

Status: open
Labels: Outlook (523)
Priority: 5
Updated: 2005-01-21
Created: 2003-08-29
Creator: Tony Meyer
Private: No

If a message's clues are viewed while it is on the Exchange
server and compared with those for the same message after it
is moved to a pst file, the clues are not the same. It appears
(I haven't examined closely yet; can do on request) that on
Exchange the html part of the message is used, and in the
pst it isn't.

Probably related to this is the problem that moving a
message back and forwards between Exchange and a
pst file (showing clues each time) results in an ever-
increasing number of tokens.

It doesn't appear to be the PR_SEARCH_KEY changing:

>>> key1 = "PR_SEARCH_KEY : '\n\x02\xde\xfd7\xf6\xa7A\x93\xfd\xf3\xb1\xfeA\x16\xf9'"
>>> key2 = "PR_SEARCH_KEY : '\n\x02\xde\xfd7\xf6\xa7A\x93\xfd\xf3\xb1\xfeA\x16\xf9'"
>>> key1 == key2
True

Next thing to try? :)

Discussion

  • Tony Meyer

    Tony Meyer - 2003-08-30

    dump_props on a message in a pst.

     
  • Tony Meyer

    Tony Meyer - 2003-08-30

    The dump_props are attached.

    If I just move the messages about, doing 'show clues', then
    no training takes place. I think my original comment was
    wrong: trying now, I get the same number of tokens no
    matter how many times I move the message (although the
    Exchange count and the pst count are different). Anyway, the
    log (at verbose=1) doesn't show anything apart from the
    "already trained as ham" message.

    If I train a message, the log doesn't show much more. pst first:
    """
    Training on message 'Re: comparing 2 images' - trained as spam
    Saving bayes database with 4637 spam and 410 good messages
    -> C:\Documents and Settings\tameyer.MASSEY\Application Data\SpamBayes\default_bayes_database.db
    -> C:\Documents and Settings\tameyer.MASSEY\Application Data\SpamBayes\default_message_database.db
    Saved databases in 896.138ms
    """
    and moving it back to Exchange:
    """
    Training on message 'Re: comparing 2 images' - trained as good
    Saving bayes database with 4636 spam and 411 good messages
    -> C:\Documents and Settings\tameyer.MASSEY\Application Data\SpamBayes\default_bayes_database.db
    -> C:\Documents and Settings\tameyer.MASSEY\Application Data\SpamBayes\default_message_database.db
    Saved databases in 850.026ms
    """

    Does this help?

     
  • Mark Hammond

    Mark Hammond - 2003-08-31

    The underlying bug seems to be
    https://sourceforge.net/tracker/index.php?func=detail&aid=798029&group_id=61702&atid=498103
    However, since it looks like we will be almost
    "hand-crafting" the HTML of the message, I will leave this
    open: we may still end up with bugs if the HTML we
    generate isn't identical (token-wise) to the MS one.

     
  • Tony Meyer

    Tony Meyer - 2003-09-06

    Are we going to be able to get identical token streams?
    Attached are two 'show clues' messages, for the same
    message, on a pst and on Exchange. 26 clues for one, and
    28 for the other. This is a plain text message.

    The extra two clues arise because Exchange converts the
    plain text message to HTML, so the words in the subject
    also appear in the body.
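
To pin down which clues differ between the two copies, the two token streams can be compared as multisets. The following is only a minimal sketch: `diff_tokens` is a hypothetical helper, not part of SpamBayes, and the token lists are made-up stand-ins for whatever the tokenizer emits for the pst copy and the Exchange copy of a message.

```python
from collections import Counter


def diff_tokens(pst_tokens, exchange_tokens):
    """Return (only_in_pst, only_in_exchange) as Counters of surplus tokens.

    Hypothetical helper for illustration; it is not a SpamBayes function.
    """
    pst = Counter(pst_tokens)
    exchange = Counter(exchange_tokens)
    # Counter subtraction keeps only positive counts, so each result holds
    # the tokens that one copy has more of than the other.
    return pst - exchange, exchange - pst


# Made-up token streams: the Exchange copy repeats the subject words in the
# body, which is the behaviour described above.
pst_tokens = ["subject:comparing", "subject:images", "hello", "thanks"]
exchange_tokens = pst_tokens + ["comparing", "images"]

only_pst, only_exchange = diff_tokens(pst_tokens, exchange_tokens)
print("only in pst:", sorted(only_pst))
print("only in Exchange:", sorted(only_exchange))
```

Using multisets rather than plain sets means a token that merely appears more often in one copy still shows up in the difference.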

     
  • Tony Meyer

    Tony Meyer - 2003-09-06

    Plain text message on Exchange

     
  • Tony Meyer

    Tony Meyer - 2003-09-06

    Same message in the pst (no training was done).

     
  • Tony Meyer

    Tony Meyer - 2005-01-21
    • assigned_to: mhammond --> anadelonbrin
     
