Based on private communication with Bill Shannon
(creator of JavaMail), I've come up with a different
strategy to incrementally crawl an IMAP folder. This
strategy needs to be implemented:
- Retrieve all non-deleted messages using
Folder.search(new FlagTerm(Flags.Flag.DELETED, false)).
This is preferable to checking the deleted flag of
each mail individually as prefetching these flags takes
considerable time and using a search lets the *server*
execute this selection. Folder itself uses a naive,
client-side approach but that is overridden by a
server-based approach in IMAPFolder.
- Determine the subset of non-expunged messages (no
need to prefetch anything).
- Pre-fetch message UIDs for this set.
- When the UIDValidity of the folder has changed or was
not registered (= initial crawl), we need to crawl all
messages, else we try to incrementally crawl it:
- See if the set of retrieved message UIDs is equal to
the set of stored message UIDs (can be done by
calculating message URIs and looking them up in
AccessData). Also see if the set of subfolders is the same.
- Only report a new FolderDataObject when the set of
messages and/or subfolders has changed.
- When the set of stored message UIDs is different,
process every message and see if it needs to be
reported as a new Message or not. Report all stored
message UIDs that are no longer part of this set as
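
The decision logic in the steps above could be sketched
roughly as follows. This is only a minimal illustration
in plain Java, not the actual crawler code: CrawlPlanner,
Plan and plan() are hypothetical names, and the UIDs and
subfolder names are passed in as plain sets instead of
being read from JavaMail or AccessData. In a real crawler
the current UIDs would presumably come from
UIDFolder.getUID() after a fetch with
UIDFolder.FetchProfileItem.UID, and the validity value
from UIDFolder.getUIDValidity(). Reporting the folder on
a full crawl is also an assumption.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: decides what a folder crawl has to do. */
final class CrawlPlanner {

    /** Result of comparing the server state with the stored state. */
    static final class Plan {
        final boolean fullCrawl;              // UIDValidity changed or not registered
        final boolean reportFolder;           // message and/or subfolder set changed
        final Set<Long> newUids;              // UIDs not known from the previous crawl
        final Set<Long> uidsNoLongerPresent;  // stored UIDs missing from the folder

        Plan(boolean fullCrawl, boolean reportFolder,
             Set<Long> newUids, Set<Long> uidsNoLongerPresent) {
            this.fullCrawl = fullCrawl;
            this.reportFolder = reportFolder;
            this.newUids = newUids;
            this.uidsNoLongerPresent = uidsNoLongerPresent;
        }
    }

    /** storedValidity is null when no UIDValidity was registered yet. */
    static Plan plan(Long storedValidity, long currentValidity,
                     Set<Long> storedUids, Set<Long> currentUids,
                     Set<String> storedSubfolders,
                     Set<String> currentSubfolders) {
        boolean fullCrawl = storedValidity == null
                || storedValidity.longValue() != currentValidity;
        if (fullCrawl) {
            // Stored UIDs cannot be trusted, so every message is crawled.
            // What happens to the old entries is outside this sketch.
            return new Plan(true, true, new TreeSet<>(currentUids),
                    Collections.<Long>emptySet());
        }

        boolean messagesChanged = !currentUids.equals(storedUids);
        boolean subfoldersChanged =
                !currentSubfolders.equals(storedSubfolders);

        Set<Long> added = new TreeSet<>();
        Set<Long> gone = new TreeSet<>();
        if (messagesChanged) {
            // Every message is checked; only unknown UIDs count as new.
            added.addAll(currentUids);
            added.removeAll(storedUids);
            // Stored UIDs that the server no longer returns.
            gone.addAll(storedUids);
            gone.removeAll(currentUids);
        }
        return new Plan(false, messagesChanged || subfoldersChanged,
                added, gone);
    }
}
```

With stored UIDs {1, 2, 3} and current UIDs {2, 3, 4}
under an unchanged UIDValidity, this sketch yields
newUids = {4} and uidsNoLongerPresent = {1}, and the
folder is flagged for reporting. If both sets and the
subfolders are identical, nothing is reported.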