From: Matthias A. <mat...@gm...> - 2021-01-29 00:04:26
On 28.01.21 at 09:00, Julian Bane via Fetchmail-users wrote:
> I have exactly the same results on fetchmail 6.4.15 - openSUSE
> Tumbleweed current release which reports:
>
> "This is fetchmail release
> 6.4.15+POP2+GSS+RPA+NTLM+SDPS+SSL-SSLv2-SSLv3+OPIE+NLS+KRB5."
>
>
> On 27/01/2021 13:41, Julian Bane via Fetchmail-users wrote:
>> I have been running a test using fetchmail (6.3.26) on an openSUSE
>> Leap 15.2 server to retrieve email from a local email account. That
>> account has been sent 5 separate emails. Three of the emails are to
>> single addresses, one is to two addresses and another is to three
>> addresses. As expected there are 8 separate emails in the local mail
>> spool file from the 5 I sent. As a test I am using the fetchmailrc file:
>>
>> set postmaster root
>> poll cumulus protocol POP3 local mydomain.net envelope "X-Original-To:"
>> username "user" password "pass" to * here fetchall keep
>> mda "echo 'Hello World - %T - %F'"
>>
>> The mda is just a dummy to see what is going on when I run it from
>> the command line.
>>
>> It retrieves all 8 emails, correctly identifying the original
>> recipient, and %T is substituted correctly for the different email
>> addresses.
>> Doing "fetchmail -V -f fetchmailrc" confirms that it is running in
>> multidrop mode.
>>
>> If I substitute 'myname' for the '*' (asterisk) in fetchmailrc,
>> fetchmail goes into singledrop mode as you would expect (checked with
>> the -V option) but also retrieves all 8 emails.
>>
>> I am not sure which way round it should be, multidrop or singledrop,
>> but I do want to remove duplicates and so want to retrieve just 5
>> emails from the spool file (have been reading "THE USE AND ABUSE OF
>> MULTIDROP MAILBOXES" on the man page). I have checked that the
>> message-IDs are the same and there are multiple recipients on the
>> 'To:' header of the relevant emails. I also have a copy of the spool
>> file if it would help.
>>
>> Any advice or insight would be most welcome.

Greetings,

So after reviewing the 6.4.15 logs that Julian sent to me off-list, some remarks:

* The server-side mailbox in the test scenario contains messages with the Message-IDs shown below, meaning that the duplication already happened on the server side:

  Message-ID: <16116718
  Message-ID: <6ec00e74
  Message-ID: <bbe93702
  Message-ID: <f1711007
  Message-ID: <f1711007
  Message-ID: <f1711007
  Message-ID: <f30f6378
  Message-ID: <7c6c3256
  Message-ID: <7c6c3256

* fetchmail works as I believe it should: the server's mailbox in Julian's test scenario contains eight messages, and fetchmail obtains and delivers eight. So fetchmail is not duplicating messages. Neither is it deduplicating, see next:

* fetchmail has some deduplication code, but it is for a very specific and historic purpose: deduplication is only supposed to kick in when the most dangerous mode, "To:/Cc:" guessing, is being applied for lack of a header that carries a copy of the envelope's (SMTP dialogue's) destination address (Julian, by contrast, is using X-Original-To). The dedup'ing code compares the ENTIRE header (by way of calculating its MD5 hash value) in a run of messages and suppresses all copies but the first. This code, however, is not supposed to work and also technically cannot work in Julian's scenario: the headers of the server-side messages are *not exactly* the same, because they differ in the X-Original-To: headers.

  Enhancing this deduplication code would mean changing behaviour and hence should not happen before fetchmail 7.0.0. One idea is to filter out a certain list of headers (among them the headers that record the envelope's destination address) and only then hash what is left of the header. But given the proposal below, I'd say we don't actually need to change fetchmail:

* Possible solution: install the standalone "maildrop" package from the Courier mailserver suite and use it, or a wrapper script for it, as mda.
  Maildrop ships with a "reformail" tool that can track the Message-IDs of seen messages in a database, and its exit status can be used to suppress these duplicates by Message-ID (its option -D). The maildropex(7) manual page contains an example, just look for the first occurrence of:

  reformail -D

  https://www.courier-mta.org/maildrop/maildrop.html

Note: I strongly discourage using procmail. It has been unmaintained for roughly two decades and is inherently unsafe: all error handling and locking needs to be put into your recipes explicitly to achieve deterministic behavior in adverse conditions (such as under load). maildrop is safer to use, its .mailfilter files are easier to understand and write, and it is well-behaved by default, unlike procmail.

HTH

Happy fetching & deduplicating,
Matthias
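
PS: Purely as an illustration of the "filter out some headers, then hash the rest" idea mentioned above, here is a minimal Python sketch. It is not fetchmail code and not how fetchmail is implemented. The function names and the list of headers to ignore are assumptions made up for this example, and unlike fetchmail's run-based comparison it simply remembers every digest seen within one batch.

```python
import hashlib
from email import message_from_string

# Assumption: headers that record the envelope recipient or other
# per-delivery details. The real list would need careful thought.
ENVELOPE_HEADERS = {"x-original-to", "delivered-to", "return-path", "received"}


def header_digest(raw_message: str) -> str:
    """Return an MD5 hex digest of the header with envelope headers removed."""
    msg = message_from_string(raw_message)
    digest = hashlib.md5()
    for name, value in msg.items():
        if name.lower() in ENVELOPE_HEADERS:
            continue  # skip headers that differ between copies of one message
        digest.update(f"{name.lower()}:{value}\n".encode("utf-8", "replace"))
    return digest.hexdigest()


def drop_duplicates(raw_messages):
    """Keep the first copy of each message, judged by the filtered header digest."""
    seen = set()
    unique = []
    for raw in raw_messages:
        key = header_digest(raw)
        if key in seen:
            continue  # a copy with the same filtered header was already kept
        seen.add(key)
        unique.append(raw)
    return unique
```

With this filtering, two copies that differ only in X-Original-To: produce the same digest, which is exactly what hashing the ENTIRE header cannot achieve. Note that dropping copies this way also discards the per-recipient envelope information, so it only makes sense when every copy ends up in the same place anyway.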