[fetchmail-devel] Tracking pending POP3 deletes to solve some problems

Dear Sunil, dear list readers,

There seems to be a general issue with fetchmail around fetching large  
messages, spam filtering, timeouts and thereabouts.

Some three years ago I made a foray into tracking pending DELEtes that we
could no longer issue or that were reverted by POP3 servers after socket
errors (often timeouts or network failures; the cause was not too clear to
me at the time). For reference, see

  - https://lists.berlios.de/pipermail/fetchmail-users/2006-May/000409.html

and followups. At the time, we discussed several concerns around this  
issue and I eventually shelved a solution for lack of ideas and time.

The issue was originally reported by Frédéric Marchal around that time
(2005 or 2006, I didn't check).

However, I have had at least two plausible reports (perhaps more that I  
don't currently recall or connect) since. In these reports, the local  
delivery took so long that the server disconnected fetchmail before it  
could even send the DELE, or fetchmail disconnected itself if the SMTP  
server took too long to actually accept a message.

One newer issue was filed to the Berlios bug tracker as #10972 by Viktor  
Binzberger in 2007.

The other was reported to the fetchmail-users@ list last Friday  
(2009-07-17) by Tony Morehen; I have received a full log off-list.

While Viktor's problem is worked around in fetchmail 6.3.10 through  
extending the SMTP client timeouts to the minimum RFC 5321 (SMTP,  
obsoleting RFC 2821 and RFC 821) recommended periods, this isn't going to  
help if the POP3 server drops the connection as early as in Tony's case -  
plus he's using an MDA anyways.

It seems the problem has two faces:

1. Timeout on the server we're fetching from, because the SMTP server or  
MDA takes too long.

2. Tracking what messages were successfully fetched and delivered, and  
which subset of these were flushed.

Generally, I see several approaches to 1:

a. queue downloaded messages before handing them off for delivery. This  
avoids timeouts that originate in the SMTP/LMTP server or MDA that  
fetchmail forwards to.

b. Alternatively, we could try making fetchmail multithreaded and keeping
the POP3 server happy by spamming it with NOOP. I'm not sure how well this
works, how many POP3 servers implement NOOP, or how many NOOPs in a row
they tolerate. Given fetchmail's design, this is very intrusive and amounts
to a rewrite of several major parts. It would have other benefits, but
it's a major effort.

c. Alternatively, we could try to reconnect after loss of connection -
however, we may lose prior DELE commands when we don't send QUIT, so
again, we need to bundle DELE requests at the end or for a separate
transaction.  Given that many sites (including hotmail, where Tony had his
problem) limit the number of logins per unit of time, often to once per 15
minutes, we can't preventively send QUIT after each message without
locking ourselves out. Either way, this solution means we would have to
solve 2.
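
To make approach a more concrete, here is a minimal sketch in Python (not
fetchmail's C code) of a spool that stores each downloaded message durably
before any delivery attempt, so a slow SMTP server or MDA can no longer
stall the POP3 session. The function names, the spool directory layout and
the "msg."/"tmp." naming scheme are assumptions made up for illustration.

```python
import os
import tempfile


def spool_message(spool_dir, uid, raw):
    """Durably store one downloaded message and return its final path.

    The message is written under a temporary name, fsync'ed, and only then
    renamed, so a crash never leaves a half-written message in the queue.
    """
    os.makedirs(spool_dir, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=spool_dir, prefix="tmp.")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
            f.flush()
            os.fsync(f.fileno())
        # Hypothetical naming scheme: the hex-encoded UID keeps the file
        # name safe regardless of which characters the server uses.
        final_path = os.path.join(spool_dir, "msg." + uid.encode().hex())
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path


def pending_messages(spool_dir):
    """Return the UIDs of spooled messages still awaiting local delivery."""
    uids = []
    for name in sorted(os.listdir(spool_dir)):
        if name.startswith("msg."):
            uids.append(bytes.fromhex(name[4:]).decode())
    return uids
```

A separate delivery pass would then walk pending_messages(), hand each file
to the MTA or MDA at its own pace, and remove the file once delivery
succeeded.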

Another workaround is making sure that your MTA or MDA accepts messages  
quickly (a few seconds at most, as common in Postfix unless set to  
non-default pre-queue content inspection) and uses the spam filter  
off-line after accepting the message. I understand that it's not always  
the best solution, but since the POP3 or IMAP server has accepted the  
message already, we seriously can only either forward or drop it anyways -  
bouncing is dangerous, since the envelope sender might be forged; I think  
messages can only sensibly be rejected before being accepted by your  
respective ISP through SMTP.

We could add a trivial queueing agent separately and link it through the  
MDA option for early experiments (sort of nullmailer for inbound  
messages), but

Fixing 2 is sort of a prerequisite for solving 1 via approach a or c - we
need to track more state. This does entail changing the .fetchids format as
discussed in 2006, but the UID parser appeared very tolerant even at that  
time, so that an extension would be possible and backwards compatible. I  
would feel more comfortable checking that again, but I think I checked  
thoroughly in 2006 already. Even if we must change the .fetchids  
format/layout, I'm open to it.
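
To illustrate the kind of extra state I mean, here is a rough Python
sketch. The file layout (one "STATE UID" line per message) and the three
state names are purely hypothetical and are NOT the existing .fetchids
format; the point is only that each UID carries a status telling us whether
a delete is still owed.

```python
import os
import tempfile

# Hypothetical per-message states, in the order a message passes through them.
FETCHED, DELIVERED, FLUSHED = "fetched", "delivered", "flushed"


def load_state(path):
    """Read 'STATE UID' lines into a dict mapping UID -> state."""
    state = {}
    if not os.path.exists(path):
        return state
    with open(path, encoding="ascii") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            status, uid = line.split(" ", 1)
            state[uid] = status
    return state


def save_state(path, state):
    """Rewrite the state file atomically so a crash keeps the old copy."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".state.")
    with os.fdopen(fd, "w", encoding="ascii") as f:
        for uid in sorted(state):
            f.write(f"{state[uid]} {uid}\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


def pending_deletes(state):
    """UIDs that were delivered locally but not yet confirmed as flushed."""
    return sorted(uid for uid, status in state.items() if status == DELIVERED)
```

After a lost connection, pending_deletes() yields exactly the messages for
which a DELE still has to be issued (or re-issued) in the next session.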

Functionally, we'd probably need to bundle DELEs into a bulk operation of  
"DELE n1 DELE n2 DELE n3 ... DELE nm QUIT" so that we have a reasonable  
chance that the server isn't going away from boredom between the first  
DELE and the QUIT, and we have more chances to avoid UID reassignment and  
"delete wrong message" issues that happen in the race Sunil described,
i.e. if the network dies after the server has executed QUIT but before
fetchmail sees the +OK response.
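
A small Python sketch of how such a bundle could be planned, under the
assumption that we re-read the server's UIDL listing at the start of the
session and match pending deletes by UID rather than by message number.
The function name and its inputs are made up for illustration; whether the
commands can actually be sent back to back depends on the server.

```python
def plan_bulk_delete(uidl_lines, pending_uids):
    """Return the POP3 commands to send in one burst, ending with QUIT.

    uidl_lines:   the server's current UIDL listing, one "msgnum uid"
                  string per message.
    pending_uids: UIDs that were delivered locally but whose delete has
                  not been confirmed yet.
    """
    # Map each UID the server currently reports to its message number.
    current = {}
    for line in uidl_lines:
        num, uid = line.split(None, 1)
        current[uid.strip()] = int(num)

    # Only delete messages whose UID is still present; a UID that vanished
    # or was renumbered is skipped instead of deleting the wrong message.
    numbers = sorted(current[uid] for uid in pending_uids if uid in current)

    commands = [f"DELE {num}" for num in numbers]
    commands.append("QUIT")
    return commands
```

For example, with a listing of "1 aaa", "2 bbb", "3 ccc" and pending UIDs
ccc, aaa and zzz, the plan is DELE 1, DELE 3, QUIT; the unknown UID zzz is
silently dropped.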

Anybody care to share their thoughts?  I'm looking forward to them.

Best regards and thanks for your time.

-- 
Matthias Andree
