Re: [fetchmail-devel] Tracking pending POP3 deletes to solve some problems

Dear Matthias,

Quoting from Matthias Andree's mail on Wed, Jul 22, 2009:
> There seems to be a general issue with fetchmail around fetching large  
> messages, spam filtering, timeouts and thereabouts.
> 
> 
> Some three years ago I made a foray into tracking pending DELEtes that we  
> could no longer issue or that were reverted by POP3 servers through socket  
> errors (often timeouts or network failures, not too clear to me at the  
> time), for reference, see
> 
>   - https://lists.berlios.de/pipermail/fetchmail-users/2006-May/000409.html
> 
> and followups. At the time, we discussed several concerns around this  
> issue and I eventually shelved a solution for lack of ideas and time.

...

> It seems the problem has two faces:
> 
> 1. Timeout on the server we're fetching from, because the SMTP server or  
> MDA takes too long.

There are actually two points at which the delay could be
occurring.

====================================================================
Case 1: After fetchmail has sent the initial MAIL FROM: to the SMTP
server/invoked the MDA with %F substitution:

I am assuming here that the SMTP server/MDA is doing a DNS
lookup/basic spam testing on the sender address. 

So, in the SMTP server case, fetchmail is waiting for a response from
the server after sending the initial MAIL FROM: while the SMTP server
is doing the DNS lookup/basic spam testing.

In the MDA case, fetchmail goes ahead with popen() and starts writing
the body to the MDA. As the MDA is still doing the DNS lookup/basic
spam testing and has not yet started reading the body, the pipe buffer
will get full and fetchmail will get blocked on an fwrite() later.

While fetchmail is blocked, the POP3 mailserver is waiting for
fetchmail to read the entire mail body.

Note that the write() timeout for the POP3 mailserver may be shorter
than the read() timeout. This means that the POP3 mailserver is likely
to time out sooner if it finds that the body is not getting drained
within a reasonable amount of time.
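
To illustrate the MDA half of Case 1, here is a minimal sketch (not
fetchmail code) of what happens when a writer keeps feeding a pipe
that nobody is reading yet. It assumes a Unix-like system; the
function name and chunk size are made up for the illustration. The
write end is put in non-blocking mode only so that the point where a
blocking fwrite() would stall becomes visible as an exception.

```python
import os


def fill_pipe_without_reader(chunk_size=4096):
    """Write to a pipe that nobody reads until the kernel buffer is full.

    Returns the number of bytes accepted before a write would block.
    """
    read_fd, write_fd = os.pipe()
    # Non-blocking mode: report "would block" instead of hanging forever.
    os.set_blocking(write_fd, False)
    written = 0
    chunk = b"x" * chunk_size
    try:
        while True:
            try:
                written += os.write(write_fd, chunk)
            except BlockingIOError:
                # A blocking writer (fetchmail in fwrite()) would stall here
                # until the reader (the MDA) starts draining the pipe.
                break
    finally:
        os.close(write_fd)
        os.close(read_fd)
    return written


if __name__ == "__main__":
    print("pipe accepted", fill_pipe_without_reader(), "bytes before blocking")
```

The number printed is the pipe capacity of the system (typically 64 KiB
on Linux). Any mail larger than that stalls the writer until the MDA
starts reading, and during that stall nothing is read from the POP3
connection either.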
====================================================================
Case 2: After fetchmail has sent the entire mail to the SMTP server/MDA:

Here, the remote mailserver is waiting for a command from fetchmail,
while fetchmail is waiting for a response from the SMTP server/exit
code from the MDA.

As mentioned above, the read() timeout may be longer for the POP3
mailserver and so it may not mind waiting for the next command.
====================================================================

Of course, a combination of the above two cases is also possible.

> Generally, I see several approaches to 1:
> 
> a. queue downloaded messages before handing them off for delivery. This  
> avoids timeouts that originate in the SMTP/LMTP server or MDA that  
> fetchmail forwards to.

This should work. Of course, fetchmail will have to work entirely with
UIDs as it will have to reconnect later and mark delivered mails for
deletion.
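
A minimal sketch of why UIDs are needed for this, assuming the
delivered UIDs have been recorded somewhere between sessions. The
function name and data layout are hypothetical and not fetchmail's.
After reconnecting, message numbers may have changed, so the numbers
to delete are recomputed from a fresh UIDL listing.

```python
def plan_deletes(delivered_uids, uidl_listing):
    """Map delivered UIDs to message numbers valid in the current session.

    delivered_uids: set of UIDs whose delivery succeeded earlier.
    uidl_listing:   iterable of (message_number, uid) pairs obtained from
                    UIDL after reconnecting.
    """
    current = {uid: num for num, uid in uidl_listing}
    # UIDs that have vanished from the server in the meantime are skipped.
    return sorted(current[uid] for uid in delivered_uids if uid in current)


if __name__ == "__main__":
    delivered = {"uidA", "uidC"}
    listing = [(1, "uidB"), (2, "uidC"), (3, "uidD")]
    print(plan_deletes(delivered, listing))
```

In this example only message 2 is selected: "uidA" is no longer on
the server, and "uidC" is now message number 2 regardless of the
number it had in the earlier session.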

> b. Alternatively, we could try making fetchmail multithreaded and keeping  
> the POP3 server happy by spamming it with NOOP. I'm not sure how good this  
> works, how many POP3 servers implement NOOP, how many NOOP in sequence  
> they tolerate. Given fetchmail's design, it's very intrusive and amounts  
> to a rewrite of several major parts. It would have other benefits, but  
> it's a major effort.

This will not work in Case 1. There, the POP3 mailserver is obviously
in no mood for NOOPs and may even treat them as a protocol error if it
receives a command before it has finished sending the complete body.

> c. Alternatively, we could try to reconnect after loss of connection -  
> however, we may lose prior DELE commands when we don't send QUIT, so  
> again, we need to bundle DELE requests at the end or for a separate  
> transaction.  Given that many sites (including hotmail, where Tony had his  
> problem) limit the number of logins per unit of time, often to once per 15  
> minutes, we can't preventively send QUIT so as not to lock ourselves out.  
> Anyways, the solution means we would do 2.

In Case 1 above, the POP3 mailserver may have timed out before sending
the entire body. If this is happening repeatedly, fetchmail will
always fail on the same mail.

> Fixing 2 is sort of a requisite for solving 1 in way a or c - we need to  
> track more state. This does entail changing the .fetchids format as  
> discussed in 2006, but the UID parser appeared very tolerant even at that  
> time, so that an extension would be possible and backwards compatible. I  
> would feel more comfortable checking that again, but I think I checked  
> thoroughly in 2006 already. Even if we must change the .fetchids  
> format/layout, I'm open to it.

Well, changing the .fetchids format is a must anyway. If you can
incorporate the UID parser, that will be great. If I remember correctly,
the UID parser also had an option to mark bad mails. This would be
used in cases where delivery repeatedly fails on the same mail. Once a
certain bad count is reached, fetchmail would stop attempting to
download the mail.
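
The bad-count idea could look roughly like the following sketch. The
threshold, the function names and the in-memory dictionary are all
assumptions made for illustration; nothing here reflects the actual
.fetchids layout or the 2006 parser.

```python
MAX_BAD_COUNT = 3  # hypothetical threshold, would be configurable


def record_failure(bad_counts, uid):
    """Increment and return the failure count for one UID."""
    bad_counts[uid] = bad_counts.get(uid, 0) + 1
    return bad_counts[uid]


def should_fetch(bad_counts, uid, limit=MAX_BAD_COUNT):
    """Return False once a UID has failed 'limit' times."""
    return bad_counts.get(uid, 0) < limit


if __name__ == "__main__":
    counts = {}
    for attempt in range(1, 5):
        if not should_fetch(counts, "uidX"):
            print("attempt", attempt, ": skipping uidX")
            break
        record_failure(counts, "uidX")  # pretend delivery failed again
```

With the threshold of 3 used here, the fourth attempt is skipped, so
one mail that always fails no longer blocks the rest of the mailbox.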

> Functionally, we'd probably need to bundle DELEs into a bulk operation of  
> "DELE n1 DELE n2 DELE n3 ... DELE nm QUIT" so that we have a reasonable  
> chance that the server isn't going away from boredom between the first  
> DELE and the QUIT, and we have more chances to avoid UID reassignment and  
> "delete wrong message" issues that happen in the race Sunil described, i.  
> e. if the network dies if the server executes QUIT but fetchmail doesn't  
> see the +OK response.

This should be possible.
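
As a rough sketch of the bundling, assuming the message numbers to
delete are already known for the current session, the whole batch can
be built as one buffer and written in a single send. The function
name is hypothetical. Sending commands without waiting for each
response is pipelining, which not every POP3 server advertises, so
this is an illustration rather than something that is safe everywhere.

```python
def build_bulk_delete(message_numbers):
    """Return the bytes for 'DELE n1 ... DELE nm QUIT' as one buffer."""
    lines = [f"DELE {n}" for n in message_numbers]
    lines.append("QUIT")
    # POP3 commands are terminated by CRLF.
    return ("\r\n".join(lines) + "\r\n").encode("ascii")


if __name__ == "__main__":
    print(build_bulk_delete([2, 5]))
```

The caller still has to read one response per command afterwards
(m responses for the DELEs plus one for QUIT) to find out whether the
deletes were committed.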

I have not gone through the cases you have mentioned yet, but it would
be better to categorize them as Case 1 or Case 2 (or both!) first
before deciding the course of action. For the SMTP server, it will be
simple as the time between the SMTP transactions will give a clear
indication in syslog. For the MDA, this will probably require strace
output.

-- 
Sunil Shetye.
