From: Brendan L. <bre...@ai...> - 2006-03-02 18:25:18
|
I've seen the problem with a couple of upstream servers: I'll get you the id of my current server when I get home. However I believe the servers is behaving correctly according to the IMAP spec: I have attached an explanation of the secondary problem I am seeing (extra 28 s wait) that I sent to Casper; let me know if you need more info. I've put comments to your comments below. Brendan mat...@gm... wrote: >Brendan Lynch <bre...@ai...> writes: > > > >>This code works fine until a notification of new mail is received (a "* >>1 EXISTS" message is received). At this point normally one is in the >>"imap_ok()" routine called from line 637, and this correctly receives >>the notification message and parses it, updating the "count" variable. >>However it does not the return to imap_idle() routine, but instead >>reissues a recv call having set a much longer (300s) timeout and after >>the 300s have expired it then returns an error causing fetchmail to drop >>the IMAP connection (a main point of this idle code was to keep the >>connection open for subsequent message retrieval). Net result is that >>delivery of mail is delayed by 5 minutes and the server connection is >>dropped and reestablished each time a wait for mail occurs. >> >>The problem seems to be caused by the loop condition in imap_ok(): >> >> 151 } >> 152 } while >> 153 (tag[0] != '\0' && strncmp(buf, tag, strlen(tag))); >> 154 >> >>This assumes that tag[0] will be set to '\0' if one is not waiting for a >>tagged response. In this case the code should not be waiting for a >>tagged response (it is waiting for an unsolicited notification). >>However the 'tag' global character array is set by the gen_transact() at >>line 630 and is not cleared before the call to imap_ok() at line 637. >> >> > >This is correct, an IMAP client is supposed to parse untagged responses >until a tagged response is received. Trying with Dovecot and hacking >fetchmail a bit so it doesn't recognize RECENT and uses the NOOP >emulation code yields: > >... >fetchmail: IMAP> A0010 NOOP >fetchmail: IMAP< A0010 OK NOOP completed. >fetchmail: IMAP> A0011 NOOP >fetchmail: IMAP< * 1 EXISTS >fetchmail: IMAP< * 1 RECENT >fetchmail: IMAP< A0011 OK NOOP completed. >fetchmail: IMAP> A0012 NOOP >... > >So it waits for the tagged NOOP response, and this is a requirement so >it actually picks up both the EXISTS and the RECENT responses of working >servers. Servers that do not respond with a tagged response to a NOOP >command are broken. > > See my explanation to Casper. The key point is that the imap_idle() code when 'faking' IDLE support calls imap_ok() (at line 637) without having issued any tagged command: 625 /* when faking an idle, we can't assume the server will 626 * send us the new messages out of the blue (RFC2060); 627 * this timeout is potentially the delay before we notice 628 * new mail (can be small since NOOP checking is cheap) */ 629 mytimeout = 28; 630 ok = gen_transact(sock, "NOOP"); 631 /* if there's an error (not likely) or we just found mail (stage 632 * has changed, timeout has also been restored), we're done */ 633 if (ok != 0 || stage != STAGE_IDLE) 634 return(ok); 635 636 /* wait (briefly) for an unsolicited status update */ 637 ok = imap_ok(sock, NULL); 638 /* again, this is new mail or an error */ 639 if (ok != PS_IDLETIMEOUT) 640 return(ok); Since there is no tagged command outstanding the server will not (and should not) ever send a tagged response. If we set imap_ok to only return after a tagged response we are guaranteeing it will always only return after a timeout. Worse, if the server does send an unsolicited EXISTS message during this (28s) imap_ok() wait, there is already-existing special-case code to handle an unsolicited EXITS message in imap_ok: 93 /* 94 * Nasty kluge to handle RFC2177 IDLE. If we know we're idling 95 * we can't wait for the tag matching the IDLE; we have to tell the 96 * server the IDLE is finished by shipping back a DONE when we 97 * see an EXISTS. Only after that will a tagged response be 98 * shipped. The idling flag also gets cleared on a timeout. 99 */ 100 if (stage == STAGE_IDLE) 101 { 102 /* If IDLE isn't supported, we were only sending NOOPs anyway. */ 103 if (has_idle) 104 { 105 /* we do our own write and report here to disable tagging */ 106 SockWrite(sock, "DONE\r\n", 6); 107 if (outlevel >= O_MONITOR) 108 report(stdout, "IMAP> DONE\n"); 109 } 110 111 mytimeout = saved_timeout; 112 stage = STAGE_FETCH; 113 } The last 2 lines are executed even for the NOOP-emulated idle. This causes us to change the timeout to 5 minutes and set stage to STAGE_FETCH. Then, if we are waiting for a tagged response, we loop around and wait for 5 minutes for a tagged response (which we will never see) and eventually return PS_SOCKET (since our wait times out and we are not in STAGE_IDLE) causing fetchmail to drop and reestablish the mail server connection. > > >>The fix is a very simple one-line change to imap_idle (2 lines with >>comments): >> >> 630 ok = gen_transact(sock, "NOOP"); >> 631 /* if there's an error (not likely) or we just found mail >>(stage >> 632 * has changed, timeout has also been restored), we're >>done */ >> 633 if (ok != 0 || stage != STAGE_IDLE) >> 634 return(ok); >> 635 >>+ 636 /* clear tag so imap_ok does not expect tagged response */ >>+ 637 tag[0]='\0'; >> 638 /* wait (briefly) for an unsolicited status update */ >> 639 ok = imap_ok(sock, NULL); >> >> > >So this patch would sort of break the IMAP client because it would jump >out of the loop before having read the reply. This requires more >thought. Casper Gripenberg reported a similar problem, so perhaps some >common upstream server software is the culprit (and he suggested a >different fix, I'll have a look at that too). > > No, the client *is not* waiting for a reply; the reply was received and processed in the gen_transact() call at line 630; this deals with an imap_ok wait where no command has been issued. >What software is your upstream server running? > >Can I see a "fetchmail -Nvvv --nosyslog" trace of a failing IMAP session >with NOOP emulation? > > I'll set this up this evening and send you the trace and a timed strace (which is much more useful). My mail to Caper includes an strace where the 'exit on unsolicited notification during imap_idle fix is in place, but the second 'RECENT' modification is not. > > >>A second, more minor problem is that getting a "* RECENT" notification >>does not break a caller out of the imap_idle's imap_ok() call. This >>causes an extra 28second wait after being notified about a message >>before it is actually received. >> >>Diffs for the complete set of changes against 6.3.2 are attached to this >>email. >> >>I have seen this in fetchmail 6.2.5 and 6.3.2 on linux platforms, but >>this problem should be generic to all platforms. >>diff -Naur fetchmail-6.3.2/imap.c fetchmail-6.3.2a/imap.c >>--- fetchmail-6.3.2/imap.c 2006-01-20 10:38:45.000000000 +0000 >>+++ fetchmail-6.3.2a/imap.c 2006-02-23 23:54:52.000000000 +0000 >>@@ -116,6 +116,17 @@ >> { >> recentcount_ok = 1; >> recentcount = atoi(buf+2); >>+ /* >>+ * Kluge to handle IDLE simulation. If we are in STAGE_IDLE and >>+ * we are simulating (has_idle unset) we need to alert calling >>+ * imap_idle() to the fact that we have received a new "recent" >>+ * alert in order to break the imap_idle() out of its wait. >>+ */ >>+ if (!has_idle && stage == STAGE_IDLE) >>+ { >>+ mytimeout = saved_timeout; >>+ stage = STAGE_FETCH; >>+ } >> } >> else if (strstr(buf, " EXPUNGE")) >> { >> >> > >This looks acceptable. > >How can a server mark a new message "RECENT" without also sending an >"EXISTS"? I'm inclined to consider such behavior broken. I'm willing to >apply this nonetheless as I don't think it would hurt anyone. > > > It cannot, but the order it does them depends on the server; and also the server may send an EXISTS message without a RECENT message if a message is expunged. the 'recentcount' is what is used to break us out of one of the loops; see my explanation to Casper. Brendan |