From: Uli Z. <ul...@ri...> - 2006-06-28 06:01:36
On 28.06.2006 at 02:46, Matthias Andree wrote:

> While I'll certainly plug the leak (it may take until the week-end
> though), there might still be a related MacOS X bug. There is no
> obligation to call freeaddrinfo() before calling getaddrinfo()
> again, plus this function is required to be thread-safe (i. e. it
> needs to be reentrant).

Well, you *can* call getaddrinfo() a second time before calling
freeaddrinfo() on Mac OS X without a crash or anything, and it has
been thread-safe since Mac OS X 10.2 (though not before). The only
thing is that if you call it again with exactly the same query data
and without calling freeaddrinfo() in between, it will report the data
it cached from the last call. You might call this a bug, but Apple
will most probably call it a feature ...

Note that on Mac OS X, getaddrinfo() is nothing more than a wrapper
around the system's so-called "lookupd" daemon, which handles all
network addressing issues with code that's completely different from
all other Unix implementations.

> However, you say that the problems happen after a certain...
>
>> In my test case, getaddrinfo() may need up to 180s to time out.
>> However, I had set fetchmail's "timeout" parameter to only 60s. In
>> my tests, during the time when the Internet connection was down,
>> at least one "timeout after 60 seconds waiting to connect to
>> server xy" did indeed appear in the log for each server.
>
> ...amount of time.

If by "amount of time" you mean that the Internet connection must be
down for a certain amount of time before the issue occurs, this seems
to be because the DNS data is cached for a certain amount of time,
either by getaddrinfo() itself, or by Mac OS X's local caching BIND,
or possibly by the caching name server of my router. Only when the
cache has expired and getaddrinfo() tries (unsuccessfully) to retrieve
that query data anew from the Internet does the timeout interruption
occur and prevent freeaddrinfo() from being called.
At least that's how I interpret all these error lines in my logfiles. ;-)

> Can you check with "lsof" or similar tools (that can list open
> files and sockets) how many files and sockets fetchmail holds open
> at the time when the problems start? It might be that the OS itself
> is leaking sockets here which might appear in fetchmail's address
> space.

I don't think there's a problem, but I will test it anyway and report
as soon as I find time for another forced network outage.

>> So I'm quite sure that's the bug: Care must be taken that
>> freeaddrinfo() is called even if SockOpen() is interrupted by a
>> timeout.
>
> Certainly, and to avoid leaking memory on disconnected computers
> would be reason enough to justify such a fix.

Maybe it's a good idea to check, for every occurrence of
freeaddrinfo(), whether it is certain to be executed whenever it
should be. Glancing at the code, e.g. in servport.c, line 65, the
default case of the switch is a goto that prevents freeaddrinfo() from
being called, although it should be called (getaddrinfo() had returned
0/success before, so "res" had been allocated). It seems that calling
freeaddrinfo() was not taken all too seriously throughout fetchmail.

>> So you must either make ai0 a global variable (which won't work if
>> you plan to make fetchmail open more than one socket
>> simultaneously), or declare it in the calling code (driver.c or
>> whatever) and pass it to SockOpen.
>
> "you'll have to" or "you may have to" sounds more polite than "you
> must" (no offense taken, don't worry).

Well, the force of the laws of logic isn't polite ... ;-) (This was
not meant to be a social "must" but rather a logical one, as I indeed
see no other programming technique to deal with that issue.)

> Bugs related to signal handling (which is used for timeout
> handling) require extra care. I myself will have to review the code
> again before making changes in that area.
> [...]
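To make the servport.c point above a bit more concrete, here is a
minimal sketch of the single-exit cleanup I have in mind. It is not
fetchmail's actual code: the function name service_to_port() and the
"out" label are made up for illustration, and only getaddrinfo() and
freeaddrinfo() are the real calls. It also leaves the timeout/signal
side aside entirely; it only shows that once getaddrinfo() has
succeeded, every path, including the default case, runs through
freeaddrinfo().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not fetchmail's servport.c): resolve a TCP service
 * name or decimal port string to a port number. Returns -1 on failure. */
int service_to_port(const char *service)
{
    struct addrinfo hints, *res = NULL;
    int port = -1;

    assert(service != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    /* If getaddrinfo() fails, nothing was allocated, so nothing to free. */
    if (getaddrinfo(NULL, service, &hints, &res) != 0)
        return -1;

    /* From here on "res" is allocated: every path must reach "out". */
    switch (res->ai_family) {
    case AF_INET:
        port = ntohs(((struct sockaddr_in *)res->ai_addr)->sin_port);
        break;
    case AF_INET6:
        port = ntohs(((struct sockaddr_in6 *)res->ai_addr)->sin6_port);
        break;
    default:
        /* Unknown address family: an error, but jump to the cleanup
         * label instead of past it. */
        goto out;
    }

out:
    freeaddrinfo(res);
    return port;
}
```

Called as service_to_port("110"), this should hand back the numeric
port; a service name is looked up in the system's services database
instead, so the result then depends on the machine.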
> I don't think I'll be able to handle this before Saturday, perhaps
> Sunday; but providing patches for you to test should be feasible.

Well, that's more than OK with me! If you deal with bugs in connection
with Mac OS X, what you fear is months or years until a fix is
applied - days are no problem at all. :-)

> Note that I don't have MacOS X machines to test on either.

No problem, I can test this here.

> Thank you!

Well, thank you (in advance) for the fix! :-)

Bye
Uli
________________________________________________________

Uli Zappe, Solmsstraße 5, D-65189 Wiesbaden, Germany
http://www.ritual.org
Fon: +49-700-ULIZAPPE
Fax: +49-700-ZAPPEFAX
________________________________________________________