From: Seth M. <se...@ro...> - 2009-01-23 19:17:17
|
Matthias Andree wrote: > Seth Mattinen schrieb am 2009-01-19: > >> I was working on developing fetchmail for use in a large multiuser >> environment and came across some issues with the POP3 UID lists. I'm >> very new to fetchmail's code, so I don't know of these things are >> intentional or expected behavior, but they seemed "broken" to me. > > Hi Seth, > > you're scratching a sore spot ;-) Except for minimal streamlining and > minor fixes, the uid.c code is what we inherited from Eric S. Raymond, > who was very loathe to touch anything of it since it broke all too > easily. > > A while back, Sunil and I had exchanged patches to save UIDs in one file > per account, but I didn't merge that code into fetchmail, to make > fetchmail 6.3.X as drop-in compatible with 6.2.5 so that nobody would > have excuses for not upgrading from 6.2.X to 6.3.X. > > Let me beforehand state that the data structures for storing the UID > lists are very inefficient, and costly at run-time. Several hundred kept > messages bog down old computers, and several thousands bog down modern > computers. We have cascaded linear lists, and we store the file very > inefficiently on-disk (which is less of an issue today). > > The whole UID issue is for reconsideration when I really start working > on 6.4.X. At least, we need efficient memory and on-disk structures and > we'd preferably save to one uid file per account, so we can do away with > all the "load unrelated host info to scratchlist" stuff. It may be > useful to store everything in some lightweight/embedded database instead > (such as TokyoCabinet, Berkeley DB, SQLite). No decisions made yet, and > maybe we need to benchmark our options here during development. Something other than a linear list written to disk on every iteration would obviously be better. Personally I wouldn't worry about backwards compatibility. If it were me, I probably wouldn't bother with loading anything into memory and just do queries against the database. It'll end up in the disk cache anyway. >> 1) The old UID list is never loaded when the daemon starts or restarts >> and the whole POP3 box downloaded every time for "keep" sources. My fix >> was to comment out the break after save_str() in uid.c because as far as >> I can tell, having that break in there causes it to never load the id >> file and it'll self-break when the for loop hits a null anyway. > > I do not observe such behaviour - which version of fetchmail are you > looking at? > > As far as I can see (it's around line 218 in my current uid.c version as > of post-6.3.9 SVN, in initialize_saved_lists()), it just breaks out of > "find the correct list" (there's a for loop three lines above) AFTER it > has saved the UID. > > So if commenting out that break; at line 219 fixes anything for you, > your fix is for the symptoms, but not the root cause. We should then > find the latter. I was using the released 6.3.9 version, not the bleeding edge from SVN since I'm not going for a development environment. The break seemed to be causing it to exit the outermost loop though and neither oldlist nor scratchlist were being populated at start time when a uidl file was present. >> 2) Everything is duplicated in the scratchlist even for UID entries we >> should know about so I added a boolean flag to skips duplicates. >> Otherwise when the UID file is written it'll include oldlist plus >> duplicates in scratchlist. Is it supposed to duplicate everything into >> the scratchlist for some reason? This isn't a functional problem, but it >> does waste memory and disk space if you're thinking about large deployments. > > This appears to be a side effect of your commenting out the "break;", > since the for loop you made complete now terminates with uid == NULL. It is since according to the debug output oldlist was never populated by initialize_saved_lists(), but without the break they both ran. > Make sure the spelling on the command line matches the one in the rcfile > exactly (including case) for a test and see if that helps. > > Could you show me your configuration file (remove/mask passwords!) and > your command line? Please do not edit host names, they are crucial here. > If you're concerned about disclosing that in public, send the material > to me GnuPG encrypted off-list (my key is 0x052e7d95), but please again > without passwords. > This is the specific fetchmailrc file that it's currently running with: set daemon 60 set no syslog set logfile /var/lib/fetchmail/test/log set idfile /var/lib/fetchmail/test/uidl set no bouncemail set spambounce set postmaster "fet...@bo..." defaults pass8bits smtphost mail2.rollernet.us,mail.rollernet.us antispam 554 skip mail.mattinen.org with proto imap and tracepolls user "sethm" pass "xxxxxxxxxxxxx" smtpname "se...@ro..." ssl poll mail.mattinen.org proto pop3 uidl tracepolls user "sethm" pass "xxxxxxxxxxxxx" smtpname "se...@ro..." ssl keep fastuidl 4 su fetchmail -c \ 'fetchmail --quit \ -f /var/lib/fetchmail/test/fetchmailrc \ --pidfile /var/lib/fetchmail/test/fetchmail.pid' And one more thing: 3) If two servers of the same name were present but one was "skip" and the other "poll", every "skip" would cause the UID file to be written with duplicate data because the server name is in ctl more than once so each loop through ctl during the uidl file write duplicated data. Added cases to check the skip/uidl flags and ignore servers without those set. Here's my patch that I ended up with that fixed the issues I observed. I've been running it for several days without any side effects. The only changes are to uid.c. http://www.rollernet.us/opensource/patches/fetchmail-rollernet-uidfixes2.patch I haven't done extensive testing against it beyond the few cases where I had issues. -- Seth Mattinen se...@ro... Roller Network LLC |