From: Matthias A. <mat...@gm...> - 2010-08-27 19:13:18
|
Am 27.08.2010 17:29, schrieb Robert Linden: > Hello fetchmail-devel! > > I would like to get a confirmation that the following fetchmail-behaviour is > normal/the way it is implemented, or if I might be doing something wrong: Confirmed as suboptimal fetchmail -6.3.X behaviour, due to use of inadequate data structures, namely linked lists, as you've already figured out. > I noticed that fetchmail becomes very slow when I have had a lot of > mails downloaded, i.e. when the fetchid-file grows very large. > The numbers for one run (with no new mail) look roughly like this: > > 10 lines: 0.5 sec > 30.000 lines: 5.0 sec > 45.000 lines: 10.0 sec > 60.000 lines: 20.0 sec > 90.000 lines: 70.0 sec > > At first glance (at the code) it seems that the handling of this list is > suboptimal, with traversal of linked lists and stuff like that. Also it is > written [completely] a lot. > > Is this over-proportional time to be expected and has anybody seen it > too, or are long id-lists working fine for everone but me? Over-proportional time (over the number of messages or UIDs stored) is unavoidable the way fetchmail is currently designed, however we can do better than what we're doing currently. In fact, Rainer Weikusat has contributed new code that gets complexity down from O(n^2) to O(n log n) with P(atricia)-Tries. We don't get better than O(n log n) without major changes to the overall design, and I'm not sure going further is worth the effort. In practice however, O(n log n) is quite manageable, and it feels much faster and wastes much less CPU. While not exact and lacking constant factors and constant overhead, this gives a rough idea of how the current and new code effort would scale in a simulation. Left column is the number of messages kept, right two columns are computational complexity compared, roughly. n current new 10 100 33.2 30 900 147.2 100 10000 664.4 300 90000 2468.6 1000 1000000 9965.8 3000 9000000 34652.2 10000 100000000 132877.1 30000 900000000 446180.2 100000 10000000000 1660964 See the list archives: <http://lists.berlios.de/pipermail/fetchmail-devel/2010-June/001359.html> I plan to merge this for fetchmail 6.4 (which please consider a working name and not necessarily the final release version, I may call it 6.5 or 7.0 depending on other changes I make). > Apologies if this is the wrong list for such a question. No need to apologize, you've picked the right list. > PS: just FYI, I use the same id-file for several different accounts, so my > first workaround will be to split it up, so that each account has it's own > list - but still I wonder if this is a common problem. It will help only a tiny bit, because fetchmail splits the .fetchids file into several lists internally. Not worth the effort IMO. -- Matthias Andree |