From: Rainer W. <rwe...@ms...> - 2010-04-23 16:58:55
|
Hello.

The company I work for uses fetchmail to download mails from various
POP- and IMAP-servers in order to offer an anti-spam/ anti-virus
scanning service for smartphone owners. Below is an oprofile excerpt
from the 'download server' regarding fetchmail:

,----
| CPU: Core 2, speed 2393.92 MHz (estimated)
| Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) [...]
| samples  %        image name               symbol name
| 25110    51.5035  fetchmail                save_str
| 11399    23.3806  fetchmail                id_find
| 5851     12.0011  fetchmail                str_in_list
| 912       1.8706  fetchmail                yylex
| 558       1.1445  fetchmail                initialize_saved_lists
| 493       1.0112  fetchmail                do_session
`----

According to this, the program spent most of its time (51.5%)
executing the save_str routine (uid.c). This routine appends a new
item to a linked list of structures by traversing the list in order to
find the last node and then adding the new one. It is [mostly] called
in a couple of places in pop3.c to update the 'newsaved' and
'oldsaved' lists of message UIDs (UIDL). The query structure defined
in fetchmail.h actually even contains a member which was supposed to
track the end of the oldsaved list (oldsavedend) but that is only
initialized (in initialize_saved_lists/ uid.c) and never really used
for anything. The patch included below changes the code to actually
track the end of the oldsaved list and introduces a newsavedend struct
query member in order to do the same for the newsaved list. With this
change, the save_str routine now only accounts for 0.4% of the CPU
time used by the program:

,----
| CPU: Core 2, speed 2393.92 MHz (estimated)
| Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) [...]
| samples  %        image name               symbol name
| 19146    48.1588  fetchmail                id_find
| 9416     23.6845  fetchmail                str_in_list
| 1452      3.6523  fetchmail                yylex
| 860       2.1632  fetchmail                initialize_saved_lists
| 810       2.0374  fetchmail                do_session
|
| [...]
|
| 167       0.4201  fetchmail                save_str
| 157       0.3949  fetchmail                parsecmdline
| 145       0.3647  fetchmail                readheaders
`----

I am aware that the 'split append' (save_str + explicit adjustment of
savedend-pointer) isn't exactly beautiful but it is a reasonably
simple change with a significant positive impact for POP3 mailboxes
containing lots of ('kept on server') messages.

----------
--- fetchmail/fetchmail.h 16 Apr 2010 14:45:15 -0000 1.1.1.3
+++ fetchmail/fetchmail.h 21 Apr 2010 16:31:34 -0000 1.1.1.3.2.1
@@ -378,7 +378,7 @@
  unsigned int uid; /* UID of user to deliver to */
  struct idlist *skipped; /* messages skipped on the mail server */
  struct idlist *oldsaved, *newsaved;
- struct idlist **oldsavedend;
+ struct idlist **oldsavedend, **newsavedend;
  char lastdigest[DIGESTLEN]; /* last MD5 hash seen on this connection */
  char *folder; /* folder currently being polled */
--- fetchmail/pop3.c 16 Apr 2010 17:27:18 -0000 1.1.1.3
+++ fetchmail/pop3.c 21 Apr 2010 16:31:34 -0000 1.1.1.3.2.1
@@ -881,8 +881,10 @@
  last_nr = try_nr;
  /* save it */
- newl = save_str(&ctl->oldsaved, id, UID_UNSEEN);
+ newl = save_str(ctl->oldsavedend, id, UID_UNSEEN);
  newl->val.status.num = try_nr;
+
+ ctl->oldsavedend = &newl->next;
  }
  }
  if (outlevel >= O_DEBUG && last_nr <= count)
@@ -970,15 +972,18 @@
  /* the first try_id messages are known -> copy them to the newsaved list */
  for( num = first_nr; num < list_len; num++ )
  {
- struct idlist *newl = save_str(&ctl->newsaved,
+ struct idlist *newl = save_str(ctl->newsavedend,
  str_from_nr_list(&ctl->oldsaved, num),
  UID_UNSEEN);
  newl->val.status.num = num - first_nr + 1;
+
+ ctl->newsavedend = &newl->next;
  }
  if( nolinear ) {
  free_str_list(&ctl->oldsaved);
  ctl->oldsaved = 0;
+ ctl->oldsavedend = &ctl->oldsaved;
  last = try_id;
  }
@@ -998,6 +1003,7 @@
  (void)folder;
  /* Ensure that the new list is properly empty */
  ctl->newsaved = (struct idlist *)NULL;
+ ctl->newsavedend = &ctl->newsaved;
  #ifdef MBOX
  /* Alain Knaff suggests this, but it's not RFC standard */
@@ -1089,9 +1095,11 @@
  {
  struct idlist *old, *newl;
- newl = save_str(&ctl->newsaved, id, UID_UNSEEN);
+ newl = save_str(ctl->newsavedend, id, UID_UNSEEN);
  newl->val.status.num = unum;
+ ctl->newsavedend = &newl->next;
+
  if ((old = str_in_list(&ctl->oldsaved, id, FALSE)))
  {
  flag mark = old->val.status.mark;
@@ -1123,7 +1131,8 @@
  * swap the lists (say, due to socket error),
  * the same mail will not be downloaded again.
  */
- old = save_str(&ctl->oldsaved, id, UID_UNSEEN);
+ old = save_str(ctl->oldsavedend, id, UID_UNSEEN);
+ ctl->oldsavedend = &old->next;
  }
  /* save the number */
  old->val.status.num = unum;
@@ -1224,8 +1233,10 @@
  }
  /* save it */
- newl = save_str(&ctl->oldsaved, id, UID_UNSEEN);
+ newl = save_str(ctl->oldsavedend, id, UID_UNSEEN);
  newl->val.status.num = num;
+
+ ctl->oldsavedend = &newl->next;
  return(FALSE);
  }
  else
--- fetchmail/transact.c 16 Apr 2010 14:45:18 -0000 1.1.1.2
+++ fetchmail/transact.c 21 Apr 2010 16:31:34 -0000 1.1.1.2.2.1
@@ -882,8 +882,10 @@
  sscanf(line+12, "%s", id);
  if (!str_find( &ctl->newsaved, num))
  {
- struct idlist *newl = save_str(&ctl->newsaved,id,UID_SEEN);
+ struct idlist *newl = save_str(ctl->newsavedend,id,UID_SEEN);
  newl->val.status.num = num;
+
+ ctl->newsavedend = &newl->next;
  }
  }
  }
--- fetchmail/uid.c 17 Aug 2009 10:47:46 -0000 1.1.1.1
+++ fetchmail/uid.c 21 Apr 2010 16:31:34 -0000 1.1.1.1.2.1
@@ -119,6 +119,7 @@
  ctl->oldsaved = (struct idlist *)NULL;
  ctl->newsaved = (struct idlist *)NULL;
  ctl->oldsavedend = &ctl->oldsaved;
+ ctl->newsavedend = &ctl->newsaved;
  }
  errno = 0;
@@ -151,6 +152,7 @@
  char saveddelim1;
  char *delimp2;
  char saveddelim2 = '\0'; /* pacify -Wall */
+ struct idlist *old;
  while (fgets(buf, POPBUFSIZE, tmpfp) != (char *)NULL)
  {
@@ -215,7 +217,8 @@
  for (ctl = hostlist; ctl; ctl = ctl->next) {
  if (strcasecmp(host, ctl->server.queryname) == 0
  && strcasecmp(user, ctl->remotename) == 0) {
- save_str(&ctl->oldsaved, id, UID_SEEN);
+ old = save_str(ctl->oldsavedend, id, UID_SEEN);
+ ctl->oldsavedend = &old->next;
  break;
  }
  }
@@ -547,7 +550,9 @@
  if (outlevel >= O_DEBUG)
  report(stdout, GT_("swapping UID lists\n"));
  ctl->oldsaved = ctl->newsaved;
+ ctl->oldsavedend = ctl->newsavedend;
  ctl->newsaved = (struct idlist *) NULL;
+ ctl->newsavedend = &ctl->newsaved;
  free_str_list(&temp);
  }
  /* in fast uidl, there is no need to swap lists: the old state of
@@ -581,6 +586,7 @@
  report(stdout, GT_("discarding new UID list\n"));
  free_str_list(&ctl->newsaved);
  ctl->newsaved = (struct idlist *) NULL;
+ ctl->newsavedend = &ctl->newsaved;
  }
  }
|
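For readers following the thread, the idea behind the patch above can be sketched in isolation. This is only an illustration, not fetchmail's code: struct node, struct uidlist, append_at and the uidlist_* helpers are hypothetical names, and the node carries just an id string. The one thing taken from the patch is the trick of keeping a pointer to the last "next" field, so that the append routine has nothing left to walk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for fetchmail's struct idlist (fields trimmed). */
struct node {
    char *id;
    struct node *next;
};

/* Append starting the walk at *pos. From the list head this is O(n) per
 * call; when pos already is the end pointer the loop runs zero times. */
static struct node *append_at(struct node **pos, const char *id)
{
    size_t sz = strlen(id) + 1;
    struct node *n;

    while (*pos)
        pos = &(*pos)->next;

    n = malloc(sizeof *n);
    if (!n)
        return NULL;
    n->id = malloc(sz);
    if (!n->id) {
        free(n);
        return NULL;
    }
    memcpy(n->id, id, sz);
    n->next = NULL;
    *pos = n;
    return n;
}

/* List plus tracked end: end points at head while the list is empty,
 * otherwise at the next field of the last node. */
struct uidlist {
    struct node *head;
    struct node **end;
};

static void uidlist_init(struct uidlist *l)
{
    l->head = NULL;
    l->end = &l->head;
}

/* The 'split append': append at the end pointer, then advance it. */
static struct node *uidlist_append(struct uidlist *l, const char *id)
{
    struct node *n = append_at(l->end, id);
    if (n)
        l->end = &n->next;
    return n;
}

/* Move src into dst (dst assumed already freed), leaving src empty.
 * An empty src needs care: its end pointer aims at src->head, and that
 * address must not be carried over into dst. */
static void uidlist_move(struct uidlist *dst, struct uidlist *src)
{
    dst->head = src->head;
    dst->end = src->head ? src->end : &dst->head;
    uidlist_init(src);
}
```

One thing to watch with this representation is visible in uidlist_move: whenever a list changes owner, as in the 'swapping UID lists' hunk, an empty list has to be special-cased, because its end pointer refers to the old owner's head field.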
From: Matthias A. <mat...@gm...> - 2010-04-23 17:41:20
|
Rainer Weikusat wrote on 2010-04-21:

> Hello.
>
> The company I work for uses fetchmail to download mails from various
> POP- and IMAP-servers in order to offer an anti-spam/ anti-virus
> scanning service for smartphone owners. Below is an oprofile excerpt
> from the 'download server' regarding fetchmail:

Hi Rainer,

thanks for the analysis. You've bumped into a design flaw of
fetchmail's UID handling. It has been on my TODO list for a long time
(and using fetchmail with just 500 kept messages was painful), but it
was a non-functional improvement, and the functional improvements (bug
fixes) always "jumped the queue" :)

Basically the whole uid.c outfit is using the wrong data structures for
its purpose. It uses a linear list of n elements, and iterates over it
1.5n times, i.e. we get O(n^2) complexity (without constants).

The whole stuff - I guess you figured as much - is meant to figure out
if a certain UID is in a set or not, and read/write that set to disk.
We don't need order, only uniqueness.

If we were being blunt and wanted a quick fix, we'd kill the list and
use a vector (array) and realloc(), a flag "vector is sorted", and
qsort and bsearch. That would bring things to O(n log n).

If we were to be decent, we'd use the right structure for our purpose,
and that's a hash table (in C++/STL I'd use a hash_set where available
and a set where unavailable). We might want to keep things in a
database on disk, rather than load/write the table for every run.

Your patch effort is much appreciated, but I have serious doubts about
taking it. save_str and id_find are used together and basically have
O(n^2) complexity. Even if your patch shaves save_str down to O(n), the
id_find usage pattern itself remains O(n^2).

> [...]

--
Matthias Andree
|
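The 'blunt' quick fix named in the message above (array, realloc(), a "vector is sorted" flag, qsort and bsearch) might look roughly like the following. It is a sketch under assumptions, not a proposal from the thread: struct uidvec and the uidvec_* helpers are made-up names, and UIDs are treated as plain NUL-terminated strings without the status fields fetchmail keeps per message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array of UID strings with a lazy sort flag. */
struct uidvec {
    char **ids;
    size_t len, cap;
    int sorted;              /* nonzero while ids[] is known to be sorted */
};

/* Comparator for qsort/bsearch: both arguments point at a char *. */
static int cmp_ids(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Amortized O(1) append; the vector is marked unsorted afterwards. */
static int uidvec_add(struct uidvec *v, const char *id)
{
    size_t sz = strlen(id) + 1;
    char *copy;

    if (v->len == v->cap) {
        size_t ncap = v->cap ? 2 * v->cap : 16;
        char **grown = realloc(v->ids, ncap * sizeof *grown);
        if (!grown)
            return -1;
        v->ids = grown;
        v->cap = ncap;
    }
    copy = malloc(sz);
    if (!copy)
        return -1;
    memcpy(copy, id, sz);
    v->ids[v->len++] = copy;
    v->sorted = 0;
    return 0;
}

/* Sort on first lookup after a change, then binary-search: O(log n). */
static int uidvec_contains(struct uidvec *v, const char *id)
{
    if (v->len == 0)
        return 0;
    if (!v->sorted) {
        qsort(v->ids, v->len, sizeof *v->ids, cmp_ids);
        v->sorted = 1;
    }
    return bsearch(&id, v->ids, v->len, sizeof *v->ids, cmp_ids) != NULL;
}
```

The O(n log n) total only holds when insertions and lookups come in separate phases (load everything, then query). If adds and lookups alternate, every lookup re-sorts the array, which is one reason a tree or hash table is the more robust choice.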
From: Rainer W. <rwe...@ms...> - 2010-04-23 19:10:21
|
"Matthias Andree" <mat...@gm...> writes:
> Rainer Weikusat wrote on 2010-04-21:
>> The company I work for uses fetchmail to download mails from various
>> POP- and IMAP-servers in order to offer an anti-spam/ anti-virus
>> scanning service for smartphone owners. Below is an oprofile excerpt
>> from the 'download server' regarding fetchmail:

[...]

> You've bumped into a design flaw of fetchmail's UID handling.

I assume it is rather an instance of 'Learn Lisp. It will make you a
better programmer for life' :->.

[...]

> If we were being blunt and wanted a quick fix, we'd kill the list and
> use a vector (array) and realloc(), a flag "vector is sorted", and
> qsort and bsearch. That would bring things to O(n log n).
>
> If we were to be decent, we'd use the right structure for our purpose,
> and that's a hash table (in C++/STL I'd use a hash_set where
> available and a set where unavailable). We might want to keep things
> in a database on disk, rather than load/write the table for every
> run.
>
> Your patch effort is much appreciated, but I have serious doubts
> about taking it. save_str and id_find are used together and basically
> have O(n^2) complexity. Even if your patch shaves save_str down to
> O(n), the id_find usage pattern itself remains O(n^2).

struct idlist * is quite pervasive in the code. It is also used for
many 'short lists' (eg, ctl->localnames) where a linked list is at
least good enough. Keeping track of the last item on the oldsaved and
newsaved lists was just a reasonably non-invasive, fairly quick fix for
the 'worst offender' case (and the IMHO most braindead ...) I could
smuggle past my boss[*] as 'immediately useful and not that much
effort'.

        [*] I am actually supposed to add UID-support to the IMAP
            protocol driver and this case of 'walk until your shoes
            fall off' just angered me ...
|
From: Matthias A. <mat...@gm...> - 2010-04-23 20:04:16
Attachments:
fetchmail-patch740.diff
|
Rainer Weikusat wrote on 2010-04-23:

> "Matthias Andree" <mat...@gm...> writes:
>> Rainer Weikusat wrote on 2010-04-21:
>>> The company I work for uses fetchmail to download mails from various
>>> POP- and IMAP-servers in order to offer an anti-spam/ anti-virus
>>> scanning service for smartphone owners. Below is an oprofile excerpt
>>> from the 'download server' regarding fetchmail:
>
> [...]
>
>> You've bumped into a design flaw of fetchmail's UID handling.
>
> I assume it is rather an instance of 'Learn Lisp. It will make you a
> better programmer for life' :->.

Uh, don't tell me. I've known that since my Debian installation fit
onto the Conner CP30344 disk in my i486SX25. Tell that to Eric. And
tell him LAST was an abomination and defunct-upon-invention. ;)

No, don't tell him. I guess he'd do it in Python today, use a
dictionary, and leave optimization up to the Python interpreter, and
that might be reasonable, unless we'd have to have SSL idiosyncrasies
under control. getmail (Python based) used to not complain about
expired certificates, and all Charles had to say was "sorry, not my
fault".

> [...]
>
>> If we were being blunt and wanted a quick fix, we'd kill the list and
>> use a vector (array) and realloc(), a flag "vector is sorted", and
>> qsort and bsearch. That would bring things to O(n log n).
>>
>> If we were to be decent, we'd use the right structure for our purpose,
>> and that's a hash table (in C++/STL I'd use a hash_set where
>> available and a set where unavailable). We might want to keep things
>> in a database on disk, rather than load/write the table for every
>> run.
>>
>> Your patch effort is much appreciated, but I have serious doubts
>> about taking it. save_str and id_find are used together and basically
>> have O(n^2) complexity. Even if your patch shaves save_str down to
>> O(n), the id_find usage pattern itself remains O(n^2).
>
> struct idlist * is quite pervasive in the code. It is also used for
> many 'short lists' (eg, ctl->localnames) where a linked list is at
> least good enough. Keeping track of the last item on the oldsaved and
> newsaved lists was just a reasonably non-invasive, fairly quick fix for
> the 'worst offender' case (and the IMHO most braindead ...) I could
> smuggle past my boss[*] as 'immediately useful and not that much
> effort'.

Yes. Only this is one of the makeshift solutions that is not bound to
last long. GMX split my INBOX when it hit 50,000 messages. Oops. I do
NOT want to poll that with fetchmail <= 6.3.16. :)

Seriously, how about this approach:

1. create uid_save and uid_find wrappers around save_str and id_find
   and provide a typedef
2. change POP3 and possibly IMAP (see below) to use that
   => decouple from the implementation
3. replace uid_save/uid_find implementations by something decent?

This way all the short lists can continue to use the save_str/id_find
cruft for the nonce, and the bulk can use an O(n log n) implementation.

> [*] I am actually supposed to add UID-support the IMAP
> protocol driver and this case of 'walk until your shoes fall
> off' just angered me ...

Good to know. We have a patch in the berlios tracker [1], which is
halfway there, but you need to add UIDVALIDITY support [2], and
probably a few touch-ups because the patch ancestor and current
fetchmail code have diverged.

Oh, and of course we can add your company name to the NEWS file, too.
Sponsored by... written by... <hint, hint> :)

Note I'm planning to change the license to Affero GPL v3 in the not too
distant future.

[1] <http://developer.berlios.de/patch/?func=detailpatch&patch_id=740&group_id=1824>
    also attached
[2] <http://www.rfc-editor.org/rfc/rfc3501.txt>
    2.3.1.1. Unique Identifier (UID) Message Attribute

Just in case BerliOS drops off the net, find the patch attached (may
not be available on the list). Note again: not ready for use!

HTH

--
Matthias Andree
|
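The three-step approach proposed in the message above could start from something like the following. This is a sketch under assumptions rather than anything from the fetchmail tree: only the names uid_save and uid_find come from the message; struct item, list_append, list_find, uidset and uid_init are hypothetical, and the two list_* functions merely stand in for save_str and id_find, whose real signatures are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct idlist; the real one has more fields. */
struct item {
    char *id;
    int num;
    struct item *next;
};

/* Stand-in for the existing append primitive; end must point at the
 * NULL link that terminates the list. */
static struct item *list_append(struct item **end, const char *id)
{
    size_t sz = strlen(id) + 1;
    struct item *n = malloc(sizeof *n);
    if (!n)
        return NULL;
    n->id = malloc(sz);
    if (!n->id) {
        free(n);
        return NULL;
    }
    memcpy(n->id, id, sz);
    n->num = 0;
    n->next = NULL;
    *end = n;
    return n;
}

/* Stand-in for the existing linear lookup primitive. */
static struct item *list_find(struct item *head, const char *id)
{
    for (; head; head = head->next)
        if (strcmp(head->id, id) == 0)
            return head;
    return NULL;
}

/* Step 1: the typedef the protocol drivers would use instead of touching
 * list heads directly. Its contents are an implementation detail. */
typedef struct uidset {
    struct item *head;
    struct item **end;
} uidset;

static void uid_init(uidset *s)
{
    s->head = NULL;
    s->end = &s->head;
}

/* Wrapper for saving a UID; callers no longer see how it is stored. */
static struct item *uid_save(uidset *s, const char *id, int num)
{
    struct item *n = list_append(s->end, id);
    if (n) {
        n->num = num;
        s->end = &n->next;
    }
    return n;
}

/* Wrapper for lookup; still linear here, replaceable in step 3. */
static struct item *uid_find(const uidset *s, const char *id)
{
    return list_find(s->head, id);
}
```

Once the drivers call only uid_init, uid_save and uid_find, the body of uidset can be swapped for a tree or hash table without touching pop3.c again, while the short lists keep using the list primitives directly.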
From: Rainer W. <rwe...@ms...> - 2010-05-03 14:33:17
|
Sunil Shetye <sh...@bo...> writes:
> Quoting from Rainer Weikusat's mail on Mon, May 03, 2010:
>> > Matthias, you have said that the regular uidl code is faster than the
>> > fastuidl code for 10000 uids. Do you mean while processing 10000 uids
>> > on the fetchmail side or while getting 10000 uids from the server?
>> >
>> > The fastuidl code downloads fewer UIDs from the server and hence has to
>> > compare fewer UIDs to check for new mail. If there are very few new
>> > mails, then the fastuidl code is expected to save a lot on the string
>> > comparisons.
>>
>> It is conceivable that the string comparisons take less wallclock
>> time than repeatedly asking a server for a UID and waiting for the
>> replies does. IMHO, this is still a bad tradeoff because 'e-mail
>> download' is a background batch job and even on a single-user
>> workstation, minimizing e-mail download time at the expense of directing
>> more resources away from the interactive jobs the user interacts with
>> doesn't really make sense except if one is using an uplink whose cost
>> is proportional to the time spent 'online'.
>
> As Matthias has already said, the better solution is to use the
> appropriate data structures like trees or hashes or vectors in order
> to cut down both searches and modifications.

Sorry, but you are seriously lost in space here. I have no idea which
thought could have motivated you to reply with this to my attempt at
an explanation of the 'faster with 10000 UIDs' statement.
|
From: Rainer W. <rwe...@ms...> - 2010-05-09 23:02:43
|
Rainer Weikusat <rwe...@ms...> writes:
> Matthias Andree <mat...@gm...> writes:
>
> [...]
>
>> it also pretended that someone had already fixed the complexity of
>> the UID handling (which hasn't happened to the best of my knowledge;
>> I have started hacking a bit on it, but not sure if I'll make it, or
>> drop it and leave it for a later release using C++, if it turns out
>> it's too much of a hassle to do in C).
>
> I am done with my last 'firebrigade' assignment which means that I
> will now continue to work on the UID support for imap

I'm now the 'proud owner' of a fetchmail variant which works (mostly)
like a proper 'disconnected imap client'. The number of POP3 accounts
registered with the product I am using fetchmail for (except pulling
my mail off my employer's mailserver) has grown to 99 and can be
expected to increase further. Because of this, I'm determined to get
rid of the present UID management code 'soon' (probably, during the
next couple of 'Sunday shifts'). I've been a content fetchmail user
ever since I first got 'internet' in 1999 and I would happily direct
some useful work into it because of this. OTOH, my experience with OSS
projects so far is a) one gets ignored, b) all kinds of ml lurkers
start to engage in wild flaming and c) after the need has arisen, the
powers-that-be grudgingly implement the missing features themselves,
just to ensure that only the worthy ones get any 'due credit', and this
means some amount of additional work for me (of which I already have
plenty). So, is there a chance for such a change to be accepted?
|
From: Matthias A. <mat...@gm...> - 2010-05-10 10:31:08
|
On 09.05.2010 23:02, Rainer Weikusat wrote:
> Rainer Weikusat <rwe...@ms...> writes:
>> Matthias Andree <mat...@gm...> writes:
>>
>> [...]
>>
>>> it also pretended that someone had already fixed the complexity of
>>> the UID handling (which hasn't happened to the best of my knowledge;
>>> I have started hacking a bit on it, but not sure if I'll make it, or
>>> drop it and leave it for a later release using C++, if it turns out
>>> it's too much of a hassle to do in C).
>>
>> I am done with my last 'firebrigade' assignment which means that I
>> will now continue to work on the UID support for imap
>
> I'm now the 'proud owner' of a fetchmail variant which works (mostly)
> like a proper 'disconnected imap client'. The number of POP3 accounts
> registered with the product I am using fetchmail for (except pulling
> my mail off my employer's mailserver) has grown to 99 and can be
> expected to increase further.

Sounds good, although I'm not quite sure if you're working on two
distinct issues here (DIMAP and POP3 improvements).

> Because of this, I'm determined to get
> rid of the present UID management code 'soon' (probably, during the
> next couple of 'Sunday shifts'). I've been a content fetchmail user
> ever since I first got 'internet' in 1999 and I would happily direct
> some useful work into it because of this. OTOH, my experience with OSS
> projects so far is a) one gets ignored, b) all kinds of ml lurkers
> start to engage in wild flaming and c) after the need has arisen, the
> powers-that-be grudgingly implement the missing features themselves,
> just to ensure that only the worthy ones get any 'due credit', and this
> means some amount of additional work for me (of which I already have
> plenty). So, is there a chance for such a change to be accepted?

Hi Rainer,

yes, I think there is; the only substantial concern I have is that of
copyright and licensing.

There will be proper credit to the copyright holder in the NEWS and/or
THANKS files, and if the author is distinct from the copyright owner,
this could read "(C) Copyright 2010 ACME Software Writing, written by
Corey Coder", or similar. As long as I don't need to include your
marketing brochure or slogan, but just company name and place, that's
fine.

Given this will likely be a nontrivial contribution according to our
German Urheberrechtsgesetz, and it apparently touches interests of your
employer, we need:

- to make sure that your employer either waives his copyright and
  allows you to claim it, or your employer is willing to contribute
  under a compatible license (GPL "v2 or later" preferred). I am not
  trained in law and justice sciences, but my take of German copyright
  law is that even if you're doing things in your spare time that are
  closely related to your employment, your employer can still claim the
  so-called Urheberschaft.

- to know whether coverage under the Affero GNU GPL v3 would be OK,
  because I plan to let future versions use that license unless the
  price to be paid (in ripping out features where no such licensing is
  possible) would be too high; in that case I'd try to use GNU GPL v3
  in the long run.

With respect to the concerns you've raised:

(a) we're already talking, but I've not taken your past patch because
    of technical concerns (as discussed earlier),

(b) is not something that has happened in the past few years on the
    fetchmail list, and I trust Rob MacGregor as list operator to
    politely, but also distinctly tell people not to go into such
    unfruitful discussions (or bikeshedding); and he has my full
    backing for that,

(c) is a concern if we cannot reach an agreement on the licenses. I
    feel no need to adorn myself with somebody else's plumes; continued
    maintenance for half a decade, fixing more than 200 bugs, speaks
    for itself.

I'd be grateful if we could make such DIMAP and scalable POP3 possible,
and I thank you for letting me know, because I have plans to attack
POP3/UIDL scalability myself soon -- I could devote my time in other
ways if I know your work covers that and reduces complexity to
O(n log n) worst-case.

Please let me know your decision, or if there are major delays to be
expected because your bosses haven't got time to decide the licensing
soonish.

Looking forward to your reply.

Best regards
Matthias

--
Matthias Andree
|
From: Matthias A. <mat...@gm...> - 2010-04-23 19:44:53
|
Rainer Weikusat wrote on 2010-04-21: > Hello. > > The company I work for uses fetchmail to download mails from various > POP- and IMAP-servers in order to offer an anti-spam/ anti-virus > scanning service for smartphone owners. Below is an oprofile excerpt > from the 'download server' regarding fetchmail: [whoops, I meant to store a draft when I figured it wasn't a 2 minute reply, but figured that the message was sent out rather than stored as draft, sorry for that] Hi Rainer, thanks for the analysis. You've bumped into a (known) design flaw of fetchmail's UID handling. It has been on the TODO list for ~2 years, but since it's not a functional bug, the bug fixes always got priority. Basically the whole uid.c outfit is using the wrong data structures for its purpose. The whole stuff - I guess you figured as much - is meant to figure if a certain UID was seen, i. e. is in a set or not, and read/write that set from/to disk. We don't need order, only uniqueness, but we do need quick access, and the size is dynamic. uid.c currently uses a linear list of n elements, and iterates over it roughly 1.5n times, i. e. we get O(n^2) complexity (without constant factors). Your patch, albeit it helps a bit in that it shaves save_str down to O(n), does not fix the fundamental complexity problem: the id_find() stuff and with it the whole UID setup, still remains O(n^2) the way we're using it, only reduced by a constant factor of, guessing optimistially, 2-3. I have some ideas for a proper solution: (a) blunt: instead of the list, use an array with realloc, qsort and bsearch. Ugly, but probably portable to anything that can compile fetchmail today. (b) halfway: depend on <search.h>, use tsearch(3). Not the ultimate in performance, but easy to use and down to O(n log n) complexity, which helps quite a bit already. (hcreate(3) has a fixed size and supports only one table per process, which are showstoppers.) Deemed safe for fetchmail 6.3. 
Even better options: (c) Use C++ and optionally Boost with unordered_set. Fallback options are hash_set and set. Recent 6.3 versions are already intelligible as C and C++, so it's a gradual move with one initial bump that pulls in the dependencies and switches to C++. (d) use a disk-based hashed or btree database. I don't know the optimal library though, I've been through this in bogofilter. sqlite3 pulls in SQL overhead (it's not that bad, but still) we don't need, but otherwise quite robust. Berkeley DB has decent documentation, STL API and is bullet proof -- but there are two nukes that kill it: (1) users messing with the files and removing log files, (2) distributions upgrading libdb and application without telling the user. It's also a nightmare to autodetect. TokyoCabinet might be a candidate, but its API is pretty raw on the bits and I trust its future less than all other databases. I am very inclined to go for (c) or (d) in fetchmail 6.4 and I think I'm determined to go for C++ as that allows much conciser code. I am loathe to change 6.3 uid.c at all, as basically it's beyond repair and needs to be redone from scratch - which is not adequate for a "stable" version. 6.4 should split .fetchids to per-account files (we used to have patches but never deployed them in baseline code), which greatly simplifies uid list handling. See patches at <http://home.pages.de/~mandree/fetchmail/>. > I am aware that the 'split append' (save_str + explicit adjustment of > savedend-pointer) isn't exactly beautiful but it is a reasonably > simple change with a significant positive impact for POP3 mailboxes > containing lots of ('kept on server') messages. If a constant factor of - best case guess - 3 is "significant", then yes - but it still doesn't scale and your gain is lost again if you increase the number of messages by 40 - 80 %. 
The moment I saw your patch I immediately thought of AmigaOS (Kickstart) exec.library "struct List" and "struct Node" (AROS equivalent at [1]) which would solve this problem in a beautiful way -- albeit without fixing the fundamental performance problem.

I hope you're not too disappointed that I'm loath to take your patch and that you can understand my reasoning.

Best regards
Matthias

[1] <http://aros.sourceforge.net/documentation/developers/app-dev/exec-library.php#list-manipulating-functions>

-- Matthias Andree |
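For readers who have not used it, option (b) above could look roughly like the following. This is only a minimal sketch of a "UID seen?" set on top of tsearch(3)/tfind(3), not code from fetchmail: the names uid_cmp, uid_set_add and uid_set_contains are made up for illustration, and reading/writing the set from/to .fetchids as well as freeing the tree are left out.

```c
#include <assert.h>
#include <search.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper names; none of this exists in fetchmail. */

/* Order UIDs as plain C strings. */
static int uid_cmp(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b);
}

/* Add uid to the set rooted at *root.
 * Returns 1 if it was newly added, 0 if it was already present,
 * -1 if memory allocation failed. */
static int uid_set_add(void **root, const char *uid)
{
    char *copy;
    void *node;

    assert(root != NULL && uid != NULL);

    copy = malloc(strlen(uid) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, uid);

    /* tsearch() returns a pointer to the stored key pointer: either
     * our copy (new entry) or the key of an equal, older entry. */
    node = tsearch(copy, root, uid_cmp);
    if (node == NULL) {
        free(copy);
        return -1;
    }
    if (*(char *const *)node != copy) {
        free(copy);             /* duplicate, keep the older key */
        return 0;
    }
    return 1;
}

/* Nonzero if uid is in the set; O(log n) on a balanced tree. */
static int uid_set_contains(void *const *root, const char *uid)
{
    assert(root != NULL && uid != NULL);
    return tfind(uid, root, uid_cmp) != NULL;
}
```

A caller would start with "void *root = NULL;", call uid_set_add(&root, id) for every stored UID, and then ask uid_set_contains(&root, id) for each UID the server reports. Whether the tree is balanced depends on the libc, so the O(n log n) total is an assumption about the implementation, not a guarantee of the interface.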
From: Matthias A. <mat...@gm...> - 2010-04-24 06:40:05
|
Please check out http://gitorious.org/fetchmail/fetchmail/commits/for-rainer/

You can download this as tarball, but you still need to follow the build-from-Git instructions (basically, run autoreconf -isv first).

It contains a lightweight version of your patch (see the two latest commits). The fastuidl fix is less efficient, but also less needed since the amount of UIDs added is ld N there, so we get O(n log n) complexity for parsing the UIDL responses. The UIDL fix is O(n) for the actual insertion, but remains O(n^2) for "message seen yet?" detection.

I wonder if I should kill fastuidl in fetchmail 6.4. It has quite a few quirks and is only useful on low-bandwidth low-latency links. DSL, GSM, GPRS and thereabouts are high-latency. Example: on a DSL line w/ 6 MBit/s and 40 ms round trips, the regular UIDL code is still faster for 10000 UIDs.

The tree above also contains a few cleanups (idlist functions were split out from uid.c to a new idlist.c file).

CPU is a different issue; to fix this for good, unless I'm mistaken, we need to split the message flags from the UID linked list. The former goes in an array (which needs to store only the mark byte per message), the latter into a btree (UID -> number). I may need to kill the slow UIDL emulation code (header-/Message-ID based) along the way; if so, the speedup will hardly be fit for 6.3.X, but rather for 6.4.X.

HTH
Matthias |
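The split described above (one mark byte per message in an array, plus a UID -> number index) can be illustrated as follows. This is a sketch under assumptions, not fetchmail code: a sorted array searched with bsearch(3) stands in for the btree, and struct uid_entry, uid_index_sort and uid_index_lookup are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index entry: maps a UID string to a message number. */
struct uid_entry {
    const char *uid;    /* UID as reported by the server */
    unsigned int num;   /* 1-based message number */
};

static int entry_cmp(const void *a, const void *b)
{
    const struct uid_entry *ea = a, *eb = b;
    return strcmp(ea->uid, eb->uid);
}

/* Sort the index once, after all UIDs have been collected: O(n log n). */
static void uid_index_sort(struct uid_entry *idx, size_t n)
{
    qsort(idx, n, sizeof *idx, entry_cmp);
}

/* Look up the message number for uid in a sorted index: O(log n).
 * Returns 0 if the UID is unknown. */
static unsigned int uid_index_lookup(const struct uid_entry *idx, size_t n,
                                     const char *uid)
{
    struct uid_entry key;
    const struct uid_entry *hit;

    assert(uid != NULL);
    key.uid = uid;
    key.num = 0;
    hit = bsearch(&key, idx, n, sizeof *idx, entry_cmp);
    return hit != NULL ? hit->num : 0;
}
```

The per-message flags would then live in a separate "unsigned char marks[count]" indexed by message number minus one, so marking a message never touches the UID strings. A real btree would additionally allow cheap insertion after the initial sort, which this stand-in does not.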
From: Rainer W. <rwe...@ms...> - 2010-04-25 23:29:23
|
Matthias Andree <mat...@gm...> writes:
> http://gitorious.org/fetchmail/fetchmail/commits/for-rainer/
>
> You can download this as tarball, but you still need to follow the
> build-from-Git instructions (basically, run autoreconf -isv first).
>
> It contains a lightweight version of your patch (see the two latest
> commits). The fastuidl fix is less efficient, but also less needed since
> the amount of UIDs added is ld N there, so we get O(n log n) complexity
> for parsing the UIDL responses. The UIDL fix is O(n) for the actual
> insertion, but remains O(n^2) for "message seen yet?" detection.

This looks very suspiciously like my first attempt at making a 'quick' improvement to this code. It will yield about half of the benefit of the patch I sent (for 'our' application), because this method cannot be used for the fastuidl path in pop3_is_old (as you wrote, I profiled this before making the more complicated change). It is also (sorry for being so blunt) not very well done. Because of the

    /* do it nonrecursively so the list is in the right order */
    for (end = idl; *end; end = &(*end)->next)
        continue;

in save_str_quick, the value passed to the routine should be the address of the pointer which needs to be changed when appending to the list, eg, assuming the 'savep' change:

---------
    int ok;
    unsigned int first_nr, last_nr, try_nr;
    char id [IDLEN+1];
    struct idlist *savep = NULL; /** pointer to cache save_str result, speeds up saves */

    first_nr = 0;
    last_nr = count + 1;

    [...]

    last_nr = try_nr;
    /* save it */
    savep = save_str(savep ? &savep : &ctl->oldsaved, id, UID_UNSEEN);
    savep->val.status.num = try_nr;
---------

a better implementation would be

--------------------------
    int ok;
    unsigned int first_nr, last_nr, try_nr;
    char id [IDLEN+1];
    struct idlist **svp, *sv;

    first_nr = 0;
    last_nr = count + 1;
    svp = &ctl->oldsaved;

    [...]

    last_nr = try_nr;
    /* save it */
    sv = save_str(svp, id, UID_UNSEEN);
    sv->val.status.num = try_nr;
    svp = &sv->next;
---------------------------

which has the nice bonus property that it doesn't have the conditional which goes one way during the first iteration and the other way during all that follow inside the loop (the idea isn't mine, I originally learnt about it because of some years-old USENET posting of a guy whose name I've unfortunately forgotten).

I am also more concerned about use of bandwidth than about speed. Presently, I am dealing with 76 POP3 accounts and about a meg of stored UIDs I'd need to download every five minutes in order to determine that presently, nothing needs to be downloaded (this refers to fastuidl in general, assuming I understood the principle correctly without really analyzing the code).

So, thank you very much for your effort, but I'll stick with the version in the private fork I am anyway maintaining because of the additional features (like 'object-oriented/ vtabled sinks') whose usefulness would be very limited outside this particular application and/or whose implementation is rather 'commercial' (eg #ifdefing away the concurrency control code) than aesthetically/ technically pleasing. The savedend-change was just something I considered to be more generally useful, hence I 'backported' it to 6.3.16 from my HEAD and sent it to the list.

BTW, I had a look at the UIDL patch and whoever wrote that should probably spend some time reading RFC4549 (Synchronization Operations for Disconnected IMAP4 Clients) which is what I am going to add to the imap-code because using the \Seen flag as 'message is old' indicator [reportedly] interferes with the gmail web interface.

NB: Nothing of this text is meant as an insult even if it may sound like one.
I am a computer person and not a person person and my knowledge regarding 'how to deal with other human beings' is still very limited, not the least because 'other human beings' are usually drunk, male persons desiring to hit me because I have (again) done something wrongly I neither understood nor recognized. |
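For reference, the tail-pointer append from the message above can be shown in isolation. The following is a simplified sketch, not fetchmail's code: struct node stands in for struct idlist, append_at for save_str, and build_list is a hypothetical driver. Unlike save_str, append_at assumes it is always handed the address of the terminating NULL pointer and therefore never walks the list.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct idlist (hypothetical). */
struct node {
    char *id;
    struct node *next;
};

/* Store a new node in *end, which must be the list's terminating NULL
 * pointer. Returns the new node, or NULL if allocation failed. */
static struct node *append_at(struct node **end, const char *id)
{
    struct node *n;

    assert(end != NULL && *end == NULL);

    n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->id = malloc(strlen(id) + 1);
    if (n->id == NULL) {
        free(n);
        return NULL;
    }
    strcpy(n->id, id);
    n->next = NULL;
    *end = n;
    return n;
}

/* Build a list in input order with O(1) work per append. 'tail' always
 * holds the address of the pointer to change next: first &head, later
 * &last->next, so the loop needs no first-iteration special case. */
static struct node *build_list(const char *const *ids, size_t count)
{
    struct node *head = NULL;
    struct node **tail = &head;
    size_t i;

    for (i = 0; i < count; i++) {
        struct node *n = append_at(tail, ids[i]);
        if (n == NULL)
            break;              /* out of memory: return what we have */
        tail = &n->next;
    }
    return head;
}
```

Passing &head on the first iteration and &n->next afterwards is what removes the "savep ? &savep : ..." conditional from the loop body.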
From: Matthias A. <mat...@gm...> - 2010-04-26 16:18:58
|
Rainer Weikusat wrote on 2010-04-26: > This looks very suspiciously like my first attempt at making a 'quick' > improvement to this code. It will yield about half of the benefit of > the patch I sent (for 'our' application), because this method cannot > be used for the fastuidl path in pop3_is_old (as you wrote, I profiled > this before making the more complicated change). It is also (sorry for > being so blunt) not very well done. Because of the > > /* do it nonrecursively so the list is in the right order */ > for (end = idl; *end; end = &(*end)->next) > continue; > > in save_str_quick, the value passed to the routine should be the > address of the pointer which needs to be changed when appending to the > list, eg, assuming the 'savep' change: Yeah, we can save that last single iteration from end to &end->next and go all the way, and we can use a proper initializer. Would make only a minor difference though as long as we're still doing linear searches. > a better implementation would be > > svp = &ctl->oldsaved; > > [...] > > last_nr = try_nr; > /* save it */ > sv = save_str(svp, id, UID_UNSEEN); > sv->val.status.num = try_nr; > svp = &sv->next; > --------------------------- Thanks. > which has the nice bonus property that it doesn't have the conditional > which goes one way during the first iteration and the other way during > all that follow inside the loop (the idea isn't mine, I originally > learnt about it because of some years-old USENET posting of a guy > whose name I've unfortunately forgotten). I am also rather concerned > about use of bandwidth than speed. Presently, I am dealing with 76 > POP3 accounts and about a meg of stored UIDs I'd need to download > every five minutes in order to determine that presently, nothing needs > to be downloaded (this refers to fastuidl in general, assuming I > understood the principle correctly without really analyzing the code). 
Makes sense, although I'm wondering if it really makes that much of a difference for fastuidl. fastuidl appears to be opportunistically harvesting message numbers, I wonder if that's of any use. I think Sunil wrote that fastUIDL code, I'm Cc:ing him.

> So, thank you very much for your effort, but I'll stick with the
> version in the private fork I am anyway maintaining because of the
> additional features (like 'object-oriented/ vtabled sinks') whose
> usefulness would be very limited outside this particular
> application and/or whose implementation is rather 'commercial' (eg
> #ifdefing away the concurrency control code) than aesthetically/
> technically pleasing.

Not sure I get your point. I've always wanted to abstract the sink code the way the fetch protocols are abstracted, and that was also one thing I'd tried to squeeze from the 2008 Google Summer of Code that was supposed to provide MAPI (but apparently isn't fit for integration, and apparently stalled).

> The savedend-change was just something I
> considered to be more generally useful, hence I 'backported' it to
> 6.3.16 from my HEAD and sent it to the list.

Much appreciated.

> BTW, I had a look at the UIDL patch and whoever wrote that should
> probably spend some time reading RFC4549 (Synchronization Operations
> for Disconnected IMAP4 Clients) which is what I am going to add to the
> imap-code because using the \Seen flag as 'message is old' indicator
> [reportedly] interferes with the gmail web interface.

I'd appreciate it if such code could be made public. The server-side \Seen tracking is something I have wanted fetchmail to get rid of for a long time.

The long-winded way with "smtp_someop() { if (I'm surprisingly using mda) mda_someop() else if (using bsmtp) else if (using lmtp) }" is garbage, verbose, inefficient -- and it appears from your description you've fixed just that already.

> NB: Nothing of this text is meant as an insult even if it may sound
> like one.

No offense taken.
I can usually tell the difference between objective criticism of my work, subjective criticism of it, and ad-hominem attacks. :) -- Matthias Andree |
From: Sunil S. <sh...@bo...> - 2010-04-30 10:58:19
|
Hi,

> >which has the nice bonus property that it doesn't have the conditional
> >which goes one way during the first iteration and the other way during
> >all that follow inside the loop (the idea isn't mine, I originally
> >learnt about it because of some years-old USENET posting of a guy
> >whose name I've unfortunately forgotten). I am also rather concerned
> >about use of bandwidth than speed. Presently, I am dealing with 76
> >POP3 accounts and about a meg of stored UIDs I'd need to download
> >every five minutes in order to determine that presently, nothing needs
> >to be downloaded (this refers to fastuidl in general, assuming I
> >understood the principle correctly without really analyzing the code).
>
> Makes sense, although I'm wondering if it really makes that much of
> a difference for fastuidl.
> fastuidl appears to be opportunistically harvesting message numbers,
> I wonder if that's of any use.
> I think Sunil wrote that fastUIDL code, I'm Cc:ing him.

The whole idea of the fastuidl code (and the unrelated fetchsizelimit code) is to get the first mail fast. Previously, fetchmail would get all the UIDs and sizes right at the start of the transaction.

POP3> STAT
POP3< +OK 10000 ...
POP3> UIDL (gets 10000 uids) (10000 is the first unseen)
POP3> LIST (gets 10000 sizes)

With fastuidl and the delayed size information, the fetching of the first mail is faster.

POP3> STAT
POP3< +OK 10000 ...
POP3> UIDL 5000
POP3> UIDL 7500
...
POP3> UIDL 10000 (10000 is the first unseen)
POP3> LIST 10000

These changes were done after observing that fetchmail had trouble downloading even one mail over a slow connection, was effectively going into an infinite loop, and was causing the associated wastage in bandwidth.

Matthias, you have said that the regular uidl code is faster than the fastuidl code for 10000 uids. Do you mean 10000 uids in the .fetchids file or 10000 uids (i.e. 10000 mails) on the server?
The fastuidl code is expected to be faster if there are very few new mails on the server and fetchmail is running with the keep option on. As in the above case, there are 10000 mails, but only one new mail. Please also compare how fast the *first* new mail is delivered with and without fastuidl. Of course, to stop the use of fastuidl, you may add 'fastuidl 0' to the fetchmailrc file. -- Sunil Shetye. |
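The probing pattern in the transcript above (UIDL 5000, UIDL 7500, ...) is a bisection over message numbers. The sketch below shows that idea only and is not the fetchmail implementation: first_unseen and seen_below are hypothetical names, the callback stands in for one "UIDL n" round trip plus a lookup in the saved UIDs, and the code assumes that all already-seen messages come before all unseen ones on the server.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: nonzero if message number num is already known.
 * In a real client this would cost one "UIDL num" round trip. */
typedef int (*seen_fn)(unsigned int num, void *ctx);

/* Return the number of the first unseen message in 1..count, or
 * count + 1 if every message has been seen. Uses about log2(count)
 * probes instead of fetching all count UIDs. If probes is not NULL,
 * the number of probes made is added to *probes. */
static unsigned int first_unseen(unsigned int count, seen_fn is_seen,
                                 void *ctx, unsigned int *probes)
{
    unsigned int lo = 1, hi = count + 1;    /* answer lies in [lo, hi] */

    assert(is_seen != NULL);
    while (lo < hi) {
        unsigned int mid = lo + (hi - lo) / 2;  /* 1 <= mid <= count */

        if (probes != NULL)
            ++*probes;
        if (is_seen(mid, ctx))
            lo = mid + 1;       /* first unseen is after mid */
        else
            hi = mid;           /* mid is unseen, maybe an earlier one too */
    }
    return lo;
}

/* Example predicate for testing: messages below *ctx count as seen. */
static int seen_below(unsigned int num, void *ctx)
{
    return num < *(const unsigned int *)ctx;
}
```

For 10000 messages this needs at most 14 probes, which is where the bandwidth saving comes from; the price is one network round trip per probe, which is the latency tradeoff discussed in this thread.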
From: Sunil S. <sh...@bo...> - 2010-05-03 10:13:11
|
[ Correction in previous mail ] Quoting from Sunil Shetye's mail on Fri, Apr 30, 2010: > Matthias, you have said that the regular uidl code is faster than the > fastuidl code for 10000 uids. Do you mean 10000 uids in the .fetchids > file or 10000 uids (i.e. 10000 mails) on the server? [ This should be read as ] Matthias, you have said that the regular uidl code is faster than the fastuidl code for 10000 uids. Do you mean while processing 10000 uids on the fetchmail side or while getting 10000 uids from the server? The fastuidl code downloads less UIDs from the server and hence has to compare less UIDs to check for new mail. If there are very few new mails, then the fastuidl code is expected to save a lot on the string comparisons. -- Sunil Shetye. |
From: Rainer W. <rwe...@ms...> - 2010-05-03 12:08:41
|
Sunil Shetye <sh...@bo...> writes:
> Quoting from Sunil Shetye's mail on Fri, Apr 30, 2010:
>> Matthias, you have said that the regular uidl code is faster than the
>> fastuidl code for 10000 uids. Do you mean 10000 uids in the .fetchids
>> file or 10000 uids (i.e. 10000 mails) on the server?
>
> [ This should be read as ]
>
> Matthias, you have said that the regular uidl code is faster than the
> fastuidl code for 10000 uids. Do you mean while processing 10000 uids
> on the fetchmail side or while getting 10000 uids from the server?
>
> The fastuidl code downloads less UIDs from the server and hence has to
> compare less UIDs to check for new mail. If there are very few new
> mails, then the fastuidl code is expected to save a lot on the string
> comparisons.

It is conceivable that the string comparisons take less wallclock time than repeatedly asking a server for a UID and waiting for the replies does. IMHO, this is still a bad tradeoff because 'e-mail download' is a background batch job and even on a single-user workstation, minimizing e-mail download time at the expense of directing more resources away from the interactive jobs the user interacts with doesn't really make sense except if one is using an uplink whose cost is proportional to the time spent 'online'. |
From: Sunil S. <sh...@bo...> - 2010-05-03 13:59:48
|
Quoting from Rainer Weikusat's mail on Mon, May 03, 2010: > > Matthias, you have said that the regular uidl code is faster than the > > fastuidl code for 10000 uids. Do you mean while processing 10000 uids > > on the fetchmail side or while getting 10000 uids from the server? > > > > The fastuidl code downloads less UIDs from the server and hence has to > > compare less UIDs to check for new mail. If there are very few new > > mails, then the fastuidl code is expected to save a lot on the string > > comparisons. > > It is conceivable that the string comparisons take less wallclock > time than repeatedly asking a server for an UID and waiting for the > replies does. IMHO, this is still a bad tradeoff because 'e-mail > download' is a background batch job and even on a single-user > workstation, minizing e-mail download time at the expense of directing > more resources away from the interactive jobs the user interacts with > doesn't really make sense except if one is using a uplink whose cost > is proportional to the time spent 'online'. As Matthias has already said, the better solution is to use the appropriate data structures like trees or hashes or vectors in order to cut down both searches and modifications. Once done, regular uidl and fastuidl will not eat up resources. Note that fastuidl is not the source of this problem. -- Sunil Shetye. |
From: Matthias A. <mat...@gm...> - 2010-05-04 01:48:02
|
Am 03.05.2010 14:33, schrieb Rainer Weikusat: > Sunil Shetye <sh...@bo...> writes: >> Quoting from Rainer Weikusat's mail on Mon, May 03, 2010: >>>> Matthias, you have said that the regular uidl code is faster than the >>>> fastuidl code for 10000 uids. Do you mean while processing 10000 uids >>>> on the fetchmail side or while getting 10000 uids from the server? >>>> >>>> The fastuidl code downloads less UIDs from the server and hence has to >>>> compare less UIDs to check for new mail. If there are very few new >>>> mails, then the fastuidl code is expected to save a lot on the string >>>> comparisons. >>> >>> It is conceivable that the string comparisons take less wallclock >>> time than repeatedly asking a server for an UID and waiting for the >>> replies does. IMHO, this is still a bad tradeoff because 'e-mail >>> download' is a background batch job and even on a single-user >>> workstation, minizing e-mail download time at the expense of directing >>> more resources away from the interactive jobs the user interacts with >>> doesn't really make sense except if one is using a uplink whose cost >>> is proportional to the time spent 'online'. >> >> As Matthias has already said, the better solution is to use the >> appropriate data structures like trees or hashes or vectors in order >> to cut down both searches and modifications. > > Sorry, but you are seriously lost in space here. I have no idea which > thought could have motivated you to reply with this to my attempt at > an explanation of the 'faster with 10000 UIDs' statement. 
Guys, I was making that confusing statement, and it referred only to the network and wallclock consideration, assumed networks with big bandwidth-delay product, and it also pretended that someone had already fixed the complexity of the UID handling (which hasn't happened to the best of my knowledge; I have started hacking a bit on it, but not sure if I'll make it, or drop it and leave it for a later release using C++, if it turns out it's too much of a hassle to do in C). fastuidl will reduce the network traffic considerably, and will also save time on low-latency links. Whether we do fastuidl or not, for any nontrivial amounts of messages kept on the server for POP3, we need to fix the code to get rid of the O(n^2) complexity. |
From: Rainer W. <rwe...@ms...> - 2010-05-07 16:19:11
|
Matthias Andree <mat...@gm...> writes:

[...]

> it also pretended that someone had already fixed the complexity of
> the UID handling (which hasn't happened to the best of my knowledge;
> I have started hacking a bit on it, but not sure if I'll make it, or
> drop it and leave it for a later release using C++, if it turns out
> it's too much of a hassle to do in C).

I am done with my last 'firebrigade' assignment, which means that I will now continue to work on the UID support for imap (which is presently at the state that it uses UIDNEXT to detect new mails while still using \Seen for determining their message sequence numbers and to distinguish between old and new mails). The UID code isn't really suitable for public consumption and won't become this without a major additional effort (and given that my regular worktime is presently around 53 hours/week, I won't have the time to do that).

I will also hopefully be able to get rid of the linked-list representation for the POP3 UIDL database by the end of June at the latest, possibly earlier, and I would be willing to base this on a 'public' fetchmail-tree and share it if there was any interest in that (I am also supposed to work on a completely different project which is supposed to be demoed in June, but I think that I will be able to interleave both). |
From: Rainer W. <rwe...@ms...> - 2010-05-13 23:24:41
|
Rainer Weikusat <rwe...@ms...> writes:
> Matthias Andree <mat...@gm...> writes:

[...]

>> - to make sure that your employer either waives his copyright and
>> allows you to claim it, or your employer is willing to contribute
>> under a compatible license (GPL "v2 or later" preferred).

[...]

> the usual answer whenever I asked for something like this in the
> past was 'sure'.

[...]

> I could probably devote at least an hour of (unpaid) overtime to this
> per day, meaning, a usable implementation should be available at the
> end of this week or the beginning of next week.

Just as a quick update on that: I've meanwhile talked to my boss about this and I am free to decide on a suitable license for publication according to whatever makes the most sense and I am allowed to spend some amount of 'recreational programming time' on this per day. |
From: Matthias A. <mat...@gm...> - 2010-05-14 09:49:40
|
Am 13.05.2010, 23:24 Uhr, schrieb Rainer Weikusat: > Rainer Weikusat <rwe...@ms...> writes: >> Matthias Andree <mat...@gm...> writes: > > [...] > >>> - to make sure that your employer either waives his copyright and >>> allows you to claim it, or your employer is willing to contribute >>> under a compatible license (GPL "v2 or later" preferred). > > [...] > >> the usual answer whenever I asked for something like this in the >> past was 'sure'. > > [...] > >> I could probably devote at least an hour of (unpaid) overtime to this >> per day, meaning, a usable implementation should be available at the >> end of this week or the beginning of next week. > > Just as a quick update on that: I've meanwhile talked to my boss about > this and I am free to decide on a suitable license for publishment > according to whatever makes the most sense and I am allowed to spend > some amount of 'recreational programming time' on this per day. Excellent. If you want your own public Git repository for that, it's nearly there: you can clone the project through Gitorious's web interface, check out from your branch ("git clone"), edit, commit there, and when done, ask me to merge, either as a series of commits, or squashed to one single commit. Given all of the circumstances, I'd say MSS GmbH also deserves mention in the changelogs; I'd propose that you edit a NEWS file entry accordingly and propose a wording you feel adequate and that is in line with your regular PR policies. -- Matthias Andree |