From: Frederic M. <fre...@wo...> - 2006-05-18 15:50:57
Sunil Shetye wrote:
> Quoting from Frederic Marchal's mail on Thu, May 18, 2006:
>
>> Using pop3, the size of the mail is one additional information that can
>> be used to ensure it is not the same message with the same UID. The list
>> command returns it and fetchmail does use it during the communication. I
>> expect it would be much less likely that a UID is recycled on a message
>> with the exact same size.
>>
>> If a message is using the same UID as a message we know should have been
>> deleted, then list its size and compare it. If it is the same, it can be
>> deleted again. If not, it is a new message and should be downloaded.
>>
>> Does it make sense ?
>>
>
> There are a few objections to this:
>
> - fetchmail currently does not store the size in the fetchids file, so
>   doing this will require a modification of the fetchids file format.
>

For which a change was proposed, if I remember correctly... It was even suggested to use a database such as sqlite. I admit I haven't followed the status of that proposal.

Besides, there is no hurry. It doesn't have to be a patch within two days. It may be part of a major version change just to make fetchmail even better :-)

> - fetchmail does not get the size of all mails right now (in the
>   default setup), only of the mails it is going to download. Your
>   suggestion would imply that fetchmail will have to get UIDs as well
>   as sizes of all mails for comparison. This would imply a lot of
>   delay in getting mails, especially when 'keep' is on.
>

Only for the mails with a known UID for which fetchmail sent a delete command in a previous poll. That should not occur too often, because most of the time the mails will actually be deleted on the server when fetchmail closes the connection. The size would only serve to find out whether the mail is the same one, which should be deleted again, or a different one, which should be downloaded. For an unknown UID or the UID of a mail that was left on the server on purpose, the processing would remain the same.

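To make the idea concrete, here is a minimal Python sketch of the per-message decision. It is only an illustration under my own assumptions: fetchmail is written in C and, as noted above, does not store sizes in the fetchids file today. The names `pending_deletes`, `kept_uids` and `get_size` are made up for this example and are not fetchmail internals.

```python
from enum import Enum


class Action(Enum):
    DOWNLOAD = "download"
    DELETE_AGAIN = "delete again"
    SKIP = "skip"


def decide(uid, pending_deletes, kept_uids, get_size):
    """Decide what to do with one UID listed by the server.

    pending_deletes: hypothetical dict, UID -> size recorded when a delete
        command was sent for that mail in a previous poll.
    kept_uids: hypothetical set of UIDs already fetched and left on the
        server on purpose (e.g. with 'keep').
    get_size: hypothetical callable asking the server for the size of one
        mail; it is only called for UIDs found in pending_deletes.
    """
    if uid in pending_deletes:
        # The mail should have been deleted already: compare sizes.
        if get_size(uid) == pending_deletes[uid]:
            return Action.DELETE_AGAIN  # same mail, the delete was lost
        return Action.DOWNLOAD  # recycled UID, this is a new mail
    if uid in kept_uids:
        return Action.SKIP  # unchanged processing for kept mails
    return Action.DOWNLOAD  # unchanged processing for unknown UIDs
```

Passing the size lookup as a callable is a deliberate choice in this sketch: the extra size request is made only for UIDs with a pending delete, so the other mails cost nothing more than they do now.
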
The fetchids file may even be purged of the old UID when the next poll confirms the mails are really gone. A recycled UID coming afterwards would then be seen as a new mail without any further comparison.

> - Some mailservers keep the flags of a mail in the mail itself by
>   adding a header like Status:. So, the size of a mail may actually
>   change when it turns from 'new' to 'old'. Due to size mismatch,
>   mails from such mailservers will get downloaded again.
>

You got me there :-)

In that case, the mail would be downloaded a second time and deleted. If the deletion fails again, the mail may be downloaded once more if its size keeps changing, or it may eventually be deleted without any further download if the size no longer changes.

If the download is the action that causes the connection to drop (such as a proxy with a virus scanner taking too much time and exceeding the timeout of the server), the mail will be downloaded twice and deleted the third time. And this happens only if the connection is dropped after the delete command is sent AND the size of the mail is changed by the server on the next poll.

The worst case occurs if the size of the mail keeps changing at every poll. I have no solution here. It's no fun at all to have to deal with broken servers :-(

> - Some mailservers misreport sizes! I remember one such server being
>   reported:
>
>   <http://lists.ccil.org/pipermail/fetchmail-friends/2002-January/005572.html>
>

It depends on the problem with the size. If the reported size of a given mail is always the same, even if it is wrong, there is no problem. Now, if the reported size varies randomly, there is no way it can work. In that case, when the mails fail to be deleted, they are downloaded again over and over until the delete commands succeed.

On the other hand, for all those who have a good server reporting the correct sizes, there will be an improvement and the mails will not be duplicated when the connection drops.

Now, you may argue that it is a lot of programming for the sole benefit of not deleting unread mails when the connection drops... I have nothing to say against that. I'm not likely to be the one to program it, although I really wish I could.

Note that this is only a solution I propose to solve the problem of the mails being either duplicated, deleted without delivery or left forever on the server when a connection drops. I'm happy with my current configuration, which works fine with pop3 + uidl, a server that doesn't recycle the UID too quickly and a reasonably reliable connection.

Frederic