Busy loop and exceptions when calendar server inaccessible
When trying to synchronize a calendar while the server is not accessible,
exceptions are thrown all the time, causing very high CPU load.
This happens both when the network is down (causing UnknownHostException)
and when the wrong server is accessed (causing NullPointerException, log attached).
The latter happens when my Laptop using OpenDNS is connected to the Internet,
while the calendar server of my company is accessible from the company intranet only,
and OpenDNS redirects the domain of the server to a default error domain.
Severity: Highest, since this makes DavMail unusable, as it blocks my machine's
desktop server Xorg (running Gnome 2 on Ubuntu 12.03).
davmail-NullPointerException.log
Highest priority
I also noticed this. Unfortunately this is a Lightning regression: the 503 Service Unavailable returned by DavMail is correct, but Lightning retries indefinitely.
Has this severe misbehavior been reported as a Lightning bug?
In any case, to alleviate the situation, DavMail could do at least one of the following:
1. Not report errors of the given kind via the GUI. In this way, the desktop is not blocked
and the unacceptable CPU load is confined to Thunderbird/Lightning and DavMail.
2. Delay the 503 error response in the given case(s) by sleeping for, e.g., 1000 ms.
- You can disable graphical notifications in DavMail settings
- Please go ahead and submit the issue on Lightning tracker, and let us know the answer
> You can disable graphical notifications in DavMail settings
Doing this, the desktop is more responsive, but still I get 50% needless CPU load until killing DavMail :-(
> Please go ahead and submit the issue on Lightning tracker, and let us know the answer
I added a corresponding comment to an existing bug report apparently caused by the same bug: https://bugzilla.mozilla.org/show_bug.cgi?id=746962#c3
Given the typically very slow responsiveness of the Thunderbird developers until a fix is available, the second workaround at the DavMail side that I suggested in my last comment (adding a delay before returning error 503) would still be highly appreciated.
Fixed in commit 2091: workaround for Lightning bug, return 403 instead of 503 on server unavailable
Thanks for your changes providing a workaround for this Thunderbird Lightning bug.
Yet with the changes you made, unavailable calendars are not properly flagged anymore by Lightning,
that is, there is no longer any indication (the yellow warning symbol) when a calendar is not available.
I still believe that (instead of changing the error code returned) it is best to add a delay of, e.g., 1 second.
I just tried it out and it appears to work fine, without the bad side effect just described.
Since I cannot add an attachment here (why?), I paste in this comment the respective patch:
Index: src/java/davmail/caldav/CaldavConnection.java
--- src/java/davmail/caldav/CaldavConnection.java (revision 2116)
+++ src/java/davmail/caldav/CaldavConnection.java (working copy)
@@ -172,7 +172,8 @@
         } catch (DavMailAuthenticationException e) {
             if (Settings.getBooleanProperty("davmail.enableKerberos")) {
                 // authentication failed in Kerberos mode => not available
-                sendErr(HttpStatus.SC_FORBIDDEN, "Kerberos authentication failed");
+                try { Thread.sleep(1000); } catch (InterruptedException ie) {} // workaround for Lightning bug: sleep for 1 second
+                sendErr(HttpStatus.SC_SERVICE_UNAVAILABLE, "Kerberos authentication failed");
             } else {
                 sendUnauthorized();
             }
@@ -1121,7 +1122,8 @@
         } else if (e instanceof HttpPreconditionFailedException) {
             sendErr(HttpStatus.SC_PRECONDITION_FAILED, message);
         } else {
-            sendErr(HttpStatus.SC_FORBIDDEN, message);
+            try { Thread.sleep(1000); } catch (InterruptedException ie) {} // workaround for Lightning bug: sleep for 1 second
+            sendErr(HttpStatus.SC_SERVICE_UNAVAILABLE, message);
         }
     }
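A side note on the patch above: the empty catch block silently discards a thread interruption. A minimal sketch of the same delay that restores the interrupt flag instead could look as follows. The class name ErrorResponseDelay and the method pauseBeforeError are hypothetical names chosen for illustration only; they are not part of DavMail, and this is not necessarily what was committed.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of DavMail: pauses before an error response
// is sent so that a client retrying in a tight loop is slowed down.
final class ErrorResponseDelay {

    private ErrorResponseDelay() {
        // static utility class, no instances
    }

    /**
     * Sleeps for the given number of milliseconds.
     *
     * @return true if the full delay elapsed, false if the thread was
     *         interrupted (the interrupt flag is restored in that case)
     */
    static boolean pauseBeforeError(long delayMillis) {
        try {
            TimeUnit.MILLISECONDS.sleep(delayMillis);
            return true;
        } catch (InterruptedException ie) {
            // Do not swallow the interruption: restore the flag so that
            // callers further up the stack can still react to it.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With such a helper, each one-line try/catch in the patch would become a call like ErrorResponseDelay.pauseBeforeError(1000) placed immediately before the sendErr(HttpStatus.SC_SERVICE_UNAVAILABLE, ...) call. The 1000 ms value is the one proposed in this ticket.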
At least it doesn't loop... the root cause is on the Lightning side, no reason to implement a complex workaround in DavMail.
=> will need to revert the 403 response as soon as the Lightning bug is fixed
Yes, but using a wrong error response is definitely misleading, and adding a delay is not a complex workaround.
And I do not think that it is a performance problem in this specific case.
Let us see how many years it will take until the Lightning bug is fixed. So far, I did not even get any response on it.
For all these reasons, I will stick to the patch I proposed.
Well, let's just try to help:
https://bugzilla.mozilla.org/show_bug.cgi?id=874654
=> create a new bug report with attached patch
Could you please try this patch on your Thunderbird instance?
Note that I reverted the 403 patch (back to 503) and merged your wait patch to cope with unpatched Lightning
Thanks Mickaël! I can confirm that the current workaround is effective,
and that when removing the workaround and using the fix you provided
for Lightning bug 874654, the problem reported here is solved :-)