Re: [courier-users] imapd locking against itself? (long post, sorry)
From: Sam V. <mr...@co...> - 2011-10-22 14:55:58
Need Coffee writes:
> In running the testsuite I found some failures; rfc2045/reformime segfaults
> because in /usr/share/locale, "en_US.utf-8" doesn't exist, while
> "en_US.UTF-8" (caps) does. OS bug?
Only in the sense that, if this has changed, there are no arrangements
for backwards compatibility. I also looked through the glibc sources: glibc
accepts a locale codeset specified in either uppercase or lowercase, so the
case doesn't matter to glibc. Either way, reformime shouldn't segfault.
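The unicode.c change in the patch further down guards against setlocale()
returning NULL, which is presumably what happens when the requested locale
cannot be loaded. A minimal sketch of that guard in isolation, assuming a
hypothetical helper named set_and_copy_locale() (it is not part of Courier):

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, for illustration only: selects the locale `name`
 * and returns a heap-allocated copy of the resulting locale string.
 * Returns NULL if the locale could not be set or memory ran out.
 * The caller frees the result. */
char *set_and_copy_locale(const char *name)
{
    const char *current = setlocale(LC_ALL, name);

    /* setlocale() returns NULL when the request cannot be honored.
     * Passing that NULL straight to strdup() is undefined behavior
     * and typically crashes, so check it first. */
    return current ? strdup(current) : NULL;
}
```

Callers then have to cope with a NULL copy, for example by skipping the
later step that restores the saved locale.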
> If I cd into the imap subdirectory and run 'gmake testsuite-imap' on
> the NFS mounted area, it proceeds fine until test T014:
>
> 001313 T013 OK LOGOUT completed
> 001314 * PREAUTH Ready.
> 001315 * BYE [ALERT] Fatal error: Invalid argument
> 001316 * PREAUTH Ready.
> 001317 * BYE [ALERT] Fatal error: Invalid argument
> 001318 * PREAUTH Ready.
> 001319 * BYE [ALERT] Fatal error: Invalid argument
> 001320 * PREAUTH Ready.
> 001321 * BYE [ALERT] Fatal error: Invalid argument
> 001322 * PREAUTH Ready.
> 001323 * BYE [ALERT] Fatal error: Invalid argument
>
> Is this in any way related?
Yes. Something in the environment is causing a fatal error. It's not really
possible to determine what it is, without debugging it further.
But I'm guessing that this also happens when the imap server has already
created a lock file, then gets killed, and on a busy server the process
ID gets recycled before some other process removes the stale lock. The
logic that checks for stale locks won't remove the lock file if the process
ID recorded in it still exists, and here it does exist: it's the checking
process's own, recycled process ID.
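A rough sketch of that check with the recycled-PID case handled, assuming a
hypothetical helper named lock_is_stale() (this is not the actual liblock
code, only an illustration of the idea):

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only: returns 1 if a lock file
 * that records `lock_pid` as its owner can be treated as stale, else 0. */
int lock_is_stale(pid_t lock_pid)
{
    /* We are still trying to acquire the lock, so we cannot be its
     * owner. A lock carrying our own PID was presumably left behind
     * by a dead process whose PID has since been recycled. */
    if (lock_pid == getpid())
        return 1;

    /* Signal 0 delivers nothing; it only tests whether the PID exists.
     * ESRCH means no such process, so the lock has no living owner. */
    if (kill(lock_pid, 0) < 0 && errno == ESRCH)
        return 1;

    /* The owner exists (or we lack permission to signal it): keep it. */
    return 0;
}
```

This only covers the case where the recycled PID lands on the process doing
the check; a PID recycled by some unrelated process still looks like a live
owner.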
That's easy enough to fix, but you still have a problem that's aborting the
IMAP processes in the first place.
This patch should fix both the imap process getting stuck because of a lock
file left over from an aborted process with the same pid, and the reformime
segfault. But the root cause of the aborted process is still unresolved and
has yet to be determined.
Index: unicode/unicode.c
===================================================================
--- unicode/unicode.c (revision 146)
+++ unicode/unicode.c (working copy)
@@ -46,7 +46,7 @@
#if HAVE_LOCALE_H
#if HAVE_SETLOCALE
old_locale=setlocale(LC_ALL, "");
- locale_cpy=strdup(old_locale);
+ locale_cpy=old_locale ? strdup(old_locale):NULL;
#if USE_LIBCHARSET
chset = locale_charset();
#elif HAVE_LANGINFO_CODESET
Index: liblock/mail.c
===================================================================
--- liblock/mail.c (revision 146)
+++ liblock/mail.c (working copy)
@@ -315,7 +315,8 @@
if (readid(idbuf, fd) == 0 && (p=getpidid(idbuf, myidbuf)))
{
- if (kill(p, 0) < 0 && errno == ESRCH)
+ if (p == getpid() /* Possibly recycled PID */
+ || kill(p, 0) < 0 && errno == ESRCH)
{
close(fd);
if (unlink(dotlock) == 0)