I've been tearing my hair out for weeks trying to figure out why one of my Macs kept failing to Time Machine back up to my Gentoo box running netatalk. I could manually mount shares, but when TM decided to do an automatic backup, there would be an authentication failure. This was the failure message I would get in /var/log/messages:
Jan 16 12:49:42 compute0 afpd: pam_unix(netatalk:auth): authentication failure; logname= uid=0 euid=0 tty=afpd ruser= rhost=laura-millers-imac user=millerti
I exhausted every reasonable fix on the Mac side, to no avail, when I decided to enable debug logging (-setuplog "AFPDaemon LOG_DEBUG") on the server to see what was going on. Then the problem went away. I left debugging on, waiting for the problem to manifest again, but it never did. So I turned off debug logging in netatalk, and the authentication problem immediately returned.
Sounds like a Catch 22. It appears that there's a bug in netatalk that is causing authentication to fail. The regular auth-fail message isn't informative, so I turn on more detailed logging, which makes the problem disappear. I'm sure you can imagine my frustration.
Notes:
Leaving debug logging on is not an acceptable work-around, because it floods my /var/log/messages with messages, rapidly using up disk space and drowning out other messages.
This is a link to the bug report I filed for Gentoo:
http://bugs.gentoo.org/show_bug.cgi?id=351863
I'd dissect a network trace of the AFP traffic.
I wouldn't even know where to begin with dissecting network traffic.
However, this kind of issue we're seeing is common with uninitialized auto variables. Somewhere, there's a debug [f]printf that's changing an uninitialized variable, which is altering afpd's behavior. E.g. debug message followed by call to a function with the bug. When you print the message, the stack contains values different from without the message, altering the initial value of the local variable in the second function call. Isn't it possible to get the compiler to warn about this?
http://www.wireshark.org/
http://developer.apple.com/library/mac/#documentation/Networking/Reference/AFP_Reference/Reference/reference.html
http://developer.apple.com/library/mac/#documentation/Networking/Conceptual/AFP/Introduction/Introduction.html
I'd almost swear that we have zero LOG statements that have a assignment as arg to the printf statement.
Of course, you can use CFLAGS="-O1 -Wuninitialized" with gcc. I've just done that, without any findings relevant here.
--
The code is free, so if it breaks both halves are yours
I just wanted to note that these mysterious authentication failures have spread to other systems. There may be something special about my iMac that makes this more likely, but just today, I had another intermittent auth failure for my MBP.
These auth failures are random and aren't specific to a config problem on one Mac, which suggests to me that the Macs are sending the right data, but afpd is failing auth anyway. Of course, I could be wrong.
Your links regarding wireshark and such aren't very useful to me, I'm afraid. I don't have time or energy to learn about that. If you have specific instructions, I'd be happy to follow them, but family, work, and school leave me no free time for the week or so long distraction this would entail.
If we can't fix this, I'll just switch back to samba until it resolves itself. Unfortunately, that comes with its own reliability problems, but at least it never failed auth.
Thanks.
x86_64-pc-linux-gnu-gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I../.. -I../../sys -I../../include "-D_U_=attribute((unused))" -O1 -ggdb -Wuninitialized -Werror=uninitialized -I../../sys -MT socket.lo -MD -MP -MF .deps/socket.Tpo -c socket.c -fPIC -DPIC -o .libs/socket.o
socket.c: In function ‘readt’:
socket.c:85: error: ‘len’ may be used uninitialized in this function
x86_64-pc-linux-gnu-gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I../.. -I../../include -I../../sys -D_PATH_AFPDDEFVOL=\"/etc/netatalk/AppleVolumes.default\" -D_PATH_AFPDSYSVOL=\"/etc/netatalk/AppleVolumes.system\" -D_PATH_AFPDPWFILE=\"/etc/netatalk/afppasswd\" -D_PATH_AFPDCONF=\"/etc/netatalk/afpd.conf\" -D_PATH_AFPDSIGCONF=\"/etc/netatalk/afp_signature.conf\" -D_PATH_AFPDUAMPATH=\"/usr/lib64/netatalk/\" -D_PATH_ACL_LDAPCONF=\"/etc/netatalk/afp_ldap.conf\" -DAPPLCNAME -DSERVERTEXT=\"/etc/netatalk/msg/\" -I../../include -D_U_="attribute((unused))" -O1 -ggdb -Wuninitialized -Werror=uninitialized -I../../sys -MT afpd-volume.o -MD -MP -MF .deps/afpd-volume.Tpo -c -o afpd-volume.o
test -f 'volume.c' || echo './'volume.cvolume.c: In function ‘sortextmap’:
volume.c:1029: warning: passing argument 4 of ‘qsort’ from incompatible pointer type
/usr/include/stdlib.h:761: note: expected ‘__compar_fn_t’ but argument is of type ‘int ()(void , void )’
volume.c: In function ‘check_ea_sys_support’:
volume.c:1893: warning: ignoring return value of ‘mktemp’, declared with attribute warn_unused_result
volume.c: In function ‘getextmap’:
volume.c:2278: warning: passing argument 5 of ‘bsearch’ from incompatible pointer type
/usr/include/stdlib.h:755: note: expected ‘__compar_fn_t’ but argument is of type ‘int ()(void , void )’
volume.c: In function ‘stat_vol’:
volume.c:1418: error: ‘bfree’ may be used uninitialized in this function
volume.c:1418: note: ‘bfree’ was declared here
volume.c:1418: error: ‘btotal’ may be used uninitialized in this function
volume.c:1418: note: ‘btotal’ was declared here
x86_64-pc-linux-gnu-gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I../.. -I../../include -I../../sys -D_PATH_AFPDDEFVOL=\"/etc/netatalk/AppleVolumes.default\" -D_PATH_AFPDSYSVOL=\"/etc/netatalk/AppleVolumes.system\" -D_PATH_AFPDPWFILE=\"/etc/netatalk/afppasswd\" -D_PATH_AFPDCONF=\"/etc/netatalk/afpd.conf\" -D_PATH_AFPDSIGCONF=\"/etc/netatalk/afp_signature.conf\" -D_PATH_AFPDUAMPATH=\"/usr/lib64/netatalk/\" -D_PATH_ACL_LDAPCONF=\"/etc/netatalk/afp_ldap.conf\" -DAPPLCNAME -DSERVERTEXT=\"/etc/netatalk/msg/\" -I../../include -D_U_="attribute((unused))" -O1 -ggdb -Wuninitialized -Werror=uninitialized -I../../sys -MT afpd-directory.o -MD -MP -MF .deps/afpd-directory.Tpo -c -o afpd-directory.o
test -f 'directory.c' || echo './'directory.cdirectory.c: In function ‘setdirparams’:
directory.c:2053: error: ‘cdate’ may be used uninitialized in this function
directory.c:2053: error: ‘bdate’ may be used uninitialized in this function
directory.c:2054: error: ‘owner’ may be used uninitialized in this function
directory.c:2054: error: ‘group’ may be used uninitialized in this function
directory.c:2055: error: ‘ashort’ may be used uninitialized in this function
directory.c:2062: error: ‘upriv’ may be used uninitialized in this function
x86_64-pc-linux-gnu-gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I../.. -I../../include -I../../sys -D_PATH_AFPDDEFVOL=\"/etc/netatalk/AppleVolumes.default\" -D_PATH_AFPDSYSVOL=\"/etc/netatalk/AppleVolumes.system\" -D_PATH_AFPDPWFILE=\"/etc/netatalk/afppasswd\" -D_PATH_AFPDCONF=\"/etc/netatalk/afpd.conf\" -D_PATH_AFPDSIGCONF=\"/etc/netatalk/afp_signature.conf\" -D_PATH_AFPDUAMPATH=\"/usr/lib64/netatalk/\" -D_PATH_ACL_LDAPCONF=\"/etc/netatalk/afp_ldap.conf\" -DAPPLCNAME -DSERVERTEXT=\"/etc/netatalk/msg/\" -I../../include -D_U_="attribute((unused))" -O1 -ggdb -Wuninitialized -Werror=uninitialized -I../../sys -MT afpd-file.o -MD -MP -MF .deps/afpd-file.Tpo -c -o afpd-file.o
test -f 'file.c' || echo './'file.cfile.c: In function ‘setfilparams’:
file.c:789: error: ‘fdType’ may be used uninitialized in this function
file.c:790: error: ‘ashort’ may be used uninitialized in this function
file.c:792: error: ‘upriv’ may be used uninitialized in this function
file.c:804: error: ‘cdate’ may be used uninitialized in this function
file.c:804: error: ‘bdate’ may be used uninitialized in this function
x86_64-pc-linux-gnu-gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I../.. -I../../include -I../../sys -D_PATH_AFPDDEFVOL=\"/etc/netatalk/AppleVolumes.default\" -D_PATH_AFPDSYSVOL=\"/etc/netatalk/AppleVolumes.system\" -D_PATH_AFPDPWFILE=\"/etc/netatalk/afppasswd\" -D_PATH_AFPDCONF=\"/etc/netatalk/afpd.conf\" -D_PATH_AFPDSIGCONF=\"/etc/netatalk/afp_signature.conf\" -D_PATH_AFPDUAMPATH=\"/usr/lib64/netatalk/\" -D_PATH_ACL_LDAPCONF=\"/etc/netatalk/afp_ldap.conf\" -DAPPLCNAME -DSERVERTEXT=\"/etc/netatalk/msg/\" -I../../include -D_U_="attribute((unused))" -O1 -ggdb -Wuninitialized -Werror=uninitialized -I../../sys -MT afpd-status.o -MD -MP -MF .deps/afpd-status.Tpo -c -o afpd-status.o
test -f 'status.c' || echo './'status.cstatus.c: In function ‘set_signature’:
status.c:698: warning: cast from pointer to integer of different size
status.c:698: warning: cast from pointer to integer of different size
status.c:559: error: ‘fp’ may be used uninitialized in this function
There were others, but they had comments indicating that they were known. What's now running on my system has all of these uninitialized variables set to zero. We'll see if that helps.
So, what I did was initialize all of these uninitialized variables, along with compiling with -O1 and no other optimizations, and then I installed that build of netatalk. It's still unreliable. I've had one Mac never fail to authenticate, one that almost always fails to authenticate, and one that's never failed. With this build installed, the one that usually fails seems to fail less often, but I suspect that's a meaningless result.
I'll eventually be able to look into this wireshark thing, but in the mean time, is there anything else I can do? Can authentication be instrumented in some way so as to report details of name and password and exactly why auth failed? That sort of thing should be logged anyhow, but it seems to be incomplete.
I've been running for some time with a netatalk server where the uninitialized variables I mentioned are all initialized and the optimization level is -O1. Now, instead of usually failing authentication, authentication only fails SOMETIMES.
There's definitely a nondeterministic bug here that's affected by changing things like optimization flags, variable initialization, and logging.
Is this bug going to get some attention, or are you going to wait around until I have time to learn wireshark, which may be never?
Until a custoumer with a support contract is bitten by this, I won't be able to further look into this.
This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).