On an Ubuntu 12.04 machine the shares are reachable and I can put files in, but when I open them the Finder hangs and the netatalk log shows an endless stream of these:
Feb 27 15:44:58.261491 afpd[32110] {auth.c:242} (N:AFPDaemon): AFP3.3 Login by admin
Feb 27 15:44:58.263099 afpd[32110] {auth.c:591} (N:AFPDaemon): afp_disconnect: trying primary reconnect
Feb 27 15:44:58.263384 afpd[31169] {server_child.c:268} (N:Default): Reconnect: no child[31992]
Feb 27 15:45:03.267090 afpd[32110] {auth.c:624} (E:AFPDaemon): afp_disconnect: primary reconnect failed
Feb 27 15:45:03.304952 afpd[32110] {fault.c:123} (S:Default): ===============================================================
Feb 27 15:45:03.305046 afpd[32110] {fault.c:124} (S:Default): INTERNAL ERROR: Signal 11 in pid 32110 (3.0.2)
Feb 27 15:45:03.305062 afpd[32110] {fault.c:125} (S:Default): ===============================================================
Feb 27 15:45:03.311779 afpd[32110] {fault.c:96} (S:Default): PANIC: internal error
Feb 27 15:45:03.312319 afpd[32110] {fault.c:97} (S:Default): BACKTRACE: 9 stack frames:
Feb 27 15:45:03.312474 afpd[32110] {fault.c:103} (S:Default): #0 /usr/local/lib/libatalk.so.3(netatalk_panic+0x1f) [0x7fed7ce891cf]
Feb 27 15:45:03.312489 afpd[32110] {fault.c:103} (S:Default): #1 /usr/local/lib/libatalk.so.3(+0x352fe) [0x7fed7ce892fe]
Feb 27 15:45:03.312504 afpd[32110] {fault.c:103} (S:Default): #2 /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7fed7bfbf4a0]
Feb 27 15:45:03.312518 afpd[32110] {fault.c:103} (S:Default): #3 /usr/local/sbin/afpd(cname+0x40f) [0x4172bf]
Feb 27 15:45:03.312553 afpd[32110] {fault.c:103} (S:Default): #4 /usr/local/sbin/afpd(afp_getfildirparams+0xe1) [0x4213f1]
Feb 27 15:45:03.312568 afpd[32110] {fault.c:103} (S:Default): #5 /usr/local/sbin/afpd(afp_over_dsi+0x382) [0x40c522]
Feb 27 15:45:03.312583 afpd[32110] {fault.c:103} (S:Default): #6 /usr/local/sbin/afpd(main+0xcfe) [0x40a2ce]
Feb 27 15:45:03.312597 afpd[32110] {fault.c:103} (S:Default): #7 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7fed7bfaa76d]
Feb 27 15:45:03.312612 afpd[32110] {fault.c:103} (S:Default): #8 /usr/local/sbin/afpd() [0x40a3f5]
Feb 27 15:45:03.615779 afpd[32176] {auth.c:242} (N:AFPDaemon): AFP3.3 Login by admin
Feb 27 15:45:03.625717 afpd[32176] {auth.c:591} (N:AFPDaemon): afp_disconnect: trying primary reconnect
Feb 27 15:45:08.633751 afpd[32176] {auth.c:624} (E:AFPDaemon): afp_disconnect: primary reconnect failed
Feb 27 15:45:08.987162 afpd[32176] {fault.c:123} (S:Default): ===============================================================
Feb 27 15:45:08.987761 afpd[32176] {fault.c:124} (S:Default): INTERNAL ERROR: Signal 11 in pid 32176 (3.0.2)
Feb 27 15:45:08.987921 afpd[32176] {fault.c:125} (S:Default): ===============================================================
Feb 27 15:45:08.995297 afpd[32176] {fault.c:96} (S:Default): PANIC: internal error
Feb 27 15:45:08.996193 afpd[32176] {fault.c:97} (S:Default): BACKTRACE: 9 stack frames:
Feb 27 15:45:08.996323 afpd[32176] {fault.c:103} (S:Default): #0 /usr/local/lib/libatalk.so.3(netatalk_panic+0x1f) [0x7fed7ce891cf]
Feb 27 15:45:08.996437 afpd[32176] {fault.c:103} (S:Default): #1 /usr/local/lib/libatalk.so.3(+0x352fe) [0x7fed7ce892fe]
Feb 27 15:45:08.996541 afpd[32176] {fault.c:103} (S:Default): #2 /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7fed7bfbf4a0]
Feb 27 15:45:08.996641 afpd[32176] {fault.c:103} (S:Default): #3 /usr/local/sbin/afpd(cname+0x40f) [0x4172bf]
Feb 27 15:45:08.996754 afpd[32176] {fault.c:103} (S:Default): #4 /usr/local/sbin/afpd(afp_getfildirparams+0xe1) [0x4213f1]
Feb 27 15:45:08.996857 afpd[32176] {fault.c:103} (S:Default): #5 /usr/local/sbin/afpd(afp_over_dsi+0x382) [0x40c522]
Feb 27 15:45:08.996986 afpd[32176] {fault.c:103} (S:Default): #6 /usr/local/sbin/afpd(main+0xcfe) [0x40a2ce]
Feb 27 15:45:08.997118 afpd[32176] {fault.c:103} (S:Default): #7 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7fed7bfaa76d]
Feb 27 15:45:08.997220 afpd[32176] {fault.c:103} (S:Default): #8 /usr/local/sbin/afpd() [0x40a3f5]
Feb 27 15:45:09.290062 afpd[32222] {auth.c:242} (N:AFPDaemon): AFP3.3 Login by admin
Feb 27 15:45:09.292607 afpd[32222] {auth.c:591} (N:AFPDaemon): afp_disconnect: trying primary reconnect
http://netatalk.sourceforge.net/wiki/index.php/Developer_Infos#Debugging_process_crashes
Ralph: should I install the source version and try again? or do you just need the debug output?
EDIT: building with debug flags right now
Last edit: Ali J. 2013-02-27
... "We need a stack-backtrace (SBT) from a corefile with debugging symbols." ...
ok - working on it - compiled netatalk with the debug flags, upped the ulimits:
root@ubuntu:~# ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 7799
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 7799
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
but I'm not getting crash dumps in /var/crash.
Attaching gdb by pid is tough since the process always crashes before I can attach, and starting gdb on the afpd binary alone (no running process, no core file) doesn't give me a stack:
root@ubuntu:~# gdb /usr/local/sbin/afpd
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
http://bugs.launchpad.net/gdb-linaro/...
Reading symbols from /usr/local/sbin/afpd...done.
(gdb) bt full
No stack.
(gdb) q
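A possible reason for the missing dumps, offered as a sketch rather than a diagnosis: `ulimit` in an interactive shell only applies to that shell and its children, not to a daemon that is already running, and on Linux the kernel's core_pattern may pipe cores to a helper program instead of writing files. The Python snippet below prints both settings. The function names are made up for this example; `/proc/sys/kernel/core_pattern` and `/proc/<pid>/limits` are standard Linux paths.

```python
import os
import resource

CORE_PATTERN = "/proc/sys/kernel/core_pattern"  # Linux-specific


def describe_limit(value):
    """Render an rlimit value the way `ulimit` does."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def core_dump_settings(pattern_path=CORE_PATTERN):
    """Core size limits of *this* process plus the kernel core pattern."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    try:
        with open(pattern_path) as fh:
            pattern = fh.read().strip()
    except OSError:
        pattern = None  # not Linux, or /proc not mounted
    return {
        "soft": describe_limit(soft),
        "hard": describe_limit(hard),
        "pattern": pattern,
        # A leading "|" means cores are piped to a program, not written to disk.
        "piped": bool(pattern) and pattern.startswith("|"),
    }


def core_limit_of(pid):
    """Soft and hard 'Max core file size' of another process, or None."""
    try:
        with open(f"/proc/{pid}/limits") as fh:
            for line in fh:
                if line.startswith("Max core file size"):
                    return line.split()[4:6]  # [soft, hard]
    except OSError:
        pass
    return None


if __name__ == "__main__":
    info = core_dump_settings()
    print("this process, core limit (soft/hard):", info["soft"], "/", info["hard"])
    print("core_pattern:", info["pattern"])
    if info["piped"]:
        print("cores are piped to a helper program, not written as plain files")
    print("own limits via /proc:", core_limit_of(os.getpid()))
```

Calling `core_limit_of()` with the pid of the long-running afpd parent (31169 in the first log) would show the limit its children inherit. Once a core file exists, gdb needs it as a second argument, as in `gdb /usr/local/sbin/afpd <corefile>`, before `bt full` can print a stack.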
Last edit: Ali J. 2013-03-17
Additional info to reproduce this:
ubuntu 12.04
netatalk 3.0.2
using symlinks
and a file located in:
/home/username/folder/symlink(->somewhere/else)/file
Just opening the file freezes the opening program in OS X; the program can then only be force-quit.
nvm, I think I got the crash dumps. Additional info: on this machine I also get a moving error:
Mar 17 15:17:49.797604 afpd[3444] {auth.c:242} (N:AFPDaemon): AFP3.3 Login by admin
Mar 17 15:17:49.798168 afpd[3444] {auth.c:591} (N:AFPDaemon): afp_disconnect: trying primary reconnect
Mar 17 15:17:49.798258 afpd[2488] {server_child.c:268} (N:Default): Reconnect: no child[3429]
Mar 17 15:17:54.798542 afpd[3444] {auth.c:624} (E:AFPDaemon): afp_disconnect: primary reconnect failed
Mar 17 15:17:54.814567 afpd[3444] {unix.c:97} (N:Default): run_cmd("mv"): status: 1
Mar 17 15:17:54.814627 afpd[3444] {desktop.c:218} (E:AFPDaemon): moving .AppleDesktop from "/home/protonet/dashboard/shared/files/system_users/admin/.AppleDesktop" to "/usr/local/var/netatalk/CNID//admin's home//.AppleDesktop" failed
Mar 17 15:17:54.816307 afpd[3444] {fault.c:123} (S:Default): ===============================================================
Mar 17 15:17:54.816345 afpd[3444] {fault.c:124} (S:Default): INTERNAL ERROR: Signal 11 in pid 3444 (3.0.2)
Mar 17 15:17:54.816353 afpd[3444] {fault.c:125} (S:Default): ===============================================================
Mar 17 15:17:54.817113 afpd[3444] {fault.c:96} (S:Default): PANIC: internal error
Mar 17 15:17:54.817128 afpd[3444] {fault.c:97} (S:Default): BACKTRACE: 11 stack frames:
Mar 17 15:17:54.817136 afpd[3444] {fault.c:103} (S:Default): #0 /usr/local/lib/libatalk.so.3(netatalk_panic+0x26) [0x7f7d5ba6a9e0]
Mar 17 15:17:54.817143 afpd[3444] {fault.c:103} (S:Default): #1 /usr/local/lib/libatalk.so.3(+0x42bfa) [0x7f7d5ba6abfa]
Mar 17 15:17:54.817150 afpd[3444] {fault.c:103} (S:Default): #2 /usr/local/lib/libatalk.so.3(+0x42c4e) [0x7f7d5ba6ac4e]
Mar 17 15:17:54.817158 afpd[3444] {fault.c:103} (S:Default): #3 /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f7d5ab934a0]
Mar 17 15:17:54.817165 afpd[3444] {fault.c:103} (S:Default): #4 /usr/local/sbin/afpd(cname+0x859) [0x41af08]
Mar 17 15:17:54.817172 afpd[3444] {fault.c:103} (S:Default): #5 /usr/local/sbin/afpd(afp_getfildirparams+0x12a) [0x428235]
Mar 17 15:17:54.817179 afpd[3444] {fault.c:103} (S:Default): #6 /usr/local/sbin/afpd(afp_over_dsi+0x741) [0x40c205]
Mar 17 15:17:54.817186 afpd[3444] {fault.c:103} (S:Default): #7 /usr/local/sbin/afpd() [0x42f338]
Mar 17 15:17:54.817193 afpd[3444] {fault.c:103} (S:Default): #8 /usr/local/sbin/afpd(main+0xb33) [0x42eec4]
Also, I've got a lot more core dumps ;)
Last edit: Ralph Böhme 2013-03-27
The crash happens when dereferencing curdir which was invalidated and set to NULL in cname_mtouname().
cname_mtouname() ends up calling dirlookup() when demangling mangled (name#CNID) names. There the curdir is invalidated and freed because the dircache detected a dev/inode cache difference and evicted the object from the cache. Unfortunately that was our "curdir".
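To illustrate how a dev/inode difference can arise with the symlink layout from the report, here is a small Python sketch. It assumes the mismatch comes from one code path following the symlink and another not following it; the thread does not state this outright, though the commit title "Use ostat in the dircache" points that way. The directory names and helper functions are made up for the example.

```python
import os
import tempfile


def dev_ino(st):
    """The (device, inode) pair a directory cache could key an entry on."""
    return (st.st_dev, st.st_ino)


def compare_stat_calls(path):
    """Return the (dev, inode) pair seen by lstat() and by stat() for path."""
    return dev_ino(os.lstat(path)), dev_ino(os.stat(path))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout modelled on folder/symlink(->somewhere/else)/file
        target = os.path.join(root, "somewhere", "else")
        folder = os.path.join(root, "folder")
        os.makedirs(target)
        os.makedirs(folder)
        link = os.path.join(folder, "symlink")
        os.symlink(target, link)

        not_followed, followed = compare_stat_calls(link)
        print("lstat (link itself):   ", not_followed)
        print("stat  (link target):   ", followed)
        print("cache would see a mismatch:", not_followed != followed)
```

For a symlinked directory the two calls return different inodes, so a cache entry recorded with one call and validated with the other looks stale even though nothing changed on disk.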
Possible fixes:
- use ostat() in the dircache, this avoids the cache eviction in the first place
- in cname_mtouname() check whether curdir became NULL and reassign it
Proposed fix in [77d1bd102602aad0e6e6a38ac102a4735b9a58b7].
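The second option in the list can be pictured with a toy model in Python. This is not afpd code: `Dir`, `DirCache` and both `demangle_*` functions are hypothetical stand-ins, an `AttributeError` plays the role of the segfault, and "reassign" is modelled as creating a fresh cache entry, which is only one guess at what it would mean in afpd.

```python
class Dir:
    """Stand-in for a cached directory object."""

    def __init__(self, name, dev_ino):
        self.name = name
        self.dev_ino = dev_ino


class DirCache:
    """Toy directory cache with a shared 'curdir' reference."""

    def __init__(self):
        self.entries = {}
        self.curdir = None

    def add(self, d):
        self.entries[d.name] = d

    def lookup(self, name, dev_ino_on_disk):
        """Return the cached entry, evicting it if dev/inode do not match."""
        d = self.entries.get(name)
        if d is not None and d.dev_ino != dev_ino_on_disk:
            del self.entries[name]
            if self.curdir is d:
                self.curdir = None  # the evicted object was our curdir
            return None
        return d


def demangle_unguarded(cache, name, dev_ino_on_disk):
    """Dereference curdir after a lookup without checking it."""
    cache.lookup(name, dev_ino_on_disk)  # may evict curdir as a side effect
    return cache.curdir.name             # fails when curdir is None


def demangle_guarded(cache, name, dev_ino_on_disk):
    """Same as above, but re-establish curdir if the lookup cleared it."""
    cache.lookup(name, dev_ino_on_disk)
    if cache.curdir is None:
        fresh = Dir(name, dev_ino_on_disk)
        cache.add(fresh)
        cache.curdir = fresh
    return cache.curdir.name


if __name__ == "__main__":
    for resolve in (demangle_unguarded, demangle_guarded):
        cache = DirCache()
        stale = Dir("folder", (1, 100))   # cached dev/inode
        cache.add(stale)
        cache.curdir = stale
        try:
            print(resolve.__name__, "->", resolve(cache, "folder", (1, 200)))
        except AttributeError as exc:
            print(resolve.__name__, "-> crashed:", exc)
```

The first option in the list avoids the eviction altogether, so the guard would never be triggered; the two fixes are complementary rather than exclusive.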
Related
Commit: 77d1bd102602aad0e6e6a38ac102a4735b9a58b7
Parent: 2fe9518a30bb4b5c0d568deb172dd32e998f5864
Author: Ralph Boehme (authored 2013-03-26 05:58:44, committed 2013-03-26 10:43:01)
Message:
Use ostat in the dircache

Fixes a possible crash in cname() where cname_mtouname calls
dirlookup() where the curdir is freed because the dircache
detected a dev/inode cache difference and evicted the object
from the cache.

Fixes bug #498.