#498 netatalk 3.0.2 "INTERNAL ERROR: Signal 11" on opening file in share

None
closed
nobody
None
1
2014-08-19
2013-02-27
Ali J.
No

On an Ubuntu 12.04 machine the shares are reachable and I can copy files onto them, but when I open one of those files the Finder hangs and the netatalk log fills with an endless stream of entries like these:

Feb 27 15:44:58.261491 afpd[32110] {auth.c:242} (N:AFPDaemon): AFP3.3 Login by admin
Feb 27 15:44:58.263099 afpd[32110] {auth.c:591} (N:AFPDaemon): afp_disconnect: trying primary reconnect
Feb 27 15:44:58.263384 afpd[31169] {server_child.c:268} (N:Default): Reconnect: no child[31992]
Feb 27 15:45:03.267090 afpd[32110] {auth.c:624} (E:AFPDaemon): afp_disconnect: primary reconnect failed
Feb 27 15:45:03.304952 afpd[32110] {fault.c:123} (S:Default): ===============================================================
Feb 27 15:45:03.305046 afpd[32110] {fault.c:124} (S:Default): INTERNAL ERROR: Signal 11 in pid 32110 (3.0.2)
Feb 27 15:45:03.305062 afpd[32110] {fault.c:125} (S:Default): ===============================================================
Feb 27 15:45:03.311779 afpd[32110] {fault.c:96} (S:Default): PANIC: internal error
Feb 27 15:45:03.312319 afpd[32110] {fault.c:97} (S:Default): BACKTRACE: 9 stack frames:
Feb 27 15:45:03.312474 afpd[32110] {fault.c:103} (S:Default): #0 /usr/local/lib/libatalk.so.3(netatalk_panic+0x1f) [0x7fed7ce891cf]
Feb 27 15:45:03.312489 afpd[32110] {fault.c:103} (S:Default): #1 /usr/local/lib/libatalk.so.3(+0x352fe) [0x7fed7ce892fe]
Feb 27 15:45:03.312504 afpd[32110] {fault.c:103} (S:Default): #2 /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7fed7bfbf4a0]
Feb 27 15:45:03.312518 afpd[32110] {fault.c:103} (S:Default): #3 /usr/local/sbin/afpd(cname+0x40f) [0x4172bf]
Feb 27 15:45:03.312553 afpd[32110] {fault.c:103} (S:Default): #4 /usr/local/sbin/afpd(afp_getfildirparams+0xe1) [0x4213f1]
Feb 27 15:45:03.312568 afpd[32110] {fault.c:103} (S:Default): #5 /usr/local/sbin/afpd(afp_over_dsi+0x382) [0x40c522]
Feb 27 15:45:03.312583 afpd[32110] {fault.c:103} (S:Default): #6 /usr/local/sbin/afpd(main+0xcfe) [0x40a2ce]
Feb 27 15:45:03.312597 afpd[32110] {fault.c:103} (S:Default): #7 /lib/x86_64-linux-gnu/libc.so.6(libc_start_main+0xed) [0x7fed7bfaa76d]
Feb 27 15:45:03.312612 afpd[32110] {fault.c:103} (S:Default): #8 /usr/local/sbin/afpd() [0x40a3f5]
Feb 27 15:45:03.615779 afpd[32176] {auth.c:242} (N:AFPDaemon): AFP3.3 Login by admin
Feb 27 15:45:03.625717 afpd[32176] {auth.c:591} (N:AFPDaemon): afp_disconnect: trying primary reconnect
Feb 27 15:45:08.633751 afpd[32176] {auth.c:624} (E:AFPDaemon): afp_disconnect: primary reconnect failed
Feb 27 15:45:08.987162 afpd[32176] {fault.c:123} (S:Default): ===============================================================
Feb 27 15:45:08.987761 afpd[32176] {fault.c:124} (S:Default): INTERNAL ERROR: Signal 11 in pid 32176 (3.0.2)
Feb 27 15:45:08.987921 afpd[32176] {fault.c:125} (S:Default): ===============================================================
Feb 27 15:45:08.995297 afpd[32176] {fault.c:96} (S:Default): PANIC: internal error
Feb 27 15:45:08.996193 afpd[32176] {fault.c:97} (S:Default): BACKTRACE: 9 stack frames:
Feb 27 15:45:08.996323 afpd[32176] {fault.c:103} (S:Default): #0 /usr/local/lib/libatalk.so.3(netatalk_panic+0x1f) [0x7fed7ce891cf]
Feb 27 15:45:08.996437 afpd[32176] {fault.c:103} (S:Default): #1 /usr/local/lib/libatalk.so.3(+0x352fe) [0x7fed7ce892fe]
Feb 27 15:45:08.996541 afpd[32176] {fault.c:103} (S:Default): #2 /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7fed7bfbf4a0]
Feb 27 15:45:08.996641 afpd[32176] {fault.c:103} (S:Default): #3 /usr/local/sbin/afpd(cname+0x40f) [0x4172bf]
Feb 27 15:45:08.996754 afpd[32176] {fault.c:103} (S:Default): #4 /usr/local/sbin/afpd(afp_getfildirparams+0xe1) [0x4213f1]
Feb 27 15:45:08.996857 afpd[32176] {fault.c:103} (S:Default): #5 /usr/local/sbin/afpd(afp_over_dsi+0x382) [0x40c522]
Feb 27 15:45:08.996986 afpd[32176] {fault.c:103} (S:Default): #6 /usr/local/sbin/afpd(main+0xcfe) [0x40a2ce]
Feb 27 15:45:08.997118 afpd[32176] {fault.c:103} (S:Default): #7 /lib/x86_64-linux-gnu/libc.so.6(libc_start_main+0xed) [0x7fed7bfaa76d]
Feb 27 15:45:08.997220 afpd[32176] {fault.c:103} (S:Default): #8 /usr/local/sbin/afpd() [0x40a3f5]
Feb 27 15:45:09.290062 afpd[32222] {auth.c:242} (N:AFPDaemon): AFP3.3 Login by admin
Feb 27 15:45:09.292607 afpd[32222] {auth.c:591} (N:AFPDaemon): afp_disconnect: trying primary reconnect

Discussion

1 2 > >> (Page 1 of 2)
  • Ralph Böhme
    2013-02-27

    • Ali J.
      2013-02-27

      Ralph: should I install the source version and try again, or do you just need the debug output?

      EDIT: building with debug flags right now

       
      Last edit: Ali J. 2013-02-27
  • Ralph Böhme
    2013-02-27

    ... "We need a stack-backtrace (SBT) from a corefile with debugging symbols." ...

     
    • Ali J.
      2013-03-17

      OK, working on it. I compiled netatalk with the debug flags and raised the ulimits:

      root@ubuntu:~# ulimit -a
      core file size (blocks, -c) unlimited
      data seg size (kbytes, -d) unlimited
      scheduling priority (-e) 0
      file size (blocks, -f) unlimited
      pending signals (-i) 7799
      max locked memory (kbytes, -l) 64
      max memory size (kbytes, -m) unlimited
      open files (-n) 1024
      pipe size (512 bytes, -p) 8
      POSIX message queues (bytes, -q) 819200
      real-time priority (-r) 0
      stack size (kbytes, -s) 8192
      cpu time (seconds, -t) unlimited
      max user processes (-u) 7799
      virtual memory (kbytes, -v) unlimited
      file locks (-x) unlimited

      However, I'm not getting any crash dumps in /var/crash.

      Attaching gdb by PID is difficult because the process crashes before I can attach, and loading the afpd binary on its own doesn't give me a stack:

      root@ubuntu:~# gdb /usr/local/sbin/afpd
      GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
      Copyright (C) 2012 Free Software Foundation, Inc.
      License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law. Type "show copying"
      and "show warranty" for details.
      This GDB was configured as "x86_64-linux-gnu".
      For bug reporting instructions, please see:
      http://bugs.launchpad.net/gdb-linaro/...
      Reading symbols from /usr/local/sbin/afpd...done.
      (gdb) bt full
      No stack.
      (gdb) q
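
One way to narrow down why no core file shows up is to check the two settings that decide it: the core size limit of the process and the kernel's core pattern. The following is a minimal sketch in C, assuming a Linux system with /proc mounted; the function names are made up for this illustration and are not part of netatalk. Note that `ulimit -c` set in a root shell only applies to that shell and the processes it starts, so a daemon launched elsewhere may still run with a zero limit. If the pattern starts with `|`, the kernel pipes the dump to a helper program instead of writing a file.

```c
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/* Returns 1 if the calling process may write a core file, 0 if its soft
 * core limit is zero, and -1 if the limit could not be read. */
int core_dumps_allowed(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    return rl.rlim_cur != 0 ? 1 : 0;
}

/* Copies the kernel core pattern into buf (without the trailing newline).
 * Returns 0 on success, -1 if the file cannot be read (e.g. no /proc). */
int read_core_pattern(char *buf, size_t size)
{
    FILE *f = fopen("/proc/sys/kernel/core_pattern", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, (int)size, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Returns 1 when the pattern hands the dump to a helper program
 * (leading '|'), 0 when the kernel writes a file itself. */
int core_pattern_is_piped(const char *pattern)
{
    return pattern[0] == '|';
}
```

Once a core file exists, it is loaded together with the binary, for example `gdb /usr/local/sbin/afpd /path/to/core` (the core path here is a placeholder), after which `bt full` prints the stack.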

       
      Last edit: Ali J. 2013-03-17
  • Ali J.
    2013-03-17

    Additional info to reproduce this:

    ubuntu 12.04
    netatalk 3.0.2
    using symlinks
    and a file located in:

    /home/username/folder/symlink(->somewhere/else)/file

    Just opening the file freezes the application that opens it on OS X; the application can then only be force-quit.

     
  • Ali J.
    2013-03-17

    Never mind, I think I got the crash dumps. Additional info: on this machine I also get an error about moving .AppleDesktop:

    Mar 17 15:17:49.797604 afpd[3444] {auth.c:242} (N:AFPDaemon): AFP3.3 Login by admin
    Mar 17 15:17:49.798168 afpd[3444] {auth.c:591} (N:AFPDaemon): afp_disconnect: trying primary reconnect
    Mar 17 15:17:49.798258 afpd[2488] {server_child.c:268} (N:Default): Reconnect: no child[3429]
    Mar 17 15:17:54.798542 afpd[3444] {auth.c:624} (E:AFPDaemon): afp_disconnect: primary reconnect failed
    Mar 17 15:17:54.814567 afpd[3444] {unix.c:97} (N:Default): run_cmd("mv"): status: 1
    Mar 17 15:17:54.814627 afpd[3444] {desktop.c:218} (E:AFPDaemon): moving .AppleDesktop from "/home/protonet/dashboard/shared/files/system_users/admin/.AppleDesktop" to "/usr/local/var/netatalk/CNID//admin's home//.AppleDesktop" failed
    Mar 17 15:17:54.816307 afpd[3444] {fault.c:123} (S:Default): ===============================================================
    Mar 17 15:17:54.816345 afpd[3444] {fault.c:124} (S:Default): INTERNAL ERROR: Signal 11 in pid 3444 (3.0.2)
    Mar 17 15:17:54.816353 afpd[3444] {fault.c:125} (S:Default): ===============================================================
    Mar 17 15:17:54.817113 afpd[3444] {fault.c:96} (S:Default): PANIC: internal error
    Mar 17 15:17:54.817128 afpd[3444] {fault.c:97} (S:Default): BACKTRACE: 11 stack frames:
    Mar 17 15:17:54.817136 afpd[3444] {fault.c:103} (S:Default): #0 /usr/local/lib/libatalk.so.3(netatalk_panic+0x26) [0x7f7d5ba6a9e0]
    Mar 17 15:17:54.817143 afpd[3444] {fault.c:103} (S:Default): #1 /usr/local/lib/libatalk.so.3(+0x42bfa) [0x7f7d5ba6abfa]
    Mar 17 15:17:54.817150 afpd[3444] {fault.c:103} (S:Default): #2 /usr/local/lib/libatalk.so.3(+0x42c4e) [0x7f7d5ba6ac4e]
    Mar 17 15:17:54.817158 afpd[3444] {fault.c:103} (S:Default): #3 /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f7d5ab934a0]
    Mar 17 15:17:54.817165 afpd[3444] {fault.c:103} (S:Default): #4 /usr/local/sbin/afpd(cname+0x859) [0x41af08]
    Mar 17 15:17:54.817172 afpd[3444] {fault.c:103} (S:Default): #5 /usr/local/sbin/afpd(afp_getfildirparams+0x12a) [0x428235]
    Mar 17 15:17:54.817179 afpd[3444] {fault.c:103} (S:Default): #6 /usr/local/sbin/afpd(afp_over_dsi+0x741) [0x40c205]
    Mar 17 15:17:54.817186 afpd[3444] {fault.c:103} (S:Default): #7 /usr/local/sbin/afpd() [0x42f338]
    Mar 17 15:17:54.817193 afpd[3444] {fault.c:103} (S:Default): #8 /usr/local/sbin/afpd(main+0xb33) [0x42eec4]

     
    Attachments
  • Ali J.
    2013-03-17

    Also, I've got a lot more core dumps ;)

     
  • Ali J.
    2013-03-18

    [New LWP 3444]
    
    warning: Can't read pathname for load map: Input/output error.
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    Core was generated by `/usr/local/sbin/afpd -d -F /usr/local/etc/afp.conf'.
    Program terminated with signal 6, Aborted.
    #0  0x00007f7d5ab93425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
    (gdb) bt full
    #0  0x00007f7d5ab93425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
    No symbol table info available.
    #1  0x00007f7d5ab96b8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
    No symbol table info available.
    #2  0x00007f7d5ba6ac39 in fault_report (sig=11) at fault.c:139
            counter = 1
    #3  0x00007f7d5ba6ac4e in sig_fault (sig=11) at fault.c:147
    No locals.
    #4  <signal handler called>
    No symbol table info available.
    #5  0x000000000041af08 in cname (vol=0xbc9ef0, dir=0xbc9470, 
        cpath=0x7fffaecf59b0) at directory.c:1257
            path = "enable2start_foerderung_excel.xlsx", '\000' <repeats 4062 times>
            ret = {m_type = 3, 
              m_name = 0x64b8e0 "enable2start_foerderung_excel.xlsx", 
              u_name = 0x66181c "enable2start_foerderung_excel.xlsx", id = 0, 
              d_dir = 0x0, st_valid = 1, st_errno = 0, st = {st_dev = 2049, 
                st_ino = 1322925, st_nlink = 1, st_mode = 33204, st_uid = 8000, 
                st_gid = 1000, __pad0 = 0, st_rdev = 0, st_size = 31115, 
                st_blksize = 4096, st_blocks = 72, st_atim = {tv_sec = 1363529071, 
                  tv_nsec = 891734436}, st_mtim = {tv_sec = 1319454817, 
                  tv_nsec = 0}, st_ctim = {tv_sec = 1363527869, 
                  tv_nsec = 509496000}, __unused = {0, 0, 0}}}
            cdir = 0xff9
            data = 0x7f7d5be6a042 ""
            p = 0x64b8ff "lsx"
            len = 0
            hint = 134217987
            len16 = 7936
            size = 7
            toUTF8 = 0
    #6  0x0000000000428235 in afp_getfildirparams (obj=0x65e4c0, 
        ibuf=0x7f7d5be6a042 "", ibuflen=50, rbuf=0xbb0fa0 "", rbuflen=0xbb2fa0)
        at filedir.c:76
            st = 0x7f7d5ba54aea
            vol = 0xbc9ef0
            dir = 0xbc9470
            did = 369098752
            ret = 32767
            buflen = 12257456
            fbitmap = 59711
            dbitmap = 41791
            vid = 256
            s_path = 0x7fffaecf5a30
    #7  0x000000000040c205 in afp_over_dsi (obj=0x65e4c0) at afp_dsi.c:626
            dsi = 0xbb08b0
            rc_idx = 11
            err = 0
            cmd = 2
            function = 34 '"'
            flag = 1
    #8  0x000000000042f338 in dsi_start (obj=0x65e4c0, dsi=0xbb08b0, 
        server_children=0xbac0a0) at main.c:509
            child = 0x0
    #9  0x000000000042eec4 in main (ac=4, av=0x7fffaecf5d68) at main.c:429
            i = 0
            sv = {__sigaction_handler = {sa_handler = 0x42dee1 <afp_goaway>, 
                sa_sigaction = 0x42dee1 <afp_goaway>}, sa_mask = {__val = {90625, 
                  0 <repeats 15 times>}}, sa_flags = 268435456, sa_restorer = 0}
            sigs = {__val = {74241, 0 <repeats 15 times>}}
            ret = 1
            child = 0xbb4d50
            recon_ipc_fd = 0
            pid = 0
            saveerrno = 10
    
     
    Last edit: Ralph Böhme 2013-03-27
  • Ralph Böhme
    2013-03-27

    The crash happens when dereferencing curdir, which was invalidated and set to NULL in cname_mtouname().
    cname_mtouname() ends up calling dirlookup() when demangling mangled (name#CNID) names. There curdir is invalidated and freed because the dircache detected a dev/inode difference from the cached values and evicted the object from the cache. Unfortunately, that object was our curdir.

    Possible fixes:
    - use ostat() in the dircache, which avoids the cache eviction in the first place
    - in cname_mtouname(), check whether curdir has become NULL and reassign it
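
The second option can be illustrated with a small, self-contained sketch in C. It is not netatalk's code: `struct dir_entry`, `dir_lookup`, `dir_evict`, `cur_dir` and `resolve_in_dir` are hypothetical stand-ins for the dircache and curdir, and the cache is reduced to a fixed array. The point is only the guard: after any call that may evict the current directory as a side effect, re-check the pointer and look the directory up again by its ID instead of dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 4

/* Hypothetical cache entry; real directory objects carry far more state. */
struct dir_entry {
    int id;
    int valid;
};

static struct dir_entry cache[CACHE_SLOTS];
static struct dir_entry *cur_dir;          /* plays the role of "curdir" */

/* Returns the cached entry for id, (re)creating it when it is missing. */
static struct dir_entry *dir_lookup(int id)
{
    assert(id >= 0);
    struct dir_entry *e = &cache[id % CACHE_SLOTS];

    if (!e->valid || e->id != id) {
        e->id = id;
        e->valid = 1;
    }
    return e;
}

/* Evicts id from the cache; if it was the current directory,
 * cur_dir is set to NULL, mirroring the situation described above. */
static void dir_evict(int id)
{
    assert(id >= 0);
    struct dir_entry *e = &cache[id % CACHE_SLOTS];

    if (e->valid && e->id == id) {
        e->valid = 0;
        if (cur_dir == e)
            cur_dir = NULL;
    }
}

/* Resolves something relative to dir_id. When stale is non-zero, the
 * nested lookup evicts the current directory as a side effect.
 * Returns the ID of the directory that ends up being current. */
int resolve_in_dir(int dir_id, int stale)
{
    cur_dir = dir_lookup(dir_id);

    if (stale)
        dir_evict(dir_id);                 /* side effect of the nested lookup */

    if (cur_dir == NULL)                   /* the guard: never dereference blindly */
        cur_dir = dir_lookup(dir_id);      /* reassign by looking the ID up again */

    return cur_dir->id;
}
```

Without the `cur_dir == NULL` check, the final `cur_dir->id` would be a NULL dereference whenever the eviction path is taken, which matches the shape of the crash in cname().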

     