#1135 [mrpeach/tcpserver] crashes

v0.45
open
Martin Peach
None
5
2014-02-24
2014-02-04
Antoine Villeret
No

Here is a backtrace of the crashes triggered by the attached patch,
still on Ubuntu 12.04 64-bit / Pd 0.45-4 / [tcpserver] built from SVN 17262.

Program received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x7ffff18fd700 (LWP 6363)]
0x00007ffff73b42cc in __libc_send (fd=<optimized out>, buf=<optimized out>, 
    n=<optimized out>, flags=<optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/send.c:33
33  ../sysdeps/unix/sysv/linux/x86_64/send.c: No such file or directory.
(gdb) thread apply all bt

Thread 1827 (Thread 0x7ffff18fd700 (LWP 6363)):
#0  0x00007ffff73b42cc in __libc_send (fd=<optimized out>, 
    buf=<optimized out>, n=<optimized out>, flags=<optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/send.c:33
#1  0x00007ffff3621a4d in tcpserver_broadcast_thread (arg=0x88b710)
    at tcpserver.c:963
#2  0x00007ffff73ace9a in start_thread (arg=0x7ffff18fd700)
    at pthread_create.c:308
#3  0x00007ffff6ed53fd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7ffff7fd5740 (LWP 4424)):
#0  0x00007ffff6e64558 in __GI___libc_free (mem=0x761d70) at malloc.c:2970
#1  0x00007ffff36211c1 in tcpserver_socketreceiver_free (x=0x761d70)
    at tcpserver.c:318
#2  tcpserver_notify (x=0x7ffff7ec1010) at tcpserver.c:1048
#3  0x00007ffff3621027 in tcpserver_socketreceiver_read (x=0x761d70, fd=14)
    at tcpserver.c:281
#4  0x000000000047b17a in sys_domicrosleep.constprop.3 ()
#5  0x000000000047cf5a in sys_pollgui ()
#6  0x000000000047664e in m_mainloop ()
#7  0x00007ffff6e0276d in __libc_start_main (main=0x411800 <main>, argc=4, 
    ubp_av=0x7fffffffe128, init=<optimized out>, fini=<optimized out>, 
    rtld_fini=<optimized out>, stack_end=0x7fffffffe118) at libc-start.c:226
#8  0x0000000000411831 in _start ()
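
The SIGPIPE above is raised by send() inside tcpserver_broadcast_thread, which is what happens when a thread writes to a socket whose peer has already gone away. Note that gdb stops on SIGPIPE by default even when the program ignores it (`handle SIGPIPE nostop noprint pass` changes that), so this stop is not necessarily the fatal event. The following is a minimal sketch, not the code in tcpserver.c: `send_nosigpipe` and `demo_closed_peer_errno` are hypothetical names, and it assumes a platform such as Linux where `MSG_NOSIGNAL` is available. It shows how a send to a closed peer can be turned into an ordinary EPIPE error instead of a signal.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of tcpserver.c): send without raising SIGPIPE.
 * Returns the number of bytes sent, or -1 with errno set. */
static ssize_t send_nosigpipe(int fd, const void *buf, size_t len)
{
#ifdef MSG_NOSIGNAL
    /* A vanished peer is reported as errno == EPIPE, with no signal. */
    return send(fd, buf, len, MSG_NOSIGNAL);
#else
    /* Assumption: on platforms without MSG_NOSIGNAL the caller must
     * ignore SIGPIPE itself, e.g. with signal(SIGPIPE, SIG_IGN). */
    return send(fd, buf, len, 0);
#endif
}

/* Demo: returns the errno seen when writing one byte to a stream socket
 * whose other end has been closed, or -1 if the socket pair cannot be made. */
static int demo_closed_peer_errno(void)
{
    int sv[2];
    char byte = 1;
    ssize_t n;
    int err;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    close(sv[1]);                 /* the "client" disconnects */
    errno = 0;
    n = send_nosigpipe(sv[0], &byte, 1);
    err = (n < 0) ? errno : 0;    /* capture errno before close() can change it */
    close(sv[0]);
    return err;
}
```

With this approach the sending thread can treat EPIPE like any other send failure and drop the client, rather than relying on the process-wide signal disposition.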
1 Attachments

Discussion

  • Martin Peach
    2014-02-08

    On Windows 7 running Pd 0.44.0-extended-20130204 I get:
    tcpserver listening on port 22222
    tcpserver: accepted connection from 127.0.0.1 on socket 772
    tcpserver: accepted connection from 127.0.0.1 on socket 788
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: accepted connection from 127.0.0.1 on socket 760
    tcpserver: accepted connection from 127.0.0.1 on socket 752
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: accepted connection from 127.0.0.1 on socket 800
    tcpserver: accepted connection from 127.0.0.1 on socket 796
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: accepted connection from 127.0.0.1 on socket 788
    tcpserver: accepted connection from 127.0.0.1 on socket 760
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: accepted connection from 127.0.0.1 on socket 800
    tcpserver: accepted connection from 127.0.0.1 on socket 748
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: accepted connection from 127.0.0.1 on socket 752
    tcpserver: accepted connection from 127.0.0.1 on socket 744
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: accepted connection from 127.0.0.1 on socket 796
    tcpserver: accepted connection from 127.0.0.1 on socket 768
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: accepted connection from 127.0.0.1 on socket 744
    tcpserver: accepted connection from 127.0.0.1 on socket 760
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: "127.0.0.1" removed from list of clients
    tcpserver: accepted connection from 127.0.0.1 on socket 768
    tcpserver: accepted connection from 127.0.0.1 on socket 804
    tcpserver: "127.0.0.1" removed from list of clients
    ...until I shut it down
    So I suggest that this is a pathological patch and your machine is just not fast enough to print out all the messages in time.

     
  • hello,

    Shame on me, tcpclient had not been recompiled for a while and it was reported as built against pd-0.44, while tcpserver was built against pd-0.45. So I rebuilt it completely, but got the same issue.

    Assuming my machine is not fast enough to print all the messages in time, I commented out many of the post() and sys_sockerror() calls in tcpserver.c, and it still crashes. I also disconnected all GUI objects that were being updated too fast.
    And here is one backtrace from a pd instance started without GUI:

    Program received signal SIGPIPE, Broken pipe.
    [Switching to Thread 0x7ffff3a63700 (LWP 1650)]
    0x00007ffff73b01d7 in __libc_send (fd=12, buf=0x7ffff3c6a1e0 <byte_buf.6739>, n=1, flags=-1) at ../sysdeps/unix/sysv/linux/x86_64/send.c:32
    
    Thread 3035 (Thread 0x7ffff3a63700 (LWP 1650)):
    #0  0x00007ffff73b01d7 in __libc_send (fd=12, buf=0x7ffff3c6a1e0 <byte_buf.6739>, n=1, flags=-1) at ../sysdeps/unix/sysv/linux/x86_64/send.c:32
    #1  0x00007ffff3a669f2 in tcpserver_broadcast_thread (arg=0x885210) at tcpserver.c:963
    #2  0x00007ffff73a8f6e in start_thread (arg=0x7ffff3a63700) at pthread_create.c:311
    #3  0x00007ffff6ecf9cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
    
    Thread 1 (Thread 0x7ffff7fc3740 (LWP 30797)):
    #0  0x00007ffff6ec7de3 in select () at ../sysdeps/unix/syscall-template.S:81
    #1  0x000000000047b10f in sys_domicrosleep.constprop.3 ()
    #2  0x0000000000476731 in m_mainloop ()
    #3  0x00007ffff6df6de5 in __libc_start_main (main=0x411800 <main>, argc=5, ubp_av=0x7fffffffe028, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe018) at libc-start.c:260
    #4  0x0000000000411831 in _start ()
    

    I have a lot of these if you need them.

    The diff of my modifications is attached.
    It doesn't remove all the prints, but it is much less verbose.

    FYI, it crashes on Ubuntu 13.10 64-bit as well, on a quad-core i7-2640M @ 2.8GHz with 8 GB RAM and an SSD.

    This issue appears in a big patch which crashes after approximately 1 hour. I made this "pathological" patch just to isolate the issue.

     
  • Martin Peach
    2014-02-10

    It seems to be tcpserver crashing.
    Using [print] on the outputs of the tcpclients and tcpserver I can see that the messages from the two tcpclients are combined in the broadcast (so 1 from the first tcpclient and 2 from the other are broadcast as a single 1 2). So tcpserver receives two simultaneous messages, combines them into one and sends it to each tcpclient. The longer message will not trigger a disconnect since it's a list 1 2 instead of a float 1 or 2 so select doesn't trigger the [disconnect(. The disconnect happens through the delay. Possibly there is some overrun happening here?
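
Whether the combined "1 2" comes from tcpserver's own buffering or from the transport is not established in this thread. In general, though, TCP is a byte stream and does not preserve message boundaries, so a receiver that needs to tell "1" and "2" apart from "1 2" has to frame its messages. Below is a minimal sketch of one common technique, a one-byte length prefix. The names `frame_encode` and `frame_decode` are hypothetical and are not part of tcpserver or tcpclient.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical framing: one length byte followed by that many payload bytes.
 * Writes the frame to out and returns its size, or 0 if out is too small. */
static size_t frame_encode(uint8_t *out, size_t cap,
                           const uint8_t *payload, uint8_t len)
{
    if (cap < (size_t)len + 1)
        return 0;
    out[0] = len;
    memcpy(out + 1, payload, len);
    return (size_t)len + 1;
}

/* Extracts one frame from the front of a receive buffer.
 * Returns the number of bytes consumed, or 0 if the frame is still incomplete
 * (the caller then keeps the bytes and waits for more data).
 * payload must have room for 255 bytes. */
static size_t frame_decode(const uint8_t *in, size_t avail,
                           uint8_t *payload, uint8_t *len)
{
    uint8_t n;

    if (avail < 1)
        return 0;
    n = in[0];
    if (avail < (size_t)n + 1)
        return 0;
    memcpy(payload, in + 1, n);
    *len = n;
    return (size_t)n + 1;
}
```

With this scheme two one-byte messages that arrive back to back in a single read are still decoded as two separate frames, because the decoder consumes exactly one length-prefixed unit per call.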

     
  • Using a [print] on the [tcpserver] output, I don't see combined data.

    Using print on [tcpclient] and [tcpserver] I see client1 receiving 1 and sometimes 2 and client2 receiving 2.

    I also made a new pathological patch, without a timeout.
    This patch crashes after only a few messages have been sent.
    It doesn't crash in gdb.
    It's attached.

    I can also trigger a crash with this scenario:
    [tcpclient] connects to the server and sends some data but doesn't get an answer (I don't know if the server receives the data).
    After that, the client is still connected, and clicking on the message to send data again crashes Pd.

    Any idea?

     
  • Martin Peach
    2014-02-13

    It might work now with the latest commit. I still don't get the same error as you though. It might be a 64-bit issue.

     
  • Unfortunately, it's still crashing. I got this:

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7ffff2052700 (LWP 12535)]
    0x0000000000474683 in clock_set ()
    [New Thread 0x7ffff2853700 (LWP 12536)]
    
    Thread 8182 (Thread 0x7ffff2853700 (LWP 12536)):
    #0  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:83
    #1  0x00007ffff73a8ea0 in __nptl_deallocate_tsd () at pthread_create.c:173
    #2  0x0000000000000000 in ?? ()
    
    Thread 8181 (Thread 0x7ffff2052700 (LWP 12535)):
    #0  0x0000000000474683 in clock_set ()
    #1  0x00007ffff42ae369 in tcpclient_child_send (w=0x7ffff413c298) at tcpclient.c:387
    #2  0x00007ffff73a8f6e in start_thread (arg=0x7ffff2052700) at pthread_create.c:311
    #3  0x00007ffff6ecf9cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
    
    Thread 1 (Thread 0x7ffff7fc3740 (LWP 4179)):
    #0  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:83
    #1  0x00007ffff73a7f25 in do_clone (pd=pd@entry=0x7ffff2853700, attr=attr@entry=0x7ffff3b73138, stackaddr=<optimized out>, stopped=stopped@entry=1, fct=0x7ffff73a8ea0 <start_thread>, clone_flags=4001536) at ../nptl/sysdeps/pthread/createthread.c:74
    #2  0x00007ffff73a9a0d in create_thread (stackaddr=<optimized out>, attr=0x7ffff3b73138, pd=0x7ffff2853700) at ../nptl/sysdeps/pthread/createthread.c:200
    #3  __pthread_create_2_1 (newthread=0x7ffff3d63540, attr=0x7ffff3b73138, start_routine=0x7ffff42ae330 <tcpclient_child_send>, arg=0x7ffff3d53538) at pthread_create.c:584
    #4  0x00007ffff42aee9c in tcpclient_send_buf (x=0x7ffff3a63010, buf_len=<optimized out>, buf=0x7ffff44b11e0 <byte_buf.6266> "\002") at tcpclient.c:369
    #5  0x00007ffff42af072 in tcpclient_send (x=0x7ffff3a63010, s=<optimized out>, argc=1, argv=<optimized out>) at tcpclient.c:343
    #6  0x00000000004664fc in pd_defaultfloat ()
    #7  0x00000000004692bf in outlet_float ()
    #8  0x0000000000472b2b in binbuf_eval ()
    #9  0x0000000000469199 in outlet_bang ()
    #10 0x00000000004bdd78 in trigger_list ()
    #11 0x0000000000469199 in outlet_bang ()
    #12 0x00000000004692bf in outlet_float ()
    #13 0x0000000000475877 in sched_tick ()
    #14 0x000000000047683b in m_mainloop ()
    #15 0x00007ffff6df6de5 in __libc_start_main (main=0x411800 <main>, argc=5, ubp_av=0x7fffffffe038, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe028) at libc-start.c:260
    #16 0x0000000000411831 in _start ()
    
     
  • Tested on 32-bit Ubuntu.
    It doesn't crash, but client1 sometimes receives unexpected data (2 instead of 1).

     
  • Martin Peach
    2014-02-23

    Which issue are you talking about? Wrong data or crashing?
    I don't get any crashes here. (debian squeeze 64bit, pd-extended 0.43.1)
    The terminal window fills with lots of identical messages like:
    tcpserver: recv: Connection reset by peer (104)
    It looks to me as though [tcpserver] in your patch will send both 2 and 1 to each client since it's broadcasting, but naturally only an open client will receive anything.

     
  • Which issue are you talking about? Wrong data or crashing?

    Both, and I got this backtrace from the patch I posted on 2014/02/13:

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7fffdddc3700 (LWP 30684)]
    0x0000000000474683 in clock_set ()
    [New Thread 0x7fffdf40f700 (LWP 30685)]
    
    Thread 1676 (Thread 0x7fffdf40f700 (LWP 30685)):
    #0  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:83
    #1  0x00007ffff73a8ea0 in __nptl_deallocate_tsd () at pthread_create.c:173
    #2  0x0000000000000000 in ?? ()
    
    Thread 1675 (Thread 0x7fffdddc3700 (LWP 30684)):
    #0  0x0000000000474683 in clock_set ()
    #1  0x00007fffde9fb36d in tcpclient_child_send (w=0x11d8108) at tcpclient.c:389
    #2  0x00007ffff73a8f6e in start_thread (arg=0x7fffdddc3700) at pthread_create.c:311
    #3  0x00007ffff6ecf9cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
    
    Thread 1 (Thread 0x7ffff7fc3740 (LWP 28905)):
    #0  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:83
    #1  0x00007ffff73a7f25 in do_clone (pd=pd@entry=0x7fffdf40f700, attr=attr@entry=0x7ffffffed8e0, stackaddr=<optimized out>, stopped=stopped@entry=1, fct=0x7ffff73a8ea0 <start_thread>, clone_flags=4001536) at ../nptl/sysdeps/pthread/createthread.c:74
    #2  0x00007ffff73a9a0d in create_thread (stackaddr=<optimized out>, attr=0x7ffffffed8e0, pd=0x7fffdf40f700) at ../nptl/sysdeps/pthread/createthread.c:200
    #3  __pthread_create_2_1 (newthread=0x7ffffffec8d8, attr=0x7ffffffed8e0, start_routine=0x7fffde7d47f0 <tcpserver_send_buf_thread>, arg=0x851a80) at pthread_create.c:584
    #4  0x00007fffde7d526b in tcpserver_send_bytes (client=0, x=0xe55b50, argc=1, argv=0x7ffffffedad0) at tcpserver.c:509
    #5  0x0000000000467a29 in pd_typedmess ()
    #6  0x000000000046947a in outlet_anything ()
    #7  0x000000000046795b in pd_typedmess ()
    #8  0x00000000004728aa in binbuf_eval ()
    #9  0x00000000004693ea in outlet_list ()
    #10 0x00000000004bc353 in pack_bang ()
    #11 0x00000000004693ea in outlet_list ()
    #12 0x00000000004bdc9d in trigger_list ()
    #13 0x00000000004be020 in trigger_float ()
    #14 0x00000000004692bf in outlet_float ()
    #15 0x00007fffde7d3e04 in tcpserver_socketreceiver_doread (x=0x759710) at tcpserver.c:252
    #16 0x000000000047b17a in sys_domicrosleep.constprop.3 ()
    #17 0x000000000047cf5a in sys_pollgui ()
    #18 0x000000000047664e in m_mainloop ()
    #19 0x00007ffff6df6de5 in __libc_start_main (main=0x411800 <main>, argc=3, ubp_av=0x7fffffffe048, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe038) at libc-start.c:260
    #20 0x0000000000411831 in _start ()
    

    This happens on Ubuntu 13.10 with Pd 0.45-4.

    It looks to me as though [tcpserver] in your patch will send both 2 and 1 to each client since it's broadcasting, but naturally only an open client will receive anything.

    According to tcpserver-help.pd, the message [send <socket#> <list>( sends the list to the client on socket socket#.
    This is what I'm using in my patch.
    Moreover, in this Pd patch, the client disconnects only when it receives data from the server.
    But if the server sends wrong data to a client, then that client disconnects, connects again (at least if [tcpclient] reports the connection status correctly) and sends data again.

     
  • Martin Peach
    2014-02-23

    That crash report is for tcpclient. Are you using the latest version?

     
  • That crash report is for tcpclient.

    Right, but it's triggered by the same patch so I think it's related and I didn't open a new issue.

    Are you using the latest version?

    I think so: I ran make clean, then svn update, then make.

    The test patch is running on 3 different computers today (Ubuntu 12.04 64-bit, Ubuntu 13.10 64-bit and Ubuntu 12.04 32-bit), and it crashes only on the 64-bit ones.
    Pd crashes sooner on its own than under gdb or valgrind.
    I still get wrong data on the client side; I will open a separate issue for that.
    Pd also sometimes hangs and then crashes, but this happened while I was interacting with the GUI.
    In any case, outside of gdb or valgrind the patch crashes after a few send/receive iterations (less than 5 seconds after loadbang).

    Here is a backtrace recorded today :

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7ffff2b0b700 (LWP 20734)]
    0x0000000000474683 in clock_set ()
    [New Thread 0x7ffff330c700 (LWP 20735)]
    
    Thread 40992 (Thread 0x7ffff330c700 (LWP 20735)):
    #0  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:84
    #1  0x00007ffff73acdc0 in ?? () at pthread_create.c:172 from /lib/x86_64-linux-gnu/libpthread.so.0
    #2  0x00007ffff330c700 in ?? ()
    #3  0x0000000000000000 in ?? ()
    
    Thread 40991 (Thread 0x7ffff2b0b700 (LWP 20734)):
    #0  0x0000000000474683 in clock_set ()
    #1  0x00007ffff3b5835d in tcpclient_child_send (w=0x7ffff341d178) at tcpclient.c:389
    #2  0x00007ffff73ace9a in start_thread (arg=0x7ffff2b0b700) at pthread_create.c:308
    #3  0x00007ffff6ed53fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
    #4  0x0000000000000000 in ?? ()
    
    Thread 1 (Thread 0x7ffff7fd5740 (LWP 11762)):
    #0  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:84
    #1  0x00007ffff73abf70 in do_clone (pd=0x7ffff330c700, attr=0x7ffffffed940, stackaddr=<optimized out>, stopped=1, fct=0x7ffff73acdc0 <start_thread>, clone_flags=4001536) at ../nptl/sysdeps/pthread/createthread.c:75
    #2  0x00007ffff73ad8ba in create_thread (stackaddr=<optimized out>, attr=0x7ffffffed940, pd=0x7ffff330c700) at ../nptl/sysdeps/pthread/createthread.c:212
    #3  __pthread_create_2_1 (newthread=0x7ffffffec938, attr=0x7ffffffed940, start_routine=0x7ffff36207e0 <tcpserver_send_buf_thread>, arg=0x7632b0) at pthread_create.c:566
    #4  0x00007ffff362125b in tcpserver_send_bytes (client=0, x=0x7ffff7ec1010, argc=1, argv=0x7ffffffedb30) at tcpserver.c:509
    #5  0x0000000000467a29 in pd_typedmess ()
    #6  0x000000000046947a in outlet_anything ()
    #7  0x000000000046795b in pd_typedmess ()
    #8  0x00000000004728aa in binbuf_eval ()
    #9  0x00000000004693ea in outlet_list ()
    #10 0x00000000004bc353 in pack_bang ()
    #11 0x00000000004693ea in outlet_list ()
    #12 0x00000000004bdc9d in trigger_list ()
    #13 0x00000000004be020 in trigger_float ()
    #14 0x00000000004692bf in outlet_float ()
    #15 0x00007ffff361fdf4 in tcpserver_socketreceiver_doread (x=0x764700) at tcpserver.c:252
    #16 0x000000000047b17a in sys_domicrosleep.constprop.3 ()
    #17 0x000000000047cf5a in sys_pollgui ()
    #18 0x000000000047664e in m_mainloop ()
    #19 0x00007ffff6e0276d in __libc_start_main (main=0x411800 <main>, argc=4, ubp_av=0x7fffffffe0a8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe098) at libc-start.c:226
    #20 0x0000000000411831 in _start ()
    

    So the issue may be in several places: on the tcpserver side (which may now be fixed by your latest changes) and in tcpclient.
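
The SIGSEGV backtraces above all show clock_set() being entered from tcpclient_child_send, that is, from a sender thread rather than from Pd's main thread. One plausible reading is that the scheduler's clock list is being modified concurrently with the main loop. This is not confirmed in the thread. The following is a minimal sketch of a common alternative, assuming plain pthreads: the worker only posts its result into a mutex-protected mailbox, and the main thread picks it up when it polls. All names here (`send_mailbox`, `worker_send`, `mailbox_poll`) are hypothetical and do not come from tcpclient.c.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mailbox shared by one sender thread and the main thread. */
typedef struct {
    pthread_mutex_t lock;
    bool done;    /* set by the worker once the send has finished */
    int  result;  /* value the worker wants to report */
} send_mailbox;

static void mailbox_init(send_mailbox *m)
{
    pthread_mutex_init(&m->lock, NULL);
    m->done = false;
    m->result = 0;
}

/* Worker thread: performs the blocking work, then only posts the outcome.
 * It never calls into the scheduler itself. */
static void *worker_send(void *arg)
{
    send_mailbox *m = arg;
    int result = 1;  /* placeholder for the return value of a blocking send() */

    pthread_mutex_lock(&m->lock);
    m->result = result;
    m->done = true;
    pthread_mutex_unlock(&m->lock);
    return NULL;
}

/* Main thread: called periodically. Returns true exactly once per posted
 * result and stores that result in *result. */
static bool mailbox_poll(send_mailbox *m, int *result)
{
    bool ready;

    pthread_mutex_lock(&m->lock);
    ready = m->done;
    if (ready) {
        *result = m->result;
        m->done = false;
    }
    pthread_mutex_unlock(&m->lock);
    return ready;
}
```

In an external, the polling side would run from a callback that already executes on the main thread, so any outlet or clock calls that follow stay on that thread.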

     

