From: David E. <da...@2g...> - 2003-04-02 17:30:26
|
Hello,

First, the obligatory credits: Thank you very much for writing valgrind!

When I recently subscribed to this mailing list I read a message from Nicholas Nethercote where he said that 1.9.4 was more stable and worked better than 1.0.4, so I thought I'd give it a try.

However, it seems to me that something has changed between the two versions, maybe regarding the combination of threads and sockets.

I am running RedHat 8.0, with the latest updates installed:

kernel-2.4.18-27.8.0
glibc-2.3.2-4.80
gcc-3.2-7

The application I'm examining with valgrind is a threaded server application written in C that uses glib. It is quite large and complicated to install, so there is no point in providing it.

I have tried to make a scaled-down example to reproduce my problem. It is not finished yet, but I thought I would describe the problem in case anyone has any suggestions:

The server creates a unix socket and makes glib poll() the socket for events. A client connects to the server, which accept()s the connection and creates a new thread for handling the connection.

Simple "printf debugging" suggests that the new thread never gets scheduled to run in valgrind 1.9.4.

For the fun of it, I tried sending a SIGHUP to my process. There is a signal handler installed with signal() for SIGHUP, and it gets executed properly. What also happens is that all threads (one for each connection that has been made to the server) get to run!

Do you have any ideas?

I will happily provide any more information that may be of interest.

Regards,

-\- David Eriksson -/- www.2GooD.nu
"I personally refuse to use inferior tools because of ideology." - Linus Torvalds |
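For readers who want something concrete to experiment with, the design described above (listen on a unix socket, poll() it, accept(), then hand the connection to a new thread) can be sketched roughly as follows. This is not David's server and not his unfinished test case: it uses plain poll() instead of the glib main loop, installs no SIGHUP handler, and the names `listen_unix`, `connect_unix`, `serve_one` and `handle_connection` are made up for illustration.

```c
/* Sketch only: thread-per-connection server on a unix socket. */
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in a unix-domain socket address for `path`. */
static void fill_addr(struct sockaddr_un *addr, const char *path)
{
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    snprintf(addr->sun_path, sizeof addr->sun_path, "%s", path);
}

/* Worker: the "printf debugging" line shows whether it ever runs. */
void *handle_connection(void *arg)
{
    int fd = (int)(intptr_t)arg;
    static const char msg[] = "hello\n";

    fprintf(stderr, "worker: running for fd %d\n", fd);
    if (write(fd, msg, sizeof msg - 1) < 0)
        perror("write");
    close(fd);
    return NULL;
}

/* Create a listening unix socket at `path`; returns the fd or -1. */
int listen_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    fill_addr(&addr, path);
    unlink(path); /* remove a stale socket file, if any */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client side: connect to the socket at `path`; returns the fd or -1. */
int connect_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    fill_addr(&addr, path);
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* poll() the listening socket, accept one connection and start a
 * detached worker thread for it. Returns 0 on success, -1 otherwise. */
int serve_one(int listen_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };
    pthread_t tid;
    int conn;

    if (poll(&pfd, 1, timeout_ms) <= 0)
        return -1;
    conn = accept(listen_fd, NULL, NULL);
    if (conn < 0)
        return -1;
    if (pthread_create(&tid, NULL, handle_connection,
                       (void *)(intptr_t)conn) != 0) {
        close(conn);
        return -1;
    }
    pthread_detach(tid);
    return 0;
}
```

A driver would call `listen_unix()` once and then `serve_one()` in a loop, so that the main thread goes straight back into poll() after creating each worker. Build with `-pthread`. Whether this small version shows the same hang under valgrind 1.9.4 is untested.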
From: Jeff J. <jb...@re...> - 2003-04-02 17:41:38
|
On Wed, Apr 02, 2003 at 07:02:19PM +0200, David Eriksson wrote:
> [...]
> I am running RedHat 8.0, with the latest updates installed:
>
> kernel-2.4.18-27.8.0
> glibc-2.3.2-4.80
> gcc-3.2-7
> [...]
> Do you have any ideas?

FYI: You're running an NPTL-capable library with an NPTL-deprived kernel.

Meanwhile, prefix your valgrind command with

    LD_ASSUME_KERNEL=2.2.5 valgrind ...

to force use of Good Old libpthread.

73 de Jeff

--
Jeff Johnson ARS N3NPQ
jb...@re... (jb...@jb...)
Chapel Hill, NC |
From: Jeremy F. <je...@go...> - 2003-04-02 17:51:45
|
On Wed, 2003-04-02 at 09:41, Jeff Johnson wrote:
> FYI: You're running a NPTL capable library with a NPTL deprived kernel.
>
> Meanwhile, prefix your valgrind command with
> LD_ASSUME_KERNEL=2.2.5 valgrind ...
> to force use of Good Old libpthread.

Are you sure? There's only one libpthread in that glibc package; can the one library do both NPTL and linuxthreads?

J |
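One way to see which threading implementation a process actually gets, with and without `LD_ASSUME_KERNEL` set, is to ask the C library directly. The sketch below assumes a glibc that knows the `_CS_GNU_LIBPTHREAD_VERSION` name for confstr(), which is a GNU extension; whether the glibc-2.3.2-4.80 package mentioned in this thread supports it has not been checked here. The function name `threading_impl` is made up for illustration.

```c
/* Sketch only: report the threading library the C library says it uses. */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Copy the threading library name and version into buf.
 * Returns 0 on success, -1 if it is unavailable or does not fit. */
int threading_impl(char *buf, size_t len)
{
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);

    /* 0 means the name is not supported; n > len means truncation. */
    if (n == 0 || n > len)
        return -1;
    return 0;
#else
    (void)buf;
    (void)len;
    return -1; /* this C library does not offer the GNU extension */
#endif
}
```

Printing the buffer from a small main() in both environments would show whether the variable changes the answer. Note that this only describes the system library; under valgrind 1.9.4 a replacement libpthread is loaded instead, as David's follow-up points out.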
From: David E. <da...@2g...> - 2003-04-02 18:50:30
|
On Wed, 2003-04-02 at 19:41, Jeff Johnson wrote:
> On Wed, Apr 02, 2003 at 07:02:19PM +0200, David Eriksson wrote:
> > [...]
> > Do you have any ideas?
>
> FYI: You're running a NPTL capable library with a NPTL deprived kernel.
>
> Meanwhile, prefix your valgrind command with
> LD_ASSUME_KERNEL=2.2.5 valgrind ...
> to force use of Good Old libpthread.

Well, the system libpthread is never loaded when using valgrind, since valgrind maintains its own thread scheduling and loads its own replacement:

...
==29456== Reading syms from /usr/local/lib/valgrind/libpthread.so
...

--
-\- David Eriksson -/- www.2GooD.nu
"I personally refuse to use inferior tools because of ideology." - Linus Torvalds |
From: Jeremy F. <je...@go...> - 2003-04-02 17:53:51
|
On Wed, 2003-04-02 at 09:02, David Eriksson wrote:
> For the fun of it, I tried to send a SIGHUP to my process. There is
> singal handler installed with signal() for SIGHUP, which gets executed
> properly. What also happens is that all threads (one for each connection
> that have been made to the server) get to run!
>
> Do you have any ideas?
>
> I will happily provide any more information that may be of interest.

What does strace have to say about your running process? Is it blocked in a system call, or does it seem to be spinning away happily?

It's possible that the libc poll (or some other blocking system call) is somehow getting called without being intercepted by Valgrind.

J |
From: David E. <da...@2g...> - 2003-04-02 18:56:33
|
On Wed, 2003-04-02 at 19:53, Jeremy Fitzhardinge wrote:
> What does strace have to say about your running process? Is it
> blocked in a a system call, or does it seem to be spinning away
> happily?
>
> It's possible that the libc poll (or some other blocking system call)
> is somehow getting called without being intercepted by Valgrind.

Strace stops in poll, and if I attach to the server process with gdb I get this stack trace:

(gdb) bt
#0 0x40183272 in vgPlain_do_syscall () from /usr/local/lib/valgrind/valgrind.so
#1 0x4023c4d0 in __JCR_LIST__ () from /usr/lib/libglib-1.2.so.0
#2 0x40170c97 in poll (__fds=0x4223bf3c, __nfds=0x2, __timeout=0xea60) at vg_intercept.c:194
#3 0x4022a3cb in g_main_poll () from /usr/lib/libglib-1.2.so.0
#4 0x40229c95 in g_main_iterate () from /usr/lib/libglib-1.2.so.0
#5 0x4022a0f4 in g_main_run () from /usr/lib/libglib-1.2.so.0
#6 0x0804c671 in main (argc=0x0, argv=0xbfffe8e4) at smaccd.c:616
#7 0x403d3907 in __libc_start_main () from /lib/libc.so.6

--
-\- David Eriksson -/- www.2GooD.nu
"I personally refuse to use inferior tools because of ideology." - Linus Torvalds |
From: Jeremy F. <je...@go...> - 2003-04-02 19:22:31
|
On Wed, 2003-04-02 at 10:28, David Eriksson wrote:
> Strace stops in poll, and if I attach to the server process with gdb I
> get this stacktrace:
>
> (gdb) bt
> #0 0x40183272 in vgPlain_do_syscall () from /usr/local/lib/valgrind/valgrind.so
> #1 0x4023c4d0 in __JCR_LIST__ () from /usr/lib/libglib-1.2.so.0
> #2 0x40170c97 in poll (__fds=0x4223bf3c, __nfds=0x2, __timeout=0xea60) at vg_intercept.c:194
> #3 0x4022a3cb in g_main_poll () from /usr/lib/libglib-1.2.so.0
> #4 0x40229c95 in g_main_iterate () from /usr/lib/libglib-1.2.so.0
> #5 0x4022a0f4 in g_main_run () from /usr/lib/libglib-1.2.so.0
> #6 0x0804c671 in main (argc=0x0, argv=0xbfffe8e4) at smaccd.c:616
> #7 0x403d3907 in __libc_start_main () from /lib/libc.so.6

Hm, looks like the vg_intercept stuff isn't working: it's catching poll, but it isn't passing it into the threads library properly.

What does ldd <your program> say?
What is in /proc/<pid>/maps when you run it?
What does the link command line look like?

Did you manage to get a small standalone program to reproduce the problem?

J |
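The /proc/<pid>/maps question boils down to checking which copies of libpthread and libc are mapped into the process. As a rough illustration, a small filter over a maps-style file could look like the sketch below. The function name `show_mappings` is made up for illustration, and it assumes each mapping line is shorter than the 4096-byte buffer (a longer line would be read in pieces and could be counted twice).

```c
/* Sketch only: list the lines of a maps-style file that mention a string. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Print every line of `maps_path` containing `needle`.
 * Returns the number of matching lines, or -1 if the file cannot be opened. */
int show_mappings(const char *maps_path, const char *needle)
{
    char line[4096];
    int hits = 0;
    FILE *fp = fopen(maps_path, "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (strstr(line, needle) != NULL) {
            fputs(line, stdout);
            hits++;
        }
    }
    fclose(fp);
    return hits;
}
```

Called as `show_mappings("/proc/<pid>/maps", "libpthread")` with the real pid filled in, it prints the path of each libpthread mapping, which should make it obvious whether the process is using the copy under /usr/local/lib/valgrind or the system one.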