|
From: Sunny D. <int...@ya...> - 2012-07-16 00:36:22
|
Folks,

I need help with an issue while trying to run memcheck on one of our internal programs. These programs used to work in our earlier versions with valgrind 3.5 and glibc 2.11. We had to upgrade glibc to 2.13, and now the latest valgrind versions as well as the SVN version are broken. The following is a run with the SVN code as of today.

Core was generated by `valgrind --leak-check=full --num-callers=24 --error-limit=no --show-reachable=y'.
Program terminated with signal 11, Segmentation fault.
#0 vgPlain_get_StackTrace_wrk (tid_if_known=<optimized out>, ips=0x4030dd040, max_n_ips=100, sps=0x0, fps=0x0, startRegs=<optimized out>, fp_max_orig=34342965240) at m_stacktrace.c:334
334 m_stacktrace.c: No such file or directory.
(gdb) bt
#0 vgPlain_get_StackTrace_wrk (tid_if_known=<optimized out>, ips=0x4030dd040, max_n_ips=100, sps=0x0, fps=0x0, startRegs=<optimized out>, fp_max_orig=34342965240) at m_stacktrace.c:334
#1 0x00000000380456d3 in vgPlain_get_StackTrace (tid=1, ips=0x4030dd040, max_n_ips=100, sps=0x0, fps=<optimized out>, first_ip_delta=0) at m_stacktrace.c:1086
#2 0x0000000038045889 in vgPlain_get_and_pp_StackTrace (tid=<optimized out>, max_n_ips=<optimized out>) at m_stacktrace.c:1125
#3 0x00000000380302d3 in vgPlain_show_sched_status () at m_libcassert.c:213
#4 0x00000000380303b3 in report_and_quit (report=0x3826e859 "www.valgrind.org", startRegsIN=<optimized out>) at m_libcassert.c:253
#5 0x000000003803043f in panic (name=0x38279d6b "valgrind", report=0x3826e859 "www.valgrind.org", str=<optimized out>, startRegs=0x4030dd7a0) at m_libcassert.c:319
#6 0x0000000038030649 in vgPlain_core_panic_at (str=0x38ad2ae8 "\023v\277\n", startRegs=0x9) at m_libcassert.c:324
#7 0x0000000038044579 in sync_signalhandler_from_kernel (uc=0x4030dd800, info=<optimized out>, sigNo=11, tid=1) at m_signals.c:2433
#8 sync_signalhandler (sigNo=11, info=<optimized out>, uc=0x4030dd800) at m_signals.c:2490
#9 0x0000000038042220 in ?? ()
#10 0x0000000000000000 in ?? ()

Let me know if you need to see some variables or state. Unfortunately, I can't ship the core or the binaries per company policy...:( Looks like the program generated a SEGV and valgrind is trying to print the program's stack, and that's when it's dying.

Any ways to work around this? Any other ideas? Can I make valgrind not try to dump the stack and just dump core?

One important piece of info: all the programs are dying with the same stack trace and all of them link against the GNU PTH library.

Thanks!
-Sunny |
|
From: Julian S. <js...@ac...> - 2012-07-16 09:01:03
|
Can you try with r12748. There was a change relating to stack unwinding and segfaults at r12749 and I would like to be sure that this is not complicating matters.

svn co -r12748 svn://svn.valgrind.org/valgrind/trunk

to get a suitable tree.

J |
|
From: Tom H. <to...@co...> - 2012-07-16 10:07:30
|
On 16/07/12 01:36, Sunny Das wrote:

> I need help with an issue while trying to run memcheck on one of our internal programs. These programs used to work in our earlier versions with valgrind 3.5 and glibc 2.11. We had to upgrade glibc to 2.13, and now, the latest valgrind versions as well as the SVN version are broken. The following is a run with the SVN code as of today.
>
> Core was generated by `valgrind --leak-check=full --num-callers=24 --error-limit=no --show-reachable=y'.
> Program terminated with signal 11, Segmentation fault.
> #0 vgPlain_get_StackTrace_wrk (tid_if_known=<optimized out>, ips=0x4030dd040, max_n_ips=100,
> sps=0x0, fps=0x0, startRegs=<optimized out>, fp_max_orig=34342965240) at m_stacktrace.c:334
> 334 m_stacktrace.c: No such file or directory.

The problem is that it has fallen back to a frame pointer based unwind, but that will generally fail on amd64 because the compiler will default to not using frame pointers, so when valgrind tries to dereference what it thinks is the frame pointer it will likely crash.

This suggests that you have some code that hasn't been compiled with DWARF unwind information, as valgrind will use that in preference to doing a frame pointer based unwind.

Given that you say this only happens with the pth library, maybe that is what doesn't have unwind information?

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/ |
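To make the failure mode Tom describes concrete, here is a toy Python model of a frame pointer based unwind. It is only a sketch and not valgrind's code: the `memory` dictionary, the addresses, and the names `fp_unwind`, `read_word` and `UnmappedAddress` are all made up for illustration. It assumes the usual amd64 layout in which the word at the frame pointer holds the caller's frame pointer and the word 8 bytes above it holds the return address.

```python
class UnmappedAddress(Exception):
    """Raised when the toy unwinder reads an address that is not mapped."""


def read_word(memory, addr):
    # A real unwinder dereferences a raw pointer here; an unmapped address
    # would fault (SIGSEGV) rather than raise a Python exception.
    if addr not in memory:
        raise UnmappedAddress(hex(addr))
    return memory[addr]


def fp_unwind(memory, ip, fp, max_frames=16):
    """Follow saved frame pointers: [fp] = caller's fp, [fp + 8] = return address."""
    ips = [ip]
    while fp != 0 and len(ips) < max_frames:
        return_addr = read_word(memory, fp + 8)
        fp = read_word(memory, fp)
        ips.append(return_addr)
    return ips


# Hypothetical stack of code built WITH frame pointers: each frame links to its caller.
memory = {
    0x7000: 0x7020, 0x7008: 0x401100,  # innermost frame
    0x7020: 0x0,    0x7028: 0x401200,  # outermost frame, chain ends at 0
}
trace = fp_unwind(memory, ip=0x401050, fp=0x7000)

# Code built WITHOUT frame pointers uses the register for ordinary data,
# so the "frame pointer" is just some value (0x12345678 is arbitrary here).
try:
    fp_unwind(memory, ip=0x401050, fp=0x12345678)
    crashed = False
except UnmappedAddress:
    crashed = True
```

In the first call the chain is intact and the walk yields three code addresses. In the second call the very first dereference lands on an unmapped address, which is the toy equivalent of the segfault inside vgPlain_get_StackTrace_wrk.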
|
From: Julian S. <js...@ac...> - 2012-07-16 12:18:44
|
On Monday, July 16, 2012, Tom Hughes wrote:

> Given that you say this only happens with the pth library, maybe that is
> what doesn't have unwind information?

It might be that pth has some handwritten bits of assembly around (very likely, for a threading library) and they don't have CFI info. (random guess)

J |
|
From: Sunny D. <int...@ya...> - 2012-07-16 19:24:04
|
You are a life saver Tom. Indeed, pth wasn't built correctly. Once that was taken care of, I am up and running.

Thanks a bunch!
-Sunny |
|
From: Sunny D. <int...@ya...> - 2012-07-16 22:53:30
|
Any idea what this could be? All programs are now running fine with revision r12748, but this one program (compiled and linked the same way as the others) still dies like this:

Core was generated by `valgrind --leak-check=full --num-callers=24 --error-limit=no --show-reachable=y'.
Program terminated with signal 11, Segmentation fault.
#0 vgModuleLocal_read_ULong (data=0x100000083 <Address 0x100000083 out of bounds>) at m_debuginfo/misc.c:195
195 m_debuginfo/misc.c: No such file or directory.
(gdb) bt
#0 vgModuleLocal_read_ULong (data=0x100000083 <Address 0x100000083 out of bounds>) at m_debuginfo/misc.c:195
#1 0x000000003805cef6 in vgPlain_use_CF_info (uregsHere=0x4030dced0, min_accessible=200117488, max_accessible=34342965240) at m_debuginfo/debuginfo.c:2489
#2 0x00000000380454a3 in vgPlain_get_StackTrace_wrk (tid_if_known=<optimized out>, ips=0x4030dcfc0, max_n_ips=100, sps=0x0, fps=0x0, startRegs=<optimized out>, fp_max_orig=34342965240) at m_stacktrace.c:309
#3 0x0000000038045663 in vgPlain_get_StackTrace (tid=1, ips=0x4030dcfc0, max_n_ips=100, sps=0x0, fps=<optimized out>, first_ip_delta=0) at m_stacktrace.c:1086
#4 0x0000000038045819 in vgPlain_get_and_pp_StackTrace (tid=<optimized out>, max_n_ips=<optimized out>) at m_stacktrace.c:1125
#5 0x00000000380302d3 in vgPlain_show_sched_status () at m_libcassert.c:213
#6 0x00000000380303b3 in report_and_quit (report=0x3826e7f9 "www.valgrind.org", startRegsIN=<optimized out>) at m_libcassert.c:253
#7 0x000000003803043f in panic (name=0x38279d0b "valgrind", report=0x3826e7f9 "www.valgrind.org", str=<optimized out>, startRegs=0x4030dd720) at m_libcassert.c:319
#8 0x0000000038030649 in vgPlain_core_panic_at (str=0x7ff000ff0 "cg.org", startRegs=0x4009621) at m_libcassert.c:324
#9 0x0000000038044509 in sync_signalhandler_from_kernel (uc=0x4030dd780, info=<optimized out>, sigNo=11, tid=1) at m_signals.c:2413
#10 sync_signalhandler (sigNo=11, info=<optimized out>, uc=0x4030dd780) at m_signals.c:2470
#11 0x0000000038042220 in ?? ()
#12 0x0000000000000000 in ?? ()

-Sunny |
|
From: Sunny D. <int...@ya...> - 2012-07-17 19:59:28
|
Guys,

Please help. I have no idea what this SEGV is about.

Thanks a bunch!
-Sunny |
|
From: John R. <jr...@bi...> - 2012-07-17 21:44:08
|
On 07/17/2012 12:59 PM, Sunny Das wrote:

> Please help. I have no idea what this SEGV is about.

Having "no idea" is part of why you get no response.

_You_ *should* be able to help a lot. From frames #9 and #10, it can be seen that the kernel sent a signal to the process. Find out as much as you can about that signal. Run valgrind under strace and/or gdb, and show us the relevant info: the faulting address, pc, instruction stream surrounding that pc, CPU registers, 32 words before and after stack pointer, address space [(gdb) info proc; (gdb) shell cat /proc/<PID>/maps], etc.

From frames #2, #1, and #0, it can be seen that valgrind gets confused about debug info. Show us the debuginfo for your program: "objdump --debugging ./my_app" If you can't post it, then try to figure out what is unique about the debuginfo for ./my_app. Compare it with the debuginfo for a similar app that does work under valgrind, and show us the _logical_ differences. This requires thinking (!), not just copy+paste of command lines and [attaching compressed] text output, not _just_ running 'diff', etc.

Does using the valgrind option --read-var-info=yes help?
What about --debug-dump=frames or --debug-dump=syms ?
And of course, run with "valgrind --verbose ..." to start with.

> Core was generated by `valgrind --leak-check=full --num-callers=24 --error-limit=no --show-reachable=y'.
> Program terminated with signal 11, Segmentation fault.
> #0 vgModuleLocal_read_ULong (data=0x100000083 <Address 0x100000083 out of bounds>)
> at m_debuginfo/misc.c:195
> 195 m_debuginfo/misc.c: No such file or directory.
> (gdb) bt
> #0 vgModuleLocal_read_ULong (data=0x100000083 <Address 0x100000083 out of bounds>)
> at m_debuginfo/misc.c:195
> #1 0x000000003805cef6 in vgPlain_use_CF_info (uregsHere=0x4030dced0, min_accessible=200117488,
> max_accessible=34342965240) at m_debuginfo/debuginfo.c:2489
> #2 0x00000000380454a3 in vgPlain_get_StackTrace_wrk (tid_if_known=<optimized out>,
> ips=0x4030dcfc0, max_n_ips=100, sps=0x0, fps=0x0, startRegs=<optimized out>,
> fp_max_orig=34342965240) at m_stacktrace.c:309
> #9 0x0000000038044509 in sync_signalhandler_from_kernel (uc=0x4030dd780, info=<optimized out>,
> sigNo=11, tid=1) at m_signals.c:2413
> #10 sync_signalhandler (sigNo=11, info=<optimized out>, uc=0x4030dd780) at m_signals.c:2470

-- |
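The way John reads the backtrace (signal delivery in frames #9 and #10, debug-info and unwinding code in frames #0 to #2) can be sketched mechanically. The Python below is a rough illustration only: the regular expression and the helper names `parse_frames` and `frames_in` are assumptions for this example, and the pattern only handles frames written on a single line in the style gdb printed here. Frames without a source location, such as the "?? ()" ones, are simply skipped.

```python
import re

# The five frames quoted in the message above, one frame per line.
BACKTRACE = """\
#0 vgModuleLocal_read_ULong (data=0x100000083 <Address 0x100000083 out of bounds>) at m_debuginfo/misc.c:195
#1 0x000000003805cef6 in vgPlain_use_CF_info (uregsHere=0x4030dced0, min_accessible=200117488, max_accessible=34342965240) at m_debuginfo/debuginfo.c:2489
#2 0x00000000380454a3 in vgPlain_get_StackTrace_wrk (tid_if_known=<optimized out>, ips=0x4030dcfc0, max_n_ips=100, sps=0x0, fps=0x0, startRegs=<optimized out>, fp_max_orig=34342965240) at m_stacktrace.c:309
#9 0x0000000038044509 in sync_signalhandler_from_kernel (uc=0x4030dd780, info=<optimized out>, sigNo=11, tid=1) at m_signals.c:2413
#10 sync_signalhandler (sigNo=11, info=<optimized out>, uc=0x4030dd780) at m_signals.c:2470
"""

# "#N [0xADDR in] function (args) at file:line"
FRAME_RE = re.compile(
    r"#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\S+) \(.*?\) at (\S+):(\d+)"
)


def parse_frames(text):
    """Return (frame number, function, source file, line) for each frame."""
    return [
        (int(num), func, src, int(line))
        for num, func, src, line in FRAME_RE.findall(text)
    ]


def frames_in(frames, prefixes):
    """Frame numbers whose source file starts with one of the given prefixes."""
    return [num for num, _func, src, _line in frames if src.startswith(prefixes)]


frames = parse_frames(BACKTRACE)
signal_frames = frames_in(frames, ("m_signals.c",))
unwinder_frames = frames_in(frames, ("m_debuginfo/", "m_stacktrace.c"))
```

Grouping frames by source file this way shows the two halves of the story: the outer frames are valgrind's synchronous signal handler, and the innermost frames are the stack unwinder reading call-frame information.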
|
From: Sunny D. <int...@ya...> - 2012-07-19 12:16:24
|
John,

I get what you are saying. But when you have many things to deliver on a project, you don't have time for this kind of troubleshooting of the support tools that you are using to troubleshoot and fix your real issues. This is a boot-time daemon with timeout-based interactions in a cluster setup where nodes start to get evicted when bad things happen (like a critical daemon dying), and there is a whole lot of shebang that's needed to get it to a state where I can vgdb into it.

I was throwing it out there with the hope that someone would look at the backtrace and go "I know what that could be", like the last one with the missing DWARF info. Since that did not happen, I think we are left with the deep-dive vgdb option only. Right now I am debugging something else, but I will come back to this. And I am gonna need your help.

Thanks,
-Sunny |
|
From: Julian S. <js...@ac...> - 2012-07-19 13:37:44
|
Sunny,

Try looking at it like this. Valgrind developer time is very limited. Your bug report is (effectively) competing for that time, against all other bug reports and all other tasks that the developers have to do. Some of the other bug reports you are competing against are pretty impressive, with detailed analyses and instructions on how to reproduce.

In order for your bug to "get ahead" relative to the vast stack of other bugs and tasks, you need to make your bug "more competitive", for lack of a better phrase:

* file a bug report, as per http://valgrind.org/support/bug_reports.html, and continue the discussion on the report, not on this mailing list. Bugs notified to the developer lists tend to get forgotten about or lost.

* be prepared to give how-to-reproduce information, and accurate details as required.

W.r.t. the original stack trace: Valgrind segfaulted, and then, in the process of trying to produce a stack trace showing where the segfault was, it got a second segfault. This can be seen (it is not obvious) from the outermost frames in the trace:

#10 sync_signalhandler (sigNo=11, info=<optimized out>, uc=0x4030dd780) at m_signals.c:2470
#11 0x0000000038042220 in ?? ()
#12 0x0000000000000000 in ?? ()

0x0000000038042220 is a code address inside the tool executable itself (memcheck/memcheck-amd64-linux, or whatever you were running). A first step would be to use objdump and addr2line to find out what point in the valgrind source code this corresponds to.

J |
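One way to check which mapping an address such as 0x0000000038042220 falls into is to look it up in the output of /proc/<PID>/maps, which was requested earlier in the thread. The Python below is a minimal sketch of that lookup. The `SAMPLE_MAPS` text is invented for illustration (the real ranges, inode numbers and install path will differ on your machine), and `find_mapping` is a hypothetical helper, not part of valgrind or gdb.

```python
# Hypothetical /proc/<PID>/maps excerpt; the real ranges and paths will differ.
SAMPLE_MAPS = """\
38000000-3839a000 r-xp 00000000 08:01 123456 /usr/lib/valgrind/memcheck-amd64-linux
3859a000-3859d000 rw-p 0039a000 08:01 123456 /usr/lib/valgrind/memcheck-amd64-linux
4030dc000-4030de000 rw-p 00000000 00:00 0
"""


def find_mapping(maps_text, addr):
    """Return (start, end, perms, path) of the mapping containing addr, or None."""
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        if start <= addr < end:
            path = fields[5].strip() if len(fields) > 5 else ""
            return start, end, fields[1], path
    return None


# Frame #11 of the backtrace: lands in the executable text of the tool.
tool_hit = find_mapping(SAMPLE_MAPS, 0x0000000038042220)

# The pointer passed to vgModuleLocal_read_ULong: not mapped in this sample,
# which matches gdb's "out of bounds" remark.
bad_ptr_hit = find_mapping(SAMPLE_MAPS, 0x100000083)
```

With the real maps output in place of the sample, a hit in an r-xp mapping of the tool binary confirms that the address is valgrind's own code, and the same address can then be handed to addr2line together with that binary.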