From: Charlie S. <csh...@ho...> - 2004-10-19 17:58:08
Valgrind gurus,
I've been using Valgrind 2.2.0 without any problems. We just started
using Oracle 9 and now my applications don't run with Valgrind 2.2.0.
They run OK with Valgrind 2.0.0. I get the following and then Valgrind
terminates:
==10644== warning: Valgrind's pthread_attr_destroy does nothing
==10644== your program may misbehave as a result
==10644== warning: Valgrind's pthread_attr_destroy does nothing
==10644== your program may misbehave as a result
==10644== Thread 3:
==10644== Invalid read of size 4
==10644== at 0x28D00003: osncon (in
/home/ie/oracle_9.2.04/lib/libclntsh.so.9.0)
==10644== by 0x28AE9EAD: kpuadef (in
/home/ie/oracle_9.2.04/lib/libclntsh.so.9.0)
==10644== by 0x28BB1668: upiini (in
/home/ie/oracle_9.2.04/lib/libclntsh.so.9.0)
==10644== by 0x28B93204: upiah0 (in
/home/ie/oracle_9.2.04/lib/libclntsh.so.9.0)
==10644== Address 0x294CEE is not stack'd, malloc'd or (recently) free'd
*** CallStack: package name abs PC function
*** ---------------------------------------------------------------
*** libcsutilpkg.so 0x1C73DF25 SignalHandler
*** unknown 0x52BFF000
*** ibclntsh.so.9.0 0x28AE9EAE kpuadef
*** ibclntsh.so.9.0 0x28BB1669 upiini
*** ibclntsh.so.9.0 0x28B93205 upiah0
*** ibclntsh.so.9.0 0x28AE9A79 kpuatch
*** ibclntsh.so.9.0 0x28B7A09F OCIServerAttach
*** ibclntsh.so.9.0 0x28AAB67C
*** ibclntsh.so.9.0 0x28AABF27 sqllam
*** ibclntsh.so.9.0 0x28AB4D48 sqllo3t
*** ibclntsh.so.9.0 0x28AB25CE
*** ibclntsh.so.9.0 0x28AB4286 sqlexp
*** ibclntsh.so.9.0 0x28AAD2CD
*** ibclntsh.so.9.0 0x28AAD950 sqlcxt
*** ib/libdbipkg.so 0x1CFFBB7D rdb_SysConnect
*** libcsutilpkg.so 0x1C73E375 utl_Calls
*** ib/libdbipkg.so 0x1CFE1D9B rdb_Listener
*** ib/libovmpkg.so 0x1C70B49B
*** libpthread.so.0 0x2875DA1C
==10644== warning: Valgrind's siglongjmp is incomplete
==10644== (it ignores cleanup handlers)
==10644== your program may misbehave as a result
OVM PROCESS: OracleListener
PID: 10644
*F* OVM Default Handler captured the following exception:
SIGNAL-FATAL-UNIX_SIGSEGV
Thread: OracleListener Terminates, Execution Continues
sched status:
Thread 1: status = WaitCV, associated_mx = 0x1C71F8A0, associated_cv =
0x293F999C
==10644== at 0x2875EEEC: pthread_cond_wait (vg_libpthread.c:1454)
==10644== by 0x1C7071C2: ovm_SwitchContext (ovm_context.c:150)
==10644== by 0x1C70BDD0: ovm_scheduler (ovm_process.cc:2178)
==10644== by 0x1C7088F1: ovm_AwaitEvent (ovm_event.cc:385)
Thread 2: status = WaitCV, associated_mx = 0x1C71F8A0, associated_cv =
0x293FA48C
==10644== at 0x2875EEEC: pthread_cond_wait (vg_libpthread.c:1454)
==10644== by 0x1C7071C2: ovm_SwitchContext (ovm_context.c:150)
==10644== by 0x1C70BDD0: ovm_scheduler (ovm_process.cc:2178)
==10644== by 0x1C7088F1: ovm_AwaitEvent (ovm_event.cc:385)
Thread 3: status = WaitCV, associated_mx = 0x1C71F8A0, associated_cv =
0x1C71F900
==10644== at 0x2875EEEC: pthread_cond_wait (vg_libpthread.c:1454)
==10644== by 0x1C70E2FD: ovm_hibernate (ovm_time.cc:933)
==10644== by 0x1C70BC1C: ovm_scheduler (ovm_process.cc:2119)
==10644== by 0x1C70A99B: ovm_KillProcessWithStatus (ovm_process.cc:711)
==10644==
==10644== Warning: pthread scheduler exited due to deadlock
==10644== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 379 from 1)
==10644== malloc/free: in use at exit: 1425242 bytes in 9106 blocks.
==10644== malloc/free: 9293 allocs, 187 frees, 1528447 bytes allocated.
==10644== For a detailed leak analysis, rerun with: --leak-check=yes
==10644== For counts of detected errors, rerun with: -v
valgrind: vg_main.c:3098 (main): Assertion `src == VgSrc_FatalSig ||
vgPlain_threads[last_run_tid].status == VgTs_Runnable ||
vgPlain_threads[last_run_tid].status == VgTs_WaitJoiner' failed.
==10644== at 0xB002A4D7: vgPlain_skin_assert_fail (vg_mylibc.c:1137)
==10644== by 0xB002A4D6: assert_fail (vg_mylibc.c:1133)
==10644== by 0xB002A514: vgPlain_core_assert_fail (vg_mylibc.c:1144)
==10644== by 0xB0025C3A: main (vg_main.c:3126)
sched status:
Thread 1: status = WaitCV, associated_mx = 0x1C71F8A0, associated_cv =
0x293F999C
==10644== at 0x2875EEEC: pthread_cond_wait (vg_libpthread.c:1454)
==10644== by 0x1C7071C2: ovm_SwitchContext (ovm_context.c:150)
==10644== by 0x1C70BDD0: ovm_scheduler (ovm_process.cc:2178)
==10644== by 0x1C7088F1: ovm_AwaitEvent (ovm_event.cc:385)
Thread 2: status = WaitCV, associated_mx = 0x1C71F8A0, associated_cv =
0x293FA48C
==10644== at 0x2875EEEC: pthread_cond_wait (vg_libpthread.c:1454)
==10644== by 0x1C7071C2: ovm_SwitchContext (ovm_context.c:150)
==10644== by 0x1C70BDD0: ovm_scheduler (ovm_process.cc:2178)
==10644== by 0x1C7088F1: ovm_AwaitEvent (ovm_event.cc:385)
Thread 3: status = WaitCV, associated_mx = 0x1C71F8A0, associated_cv =
0x1C71F900
==10644== at 0x2875EEEC: pthread_cond_wait (vg_libpthread.c:1454)
==10644== by 0x1C70E2FD: ovm_hibernate (ovm_time.cc:933)
==10644== by 0x1C70BC1C: ovm_scheduler (ovm_process.cc:2119)
==10644== by 0x1C70A99B: ovm_KillProcessWithStatus (ovm_process.cc:711)
The same application, when it uses Oracle 8 instead, runs OK with
Valgrind 2.2.0.
The code at the top of the stack trace is in an Oracle library,
libclntsh.so.9.0, so I don't have access to it.
Is this problem likely caused by the siglongjmp warning above?
Has anyone else had a similar problem?
Thanks,
Charlie Shelton
From: Tom H. <th...@cy...> - 2004-10-19 18:46:17
In message <417...@ho...>
Charlie Shelton <csh...@ho...> wrote:
> sched status:
>
> Thread 1: status = WaitCV, associated_mx = 0x1C71F8A0, associated_cv =
> 0x293F999C
> ==10644== at 0x2875EEEC: pthread_cond_wait (vg_libpthread.c:1454)
> ==10644== by 0x1C7071C2: ovm_SwitchContext (ovm_context.c:150)
> ==10644== by 0x1C70BDD0: ovm_scheduler (ovm_process.cc:2178)
> ==10644== by 0x1C7088F1: ovm_AwaitEvent (ovm_event.cc:385)
>
> Thread 2: status = WaitCV, associated_mx = 0x1C71F8A0, associated_cv =
> 0x293FA48C
> ==10644== at 0x2875EEEC: pthread_cond_wait (vg_libpthread.c:1454)
> ==10644== by 0x1C7071C2: ovm_SwitchContext (ovm_context.c:150)
> ==10644== by 0x1C70BDD0: ovm_scheduler (ovm_process.cc:2178)
> ==10644== by 0x1C7088F1: ovm_AwaitEvent (ovm_event.cc:385)
>
> Thread 3: status = WaitCV, associated_mx = 0x1C71F8A0, associated_cv =
> 0x1C71F900
> ==10644== at 0x2875EEEC: pthread_cond_wait (vg_libpthread.c:1454)
> ==10644== by 0x1C70E2FD: ovm_hibernate (ovm_time.cc:933)
> ==10644== by 0x1C70BC1C: ovm_scheduler (ovm_process.cc:2119)
> ==10644== by 0x1C70A99B: ovm_KillProcessWithStatus (ovm_process.cc:711)
>
> ==10644==
> ==10644== Warning: pthread scheduler exited due to deadlock
What valgrind is saying is that the process is deadlocked because
all the threads are waiting on a condition variable and there is no
thread left to signal any of those condition variables.
Technically speaking there is a way out of this if there is a signal
handler installed that will be triggered by something and then do a
longjmp to break the deadlock, but valgrind doesn't consider that at
the moment. This is something that I only noticed the other day.
Is that actually the case in your application? Are you expecting a
signal to break that deadlock?
I'm not sure why this doesn't happen with 2.0.0 or with the Oracle 8
library, as all the routines in the thread traces above appear to be in
your code.
I certainly use valgrind on programs that use Oracle 9 without
any problems.
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/