Re: [Sablevm-developer] sablevm hang
From: Chris P. <chr...@ma...> - 2004-06-18 18:35:02
Hi Joe,

(Etienne, can you read and comment?)

So, if you look at gdb stacktraces for each process, you can see exactly
how it's deadlocking:

    $ ps -A u
    ...
    cpicke 455 1.0 0.1 25456 3436 pts/0 S 13:31 0:00 sablevm-switch-debug AwtUtils2
    cpicke 457 0.0 0.1 25456 3436 pts/0 S 13:31 0:00 sablevm-switch-debug AwtUtils2
    cpicke 458 0.0 0.1 25456 3436 pts/0 S 13:31 0:00 sablevm-switch-debug AwtUtils2
    ...

It seems from instruction traces that actually 455 is the MyThread
thread, 457 I'm unclear about, and 458 is the AwtUtils2.main() thread.
What's confusing is that the AwtUtils2.main() thread has thread.id == 2,
whereas the MyThread has thread.id == 1. I put a
'System.out.println("hello");' in MyThread to be sure -- you only see the
calls to it in T1 (grep println AwtUtils2.log).

    http://www.sable.mcgill.ca/~cpicke/sablevm/AwtUtils2.java
    http://www.sable.mcgill.ca/~cpicke/sablevm/AwtUtils2.log (69M)

If I understand correctly, the AwtUtils2.main() thread starts the
MyThread, waits for it to die, and in the meantime the MyThread tries to
exit the VM itself, because of a mishandled:

    getToolkit().getSystemEventQueue().postEvent(we);

in java.awt.Window.dispose() like you said.

    $ gdb sablevm-switch-debug 455
    ...
    (gdb) bt
    #0 0x4017e87e in sigsuspend () from /lib/libc.so.6
    #1 0x40103879 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
    #2 0x40100102 in pthread_cond_wait () from /lib/libpthread.so.0
    #3 0x400badac in DestroyJavaVM (_vm=0x804cad0) at invoke_interface.c:474
    #4 0x0804a9a8 in main (argc=2, argv=0xbffff664) at sablevm.c:1469
    ...

which gets stuck here:

    /* wait for all non-deamon threads to die */
    while (vm->threads.user != NULL)
      {
        _svmm_cond_wait (vm->threads.vm_destruction_cond, vm->global_mutex);
      }

--> again, this is the MyThread

    $ gdb sablevm-switch-debug 457
    ...
    (gdb) bt
    #0 0x4021dbb0 in poll () from /lib/libc.so.6
    #1 0x40100d96 in __pthread_manager () from /lib/libpthread.so.0
    #2 0x40224d6a in clone () from /lib/libc.so.6

    $ gdb sablevm-switch-debug 458
    ...
    (gdb) bt
    #0 0x4017e87e in sigsuspend () from /lib/libc.so.6
    #1 0x40103879 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
    #2 0x40100102 in pthread_cond_wait () from /lib/libpthread.so.0
    #3 0x400c7c0a in Java_java_lang_VMObject_wait (_env=0x806a7a8, class=0x806a320, o=0x806a310, ms=0, ns=0) at java_lang_VMObject.c:261
    #4 0x40131827 in ffi_call_SYSV () at /tmp/ccBwuNHj.s:40
    #5 0x401314a7 in ffi_call (cif=0x8, fn=0xfffffffc, rvalue=0xbf7ffc88, avalue=0xfffffffc) at ../../../gcc-3.3.2/libffi/src/x86/ffi.c:194
    #6 0x40035df7 in _svmf_invoke_native_static (env=0x806a7a8) at native.c:850
    #7 0x40071cd2 in _svmf_interpreter (_env=0x806a7a8) at instructions_switch.c:19728
    #8 0x4002ac2c in _svmh_invoke_static_virtualmachine_runthread (env=0x806a7a8) at method_invoke.c:5777
    #9 0x400203ef in _svmf_thread_start (_env=0x806a7a8) at thread.c:1521
    #10 0x401010ba in pthread_start_thread () from /lib/libpthread.so.0
    #11 0x40224d6a in clone () from /lib/libc.so.6

which gets stuck here:

    if (ms == 0 && ns == 0)
      {
        _svmm_cond_wait (fat_lock->notification_cond, fat_lock->mutex);
      }

--> this is the AwtUtils2.main() thread (AFAICT)

Anyway, I don't know what the solution is. I originally thought
broadcasting to all sleeping threads when trying to exit the VM was the
answer, but now I'm not sure.

Cheers,
Chris
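
For readers following the thread, here is a minimal sketch of the shutdown wait pattern in the first quoted loop, written with plain pthreads. It is not SableVM code: the names toy_vm, user_thread, toy_destroy_vm and run_shutdown_demo are hypothetical, and a counter stands in for the vm->threads.user list. It also assumes that a thread's exit path is what updates the bookkeeping and wakes the waiter, which the message does not show.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the VM's thread bookkeeping (not SableVM's struct). */
struct toy_vm {
    pthread_mutex_t global_mutex;
    pthread_cond_t  destruction_cond;
    int             user_threads;   /* live non-daemon threads */
};

static struct toy_vm vm = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
};

/* A user thread that finishes: it deregisters itself and wakes the waiter. */
static void *user_thread(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&vm.global_mutex);
    vm.user_threads--;
    pthread_cond_broadcast(&vm.destruction_cond);
    pthread_mutex_unlock(&vm.global_mutex);
    return NULL;
}

/* Same shape as the quoted loop: wait until no user threads remain. */
static void toy_destroy_vm(void)
{
    pthread_mutex_lock(&vm.global_mutex);
    while (vm.user_threads != 0)
        pthread_cond_wait(&vm.destruction_cond, &vm.global_mutex);
    pthread_mutex_unlock(&vm.global_mutex);
}

/* Start n user threads, wait for "VM destruction", return the leftover count. */
int run_shutdown_demo(int n)
{
    pthread_t tids[16];
    int i, rc;

    assert(n >= 0 && n <= 16);

    /* Register all threads before starting them, so the count never dips early. */
    pthread_mutex_lock(&vm.global_mutex);
    vm.user_threads = n;
    pthread_mutex_unlock(&vm.global_mutex);

    for (i = 0; i < n; i++) {
        rc = pthread_create(&tids[i], NULL, user_thread, NULL);
        assert(rc == 0);
    }

    toy_destroy_vm();

    for (i = 0; i < n; i++)
        pthread_join(tids[i], NULL);

    return vm.user_threads;
}
```

The sketch terminates only because every user thread reaches its exit path. A thread parked in an untimed condition wait that nobody signals, as in the 458 backtrace, would never decrement the count, so the loop in toy_destroy_vm would wait forever.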