From: Hefty, S. <sea...@in...> - 2004-02-26 21:59:28
It has taken us a while, but we do know of this issue and believe that we have identified its cause. The problem is that dlopen acquires a mutex internally and then tries to initialize ibal. The initialization of ibal spawns a thread for additional initialization, then waits on an event that is signaled by the spawned thread. Part of the processing done by the spawned thread is to call dlopen on the user-level VPD, and it does this before signaling that event. This second call to dlopen hangs because the first thread is still holding the internal mutex, so the event is never signaled and neither thread can make progress. We have a possible fix for this that we are testing.

> -----Original Message-----
> From: inf...@li...
> [mailto:inf...@li...] On Behalf Of
> Hassan M. Jafri
> Sent: Tuesday, January 27, 2004 11:31 AM
> To: inf...@li...
> Subject: [Infiniband-access_layer] dlopen hanging with alllib
>
> Hardware:
> Intel IA-32 Xeon
> Mellanox a1 silicon HCA
>
> Software:
> SourceForge Alpha2 release for SDK 1.00 BK 1.163
> thca-x86-thca_1_0_release-build-011
>
> I have created a dynamically loadable library (call it ibal.so) that
> contains all the code which talks to the ibal layer. ibal.so is linked
> dynamically with allib and complib. I load my ibal.so library dynamically
> using dlopen in my program (call it "test"). The problem is that my
> program hangs when dlopen is issued. Appended below is the backtrace of
> the thread that seems to be hanging.
>
> This problem goes away when I link "test" with allib.so and complib.so.
> In that case, by the time dlopen is issued for ibal.so, complib and allib
> symbols are already in the address space of "test", and that somehow
> keeps the hang from occurring.
>
> *************************************************************************
> #0  0x401206a8 in sigsuspend () from /lib/libc.so.6
> #1  0x400adc28 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
> #2  0x400a9f9b in pthread_cond_wait@GLIBC_2.0 () from /lib/libpthread.so
> #3  0x40450814 in cl_event_wait_on () from /usr/lib/libcomplib.so.0.0
> #4  0x40446208 in create_al_mgr () from /usr/lib/liballib.so.0.0
> #5  0x40448641 in ual_init () from /usr/lib/liballib.so.0.0
> #6  0x404484fb in _init () from /usr/lib/liballib.so.0.0
> #7  0x4000c4b1 in _dl_init_internal () from /lib/ld-linux.so.2
> #8  0x4020af42 in dl_open_worker () from /lib/libc.so.6
> #9  0x4000c266 in _dl_catch_error_internal () from /lib/ld-linux.so.2
> #10 0x4020a9af in _dl_open () from /lib/libc.so.6
> #11 0x40081eeb in dlopen_doit () from /lib/libdl.so.2
> #12 0x4000c266 in _dl_catch_error_internal () from /lib/ld-linux.so.2
> #13 0x40081316 in _dlerror_run () from /lib/libdl.so.2
> #14 0x40081e92 in dlopen@GLIBC_2.0 () from /lib/libdl.so.2
> #15 0x40021db4 in VMI_Load_Device (info=0x80cc0d8, newDevice=0x80c9c78) at vmidevmgr_utils.c:86
> #16 0x400218a2 in VMI_Device_Register (info=0x80cc0d8, device=0xbfffde64) at vmidevmgr.c:222
> #17 0x40025225 in VMI_XMLParser_Register (handle=0x80cc9a0) at xmlparser.c:827
> #18 0x40023cda in VMI_Init_Subsystems (argc=1, argv=0xbfffdf54) at vmicore_utils.c:167
> #19 0x4002396f in VMI_Init (argc=1, argv=0xbfffdf54) at vmi.c:55
> #20 0x0804a0e8 in main (argc=1, argv=0xbfffdf54) at bandwidth.c:949
> #21 0x4010d917 in __libc_start_main () from /lib/libc.so.6