Re: [Dpcl-develop] RE: Some DPCL questions
From: Dave W. <dwo...@us...> - 2004-02-27 21:19:03
Steve

Can you send me the test case for your third question (sleep vs __sleep)? I'll need the source for your target, or an explanation of how to build a simple one, and the source for your client. I'll find some time next week to look at it. You can send it directly to me rather than to the mailing list.

Dave

Steve Collins <sl...@sg...>
Sent by: dpc...@ww...
02/27/2004 12:37 PM
To: dpc...@ww...
cc: sl...@sg...
Subject: [Dpcl-develop] RE: Some DPCL questions

I'm really sorry about the e-mail mess. I think all our latest rif's in the computer center are catching up with us. Oh well. Good to hear from DaveW because I really didn't want to feel too all alone with DPCL (heh, heh). I'll give my best responses to Dave's questions and I guess we'll go from there.

1) DAVEW: There's no reason why callbacks should stop after 45 executions of the probe expression. What happens here?

SLC: Everything runs to completion. No crashes. Nothing unusual. It's just that the callback which is supposed to fire whenever I enter the 'sleep' program (actually 'mysleep', since 'sleep' has its own problems as mentioned by Dave below) simply doesn't happen after the 45th trip thru 'mysleep'. I put a print in 'mysleep' to verify that I AM in fact going thru 'mysleep', say, 5000 times, and I am going thru 'mysleep' 5000 times. But the callback does not fire off. BTW, the choice of 5000 is arbitrary. Any number > 46 will result in just 46 callbacks. No more, no less. Any number < 46 works just fine. If I call 'mysleep', e.g., 17 times, then I get 17 callbacks and life is good.

NOTE: 'mysleep' is just a simple dummy routine that spins in a 'for' loop for a bit and then returns. I gave up on actually calling 'sleep' since we surmised that 'sleep' is signal-driven and that requires more DPCL expertise than I have at this time.

DAVEW: Does DPCL just hang, crash, or execute normally?

SLC: I think I answered this one above, i.e. no crashes, everything normal.
Just no callbacks after 46 trips thru 'mysleep'.

DAVEW: Probably the best thing to do is put a call to Ais_blog_on just before you set up the probe and see what is going on. We should see probe exp activity in the DPCL log.

SLC: Thanks for the tip. I'll definitely give this a try. Of all my known gotchas right now, this 'callback' limit, or whatever it is, is the most problematic because it kinda gets to the heart of DPCL. At least as I understand it.

2) DAVEW: What happens with the unaligned references on ia64? Do you just get a warning and DPCL continues, or does DPCL crash?

SLC: This is just a warning on ia64. A warning that performance is taking a hit and you should redo the code. Functionality is not affected.

DAVEW: If we track these down, we are going to have to do this on a case-by-case basis, with most of them in the various message pack and unpack functions where we are building messages between client and daemon. I understand the reason ia64 is complaining about unaligned references, but this is the first hardware/software I have seen lately that seems to complain. If there are no serious side effects to the unaligned references, is there possibly a function call that could be made to turn off this warning?

SLC: The minute I saw these hideous messages I sent mail to our ia64apps newsgroup and asked how I could turn the darn things off. I can't, or so I'm told. Rewrite the code is about the only response I got. Sigh.

3) DAVEW: I don't think you can do what you want, i.e. calling sleep() with DPCL. The problem is that DPCL only knows about functions within the target executable. It has no knowledge of functions in any shared libraries. We get around this limitation in AIX by taking advantage of how library calls are made in AIX applications. On AIX, a call to a library function is made by calling a small stub module which gets linked with the application, and the stub then makes the call to the real library function.
Since the stubs are linked with the application, we can find the stub and put our probes there. Even with this solution, DPCL can only call functions which are referenced by the application already. If the application never calls sleep(), then DPCL cannot build a probe expression to call it since the stub doesn't exist. On ia32 Linux we can't take advantage of this solution since the generated assembler code calls the library function directly. I suspect Linux on ia64 works the same way.

SLC: My bad on this one. I didn't explain clearly what the dynamic probe is trying to do. The 'mutatee' IS calling 'sleep()' (or, for now, the dummy 'mysleep'; see NOTE above) and we are just trying to use DPCL (across a cluster) to insert a simple little callback at the entry to 'sleep' (or, for now, 'mysleep'). This callback in the mutator just increments a global Count variable, which is printed when 'Ais_end_main_loop' is called upon clean termination. (BTW, I've got things working across a partitioned Altix, not with 'sleep' but with the 'mysleep' mutator I describe above, even though it stops at 45 whether across a cluster or not. Congrats to the DPCL designers/developers - awesome stuff!!) I suspect the 'signal-driven' nature of 'sleep' is problematic.

Anyway, my original query was essentially this: if my mutatee calls 'sleep', why does the mutator have to look for '__sleep' in the 'bexpand' phase when I am searching for the entry point in the symbols? As best we can tell, '__sleep' is the hard external and 'sleep' is a soft external, which is not found. Dyninst can find 'sleep', but DPCL can only find '__sleep' and not the soft version, 'sleep'.

Sorry to be so wordy, Dave. Like you have said many times before, debugging by email is decidedly non-optimal. But it's all I've been able to arrange for thus far. Sigh.

SteveC - SGI Tools

Steve Collins <sl...@sg...>
Sent by: dpc...@ww...
02/26/2004 02:15 PM
To: dpc...@ww...
cc: sl...@sg..., per...@co...
Subject: [Dpcl-develop] Some DPCL questions

I've been having e-mail problems recently so I suspect my postings here have not been received by everyone. So I am going to re-post and hope for better results.

I am currently trying to address 3 DPCL issues as described below. Thanks as always to DaveW, JamesW and others who might relieve my current clueless state regarding any or all of these DPCL issues.

SteveC
SGI Tools

1. I have a DPCL testcase (aka Dyninst mutator) that simply attempts to insert a call to 'sleep' into the 'mutatee'. This seems to work great until I ask for more than 45 calls to 'sleep'. Even if I ask for 4000 calls or just 47 calls to sleep, I only get 45 'callbacks' to occur to my mutator (or client or tool, etc.). It's like the callback mechanism works for some limit of approximately 45 times. I KNOW that I am entering the 'sleep' function 5000 times but only 45 callbacks from the instrumentation are occurring. Only 45 callbacks occur if I just specify 47 calls to sleep. There <seems> to be a barrier at '45' when it comes to the number of callbacks that will or can occur. I have no clue on this one.

2. Some of the DPCL code (example provided below) is causing the IA64 hardware to emit 'unaligned access' errors to the screen. An example of the code occurs in ~dpcl/src/lib/src/ModuleId.C in routine 'ModuleId unpack_ModuleId(char **buffer)', to wit:

    ModuleId unpack_ModuleId(char **buffer) {
        char *data = *buffer;
        char *uniqstr = data;
        data = data + 1 + strlen(uniqstr); // don't forget the NULL character
        int *uint_p = (int *) data;
        data = data + sizeof(int);
        ModuleId new_mid = ModuleId(uniqstr, *uint_p);
        .....
This last statement, which dereferences a pointer that is assumed to be 'int'-aligned (*uint_p), seems to upset the ia64 hardware and I get something like this:

    mutator(16567): unaligned access to 0x600000000000aa36, ip=0x2000000000277bb0

Now this code can be rewritten to avoid this hardware complaint, but it is probably a change that needs to be made in a number of places. Getting changes accepted on a voluminous scale would seem to be problematic. Is that true, or am I just being a little paranoid?

3. I have a DPCL testcase (aka Dyninst mutator) that simply attempts to insert a call to 'sleep' into the 'mutatee'. We have discovered that 'weak' symbols such as 'sleep' are not found by DPCL but they are found by Dyninst. To wit:

The problem with sleep() rather than __sleep() is that the former is a weak symbol:

    [hope] /scratch/wdh/Test/DPCL: objdump -t /lib/libc.so.6.1 | grep "sleep"
    ...
    0000000000160890 l F .text 00000000000003a0 __sleep
    ...
    0000000000160890 w F .text 00000000000003a0 sleep
    ...

Now I'm not sure why Dyninst works fine with the weak sleep(), but DPCL doesn't.

(Note: the above analysis is by Bill Hachfeld at SGI.)

DaveW/James - have you seen any problem of this sort in the past involving 'weak' symbols? As a reminder, we are running on an ia64 machine.

_______________________________________________
Dpcl-develop mailing list
Dpc...@ww...
http://www-124.ibm.com/developerworks/oss/mailman/listinfo/dpcl-develop
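
A note on the unaligned read in question 2: one common way to rewrite this kind of unpack code is to copy the bytes out with memcpy instead of dereferencing a cast pointer, since a byte copy has no alignment requirement. The following is only a minimal sketch under stated assumptions, not DPCL code. `UnpackedId`, `read_unaligned_int` and `unpack_name_and_id` are hypothetical names introduced for illustration, the real `ModuleId` class is not reproduced, and advancing `*buffer` at the end is a guess because the quoted routine is elided at that point.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the (name, int) pair that the quoted
// unpack_ModuleId passes to the ModuleId constructor.
struct UnpackedId {
    const char *name;  // points into the message buffer, NUL-terminated
    int id;
};

// Hypothetical helper: read an int from a position that may not be
// int-aligned, then advance the cursor past it.
inline int read_unaligned_int(char **cursor)
{
    assert(cursor != nullptr && *cursor != nullptr);
    int value;
    // memcpy copies byte by byte as far as the language is concerned,
    // so no aligned load from *cursor is required.
    std::memcpy(&value, *cursor, sizeof value);
    *cursor += sizeof value;
    return value;
}

// Same layout as in the quoted code: a NUL-terminated string followed
// immediately by an int, with no padding in between.
inline UnpackedId unpack_name_and_id(char **buffer)
{
    assert(buffer != nullptr && *buffer != nullptr);
    char *data = *buffer;
    const char *name = data;
    data += std::strlen(name) + 1;       // skip the string and its NUL
    int id = read_unaligned_int(&data);  // replaces *(int *)data
    *buffer = data;                      // assumption: caller wants the cursor advanced
    UnpackedId result = { name, id };
    return result;
}
```

If a helper along these lines were acceptable, each pack/unpack routine would change by one line rather than being restructured, which may make a change spread over many places easier to review.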