Re: [Dpcl-develop] RE: Some DPCL questions
From: Dave W. <dwo...@us...> - 2004-03-02 16:22:12
Steve
I looked at your test case and am still not totally sure what you are
trying to do. If I compile it the way you sent it, with the code to find
function 'mysleep' enabled, it works fine, although I did put a call to
sleep() in the mysleep function in the mutatee to slow it down so I could
connect to it.
If I change the mutator to look for __sleep, then it fails, stating it
can't find the symbol. I would expect it to fail, since the symbol __sleep
is defined only in libc, which is normally linked as a shared (dynamic)
library. Our version of BPatch/DPCL does not have the ability to find a
symbol defined anywhere other than in the target application's main
executable. We have no knowledge of any shared libraries that are loaded
into the mutatee process.
I did verify that DPCL can find the symbol __sleep by building a static
copy of libdpclRT.a and linking the mutatee with that library, using the
-static option and invoking the linker thru g++ rather than gcc (since
libdpclRT.a is C++ code). However my test caused a daemon crash, which may
be due to the setup on my system. I haven't looked into why.
As James noted, the hybrid DPCL which uses his Dyninst code is aware of
symbols in shared libraries, so it can find __sleep in the unmodified
testcase.
I'm also not getting any problems with the callback not running after 45
invocations, so we will need a log for that. One thing I did notice is
that if the dpcl daemon was left over from a previous run, then the
testcase would not run. Can you see if this occurs only when there is an
old dpcld daemon still running?
Dave
Steve Collins <sl...@sg...>
Sent by: dpc...@ww...
02/27/2004 12:37 PM
To: dpc...@ww...
cc: sl...@sg...
Subject: [Dpcl-develop] RE: Some DPCL questions
I'm really sorry about the e-mail mess. I think all our latest rif's
in the computer center are catching up with us. Oh well.
Good to hear from DaveW because I really didn't want to feel too all
alone with DPCL (heh,heh). I'll give my best responses to Dave's
questions and I guess we'll go from there.
1) DAVEW: There's no reason why callbacks should stop after 45 executions
of the probe expression. What happens here?
SLC: Everything runs to completion. No crashes. Nothing unusual. It's
just that the callback which is supposed to fire whenever I enter the
'sleep' program (actually 'mysleep', since 'sleep' has its own
problems as mentioned by Dave below) simply doesn't happen after the
45th trip thru 'mysleep'. Now I put a print in 'mysleep' to verify
that I AM in fact going thru 'mysleep' say, e.g., 5000 times, and
I am going thru 'mysleep' 5000 times. But the callback does not
fire off. BTW, the choice of 5000 is arbitrary. Any number > 46
will result in just 46 callbacks. No more, no less. Any number < 46
works just fine. If I call 'mysleep', e.g. 17 times, then I get
17 callbacks and life is good. NOTE: 'mysleep' is just a simple dummy
routine that spins in a 'for' loop for a bit and then returns. Gave
up on actually calling 'sleep' since we surmised that 'sleep' is
signal-driven and that requires more DPCL expertise than I have at
this time.
DAVEW: Does DPCL just hang, crash, or execute normally?
SLC: I think I answered this one above, i.e. no crashes, everything
normal. Just no callbacks after 46 trips thru 'mysleep'.
DAVEW: Probably the best thing to do is put a call to Ais_blog_on just
before you set up the probe and see what is going on. We should see probe
exp activity in the DPCL log.
SLC: Thanks for the tip. I'll definitely give this a try. Of all my
known gotchas right now, this 'callback' limit, or whatever, is the most
problematic because it kinda gets to the heart of DPCL. At least as I
understand it.
2) DAVEW: What happens with the unaligned references on ia64? Do you just
get a warning and DPCL continues, or does DPCL crash?
SLC: This is just a warning on ia64. A warning that performance is
taking a hit and you should redo the code. Functionality is not affected.
DAVEW: If we track these down, we are going to have to do this on a
case-by-case basis, with most of them in the various message pack and
unpack functions where we are building messages between client and
daemon. I understand the reason ia64 is complaining about unaligned
references, but this is the first hardware/software I have seen lately
that seems to complain. If there are no serious side effects to the
unaligned references, is there possibly a function call that could be
made to turn off this warning?
SLC: The minute I saw these hideous messages I sent mail to our ia64apps
newsgroup and asked how I could turn the darn things off. I can't,
or so I'm told. Rewrite the code is about the only response I got.
Sigh.
3) DAVEW: I don't think you can do what you want, i.e. calling sleep()
with DPCL. The problem is that DPCL only knows about functions within
the target executable. It has no knowledge of functions in any shared
libraries. We get around this limitation in AIX by taking advantage of
how library calls are made in AIX applications. On AIX, a call to a
library function is made by calling a small stub module which gets
linked with the application, and the stub then makes the call to the
real library function. Since the stubs are linked with the application,
we can find the stub and put our probes there. Even with this solution,
DPCL can only call functions which are referenced by the application
already. If the application never calls sleep(), then DPCL cannot
build a probe expression to call it since the stub doesn't exist.
On ia32 Linux we can't take advantage of this solution since the
generated assembler code calls the library function directly. I suspect
Linux on ia64 works the same way.
SLC: My bad on this one. I didn't explain clearly what the dynamic
probe is trying to do. The 'mutatee' IS calling 'sleep()' (or, for now, the
dummy 'mysleep'-see NOTE above) and we are just trying to use DPCL
(across a cluster. BTW I've got things working across a partitioned
Altix -not 'sleep' but the 'mysleep' mutator I describe above, even
though it stops at 45 whether across a cluster or not - congrats to
DPCL designers/developers - awesome stuff!!).. trying to use DPCL to
insert a simple little callback at the entry to 'sleep' (or, for now
'mysleep') and this callback in the mutator just increments a global
Count variable which is printed when 'Ais_end_main_loop' is called
upon clean termination. I suspect the 'signal-driven' nature of
'sleep' is problematic.
Anyway, my original query was essentially this: if my mutatee
calls 'sleep', why does the mutator have to look for '__sleep'
in the 'bexpand' phase when I am searching for the entry point
in the symbols? As best we can tell, '__sleep' is the hard external
and 'sleep' is a soft external, which is not found. Dyninst can
find 'sleep' but DPCL can only find '__sleep' and not the soft
version of 'sleep'.
Sorry to be so wordy, Dave. Like you have said many times before,
debugging by email is decidedly non-optimal. But it's all I've been
able to arrange for thus far. Sigh.
SteveC - SGI Tools
Steve Collins <sl...@sg...>
Sent by: dpc...@ww...
02/26/2004 02:15 PM
To: dpc...@ww...
cc: sl...@sg..., per...@co...
Subject: [Dpcl-develop] Some DPCL questions
I've been having e-mail problems recently so I suspect my
postings here have not been received by everyone. So I am going
to re-post and hope for better results. I am currently trying to
address 3 DPCL issues as described below. Thanks as always to
DaveW, JamesW and others who might relieve my current clueless
state regarding any or all of these DPCL issues.
SteveC
SGI Tools
1.
I have a DPCL testcase (aka Dyninst mutator) that simply attempts
to insert a call to 'sleep' into the 'mutatee'. This seems to work
great until I ask for more than 45 calls to 'sleep'. Even if I ask
for 4000 calls or just 47 calls to sleep, I only get 45 'callbacks'
to occur to my mutator (or client or tool, etc.). It's like the
callback mechanism works for some limit of approximately 45 times.
I KNOW that I am entering the 'sleep' function 5000 times but only
45 callbacks from the instrumentation are occurring. Only 45 callbacks
occur if I just specify 47 calls to sleep. There <seems> to be a
barrier at '45' when it comes to the number of callbacks that will
or can occur. I have no clue on this one.
2.
Some of the DPCL code (example provided below) is causing the
IA64 hardware to emit 'unaligned access' errors to the screen.
An example of the code occurs in ~dpcl/src/lib/src/ModuleId.C
in routine 'ModuleId unpack_ModuleId(char **buffer)', to wit:
    ModuleId
    unpack_ModuleId(char **buffer)
    {
        char *data = *buffer;
        char *uniqstr = data;
        data = data + 1 + strlen(uniqstr); // don't forget the NULL character
        int *uint_p = (int *) data;
        data = data + sizeof(int);
        ModuleId new_mid = ModuleId(uniqstr, *uint_p);
        .....
This last statement, which dereferences an 'int' pointer (*uint_p) that is
not 'int'-aligned, seems to upset the ia64 hardware and I get something
like this:
mutator(16567): unaligned access to 0x600000000000aa36, ip=0x2000000000277bb0
Now this code can be rewritten to avoid this hardware complaint, but it
is probably a change that needs to be made in a number of places. Getting
changes accepted on a voluminous scale would seem to be problematic. Is
that true, or am I just being a little paranoid?
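
For illustration only, here is a minimal sketch of the kind of rewrite that avoids the unaligned dereference: copy the bytes out with memcpy instead of casting the buffer position to 'int *'. The names read_unaligned_int, UnpackedId and unpack_name_and_int are hypothetical stand-ins and are not part of DPCL; the layout (a NUL-terminated string followed directly by an int) is assumed from the unpack_ModuleId excerpt above.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper (not a DPCL function): read an int from a position in
// a packed buffer that may not be int-aligned, then advance the cursor.
static int read_unaligned_int(char **cursor)
{
    assert(cursor != nullptr && *cursor != nullptr);
    int value;
    // memcpy copies byte by byte, so the source has no alignment requirement.
    std::memcpy(&value, *cursor, sizeof(value));
    *cursor += sizeof(value);
    return value;
}

// Hypothetical stand-in for the fields unpacked at the top of
// unpack_ModuleId: a NUL-terminated string followed directly by an int.
struct UnpackedId {
    std::string name;
    int id;
};

static UnpackedId unpack_name_and_int(char **buffer)
{
    char *data = *buffer;
    std::string name(data);
    data += name.size() + 1;            // skip the string and its NUL
    int id = read_unaligned_int(&data); // safe even if 'data' is unaligned
    *buffer = data;                     // hand the advanced cursor back
    return UnpackedId{name, id};
}
```

A helper along these lines could be shared by the pack/unpack routines, so each affected site would change by one line rather than needing its own fix. Whether that matches how the DPCL sources are organized is an assumption.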
3.
I have a DPCL testcase (aka Dyninst mutator) that simply attempts
to insert a call to 'sleep' into the 'mutatee'. We have discovered that
'weak' symbols such as 'sleep' are not found by DPCL but they are found
by Dyninst. To wit:
The problem with sleep() rather than __sleep() is that the former is a
weak symbol:
[hope] /scratch/wdh/Test/DPCL: objdump -t /lib/libc.so.6.1 | grep "sleep"
...
0000000000160890 l F .text 00000000000003a0 __sleep
...
0000000000160890 w F .text 00000000000003a0 sleep
...
Now I'm not sure why Dyninst works fine with the weak sleep(), but DPCL
doesn't.
(Note: the above analysis by Bill Hachfeld at SGI)
DaveW/James - have you seen any problem of this sort in the past involving
'weak' symbols? For reminders, we are running on an ia64 machine.
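
As a small illustration of how one address can carry two names with different bindings, as in the objdump output above, the sketch below defines an ordinary function and then a weak alias for it. It relies on the GCC/Clang 'weak' and 'alias' attributes on an ELF target, which is an assumption about the toolchain; the names my_strong_sleep and my_weak_sleep are hypothetical and are not libc or DPCL symbols.

```cpp
#include <cassert>

// Hypothetical names for illustration only.
// The ordinary (strong) definition: this is where the code actually lives.
extern "C" int my_strong_sleep(int ticks)
{
    assert(ticks >= 0);
    return ticks; // pretend to "sleep" and report the ticks requested
}

// GCC/Clang extension on ELF targets: emit my_weak_sleep as a second symbol
// with weak binding that refers to the same code as my_strong_sleep.
extern "C" int my_weak_sleep(int ticks)
    __attribute__((weak, alias("my_strong_sleep")));
```

Calling my_weak_sleep(5) runs the body of my_strong_sleep. Running nm on the resulting object file should list the two names at the same address, the alias marked 'W' and the original marked 'T'. A symbol lookup that skips weak entries would then find only the strong name, which may or may not be what is happening inside DPCL.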
_______________________________________________
Dpcl-develop mailing list
Dpc...@ww...
http://www-124.ibm.com/developerworks/oss/mailman/listinfo/dpcl-develop