From: Mike L. <mik...@gm...> - 2017-02-09 00:56:23
I'm working on a project that leverages Callgrind to generate VEX IR
traces. I'm using Valgrind 3.12.0.
I also use Callgrind's infrastructure to detect when Valgrind switches
thread contexts; however, I'm getting unexpected behavior.
It looks like the best place to detect a thread context switch in Callgrind
is in CLG_(setup_bbcc) in bbcc.c (line 561):
/* This is needed because thread switches can not reliable be tracked
* with callback CLG_(run_thread) only: we have otherwise no way to get
* the thread ID after a signal handler returns.
* This could be removed again if that bug is fixed in Valgrind.
* This is in the hot path but hopefully not to costly.
*/
tid = VG_(get_running_tid)();
#if 1
/* CLG_(switch_thread) is a no-op when tid is equal to CLG_(current_tid).
* As this is on the hot path, we only call CLG_(switch_thread)(tid)
* if tid differs from the CLG_(current_tid).
*/
if (UNLIKELY(tid != CLG_(current_tid)))
CLG_(switch_thread)(tid);
The above is called for every instrumented basic block.
I've noticed strange behavior, where *a thread switch is not always
detected*.
To investigate further, I modified the above:
- if (UNLIKELY(tid != CLG_(current_tid)))
+ if (UNLIKELY(tid != CLG_(current_tid))) {
CLG_(switch_thread)(tid);
+ VG_(printf)("Thread switched to: %d\n", tid);
+ }
- With this change, I run the PARSEC 3.0 benchmark blackscholes with 4
threads on the input from input_test.tar, and expect to see *5* threads
(numbered 1-5: 1 master and 4 worker threads) printed.
- Under default flags, all 5 threads are printed.
- When I add --fair-sched=yes, the last thread (5) is often *not
printed*.
- I confirmed this behavior by printing VG_(get_running_tid)() at every
instrumented basic block.
- I know that the thread switch happened, or else the application would
have failed.
This does not happen all the time but it happens on the majority of runs. I
also noticed that if I put a print statement in the blackscholes worker
thread, the unexpected behavior manifests far less often. I conclude it
must have something to do with the thread exiting too quickly and not
having enough work to do.
*Is this considered a bug? If not, how do I detect every Valgrind
thread context change? I saw this thread
<http://valgrind-developers.narkive.com/ualztznb/thread-change-callback> from
a long time ago, but I'm not sure if there's been any progress.*
$ uname -a
Linux ubuntu-VirtualBox 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24
21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
*Steps to reproduce:*
mkdir detect_thread_switch && cd detect_thread_switch
curl -L http://parsec.cs.princeton.edu/download/3.0/parsec-3.0-core.tar.gz \
  | tar xz
parsec-3.0/bin/parsecmgmt -a build -p blackscholes -c gcc-pthreads
tar xf parsec-3.0/pkgs/apps/blackscholes/inputs/input_test.tar
curl -L http://valgrind.org/downloads/valgrind-3.12.0.tar.bz2 | tar xj
*# MAKE THE CHANGE TO bbcc.c TO PRINT THREAD ID ON THREAD SWITCH*
cd valgrind-3.12.0 && ./autogen.sh && ./configure
make -j4 && cd ..
*# WILL SHOW THREADS 1-5*
valgrind-3.12.0/vg-in-place --tool=callgrind \
  parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes \
  4 in_4.txt prices.txt
*# MAY HAVE TO RUN SEVERAL TIMES IN SUCCESSION, WILL EVENTUALLY BE MISSING THREAD 5*
valgrind-3.12.0/vg-in-place --fair-sched=yes --tool=callgrind \
  parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes \
  4 in_4.txt prices.txt
Thanks!
Mike