Hello all, 

Up to this point I've been using task_smpl with libpfm4 to get process-wide performance counters, but I would like to get counts for specific regions of interest. I searched the archives and found a relevant feature for perfmon2, but seemingly nothing for libpfm4.

Most of the programs I want to run these region-of-interest tests on are my own little toy programs, so I can easily modify them however I please. What I was thinking was to set up a semaphore that starts out locked. The parent process in task_smpl would call sem_wait() right before its core loop, and the child would unlock the semaphore with sem_post() when it reaches the region of interest.
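To make the idea concrete, here is a minimal sketch of the handshake, separate from task_smpl itself. The names roi_sync, roi_reached, parent_ready and roi_handshake_demo are made up for illustration and are not part of libpfm4. The sketch assumes the child is a plain fork() of the parent, so that a semaphore placed in an anonymous shared mapping is visible to both sides. It also uses a second semaphore, which goes beyond what I described above: the child waits for the parent to acknowledge before entering the region, so the region does not start running before the parent is ready.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared block; names are illustrative, not libpfm4 API. */
struct roi_sync {
    sem_t roi_reached;   /* child posts when it arrives at the region */
    sem_t parent_ready;  /* parent posts once it is ready to count    */
};

/* sem_wait() can return early with EINTR if a signal arrives; retry. */
static int wait_sem(sem_t *sem)
{
    int rc;
    do {
        rc = sem_wait(sem);
    } while (rc == -1 && errno == EINTR);
    return rc;
}

/* Returns 0 if the handshake completed and the child exited cleanly. */
int roi_handshake_demo(void)
{
    /* Anonymous shared mapping: inherited across fork(), not across exec. */
    struct roi_sync *s = mmap(NULL, sizeof(*s), PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return -1;

    /* pshared = 1 allows use across processes; value 0 means "locked". */
    if (sem_init(&s->roi_reached, 1, 0) != 0 ||
        sem_init(&s->parent_ready, 1, 0) != 0) {
        munmap(s, sizeof(*s));
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(s, sizeof(*s));
        return -1;
    }

    if (pid == 0) {
        /* Child: any setup outside the region would run here. */
        sem_post(&s->roi_reached);          /* "I am at the region" */
        if (wait_sem(&s->parent_ready) != 0)
            _exit(1);
        /* The region of interest would run here. */
        _exit(0);
    }

    /* Parent: block until the child reports that it reached the region. */
    if (wait_sem(&s->roi_reached) != 0) {
        munmap(s, sizeof(*s));
        return -1;
    }
    /* This is where the monitoring side would start its counters. */
    sem_post(&s->parent_ready);

    int status = 0;
    pid_t reaped = waitpid(pid, &status, 0);

    sem_destroy(&s->roi_reached);
    sem_destroy(&s->parent_ready);
    munmap(s, sizeof(*s));

    if (reaped != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling roi_handshake_demo() from main() and checking for a 0 return is enough to try it out (older glibc versions need -pthread when linking). One caveat with this sketch: a mapping like this does not survive exec, so if the monitored program is started with fork() followed by exec, a named semaphore opened with sem_open() in both programs would probably be needed instead.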

I'm new to writing IPC code. I've written a small example that behaves as described above (where the parent doesn't continue executing until the child gets to a certain point), but I'm not sure if there are any pitfalls that I could be missing here. Does this sound like a feasible strategy? Is there some easier way to achieve what I am looking to do?