From: Masami H. <mhi...@re...> - 2010-05-12 17:43:52
Mathieu Desnoyers wrote:
> * Masami Hiramatsu (mhi...@re...) wrote:
>> Mathieu Desnoyers wrote:
>>> * Masami Hiramatsu (mhi...@re...) wrote:
>>>> Use text_poke_smp_batch() in optimization path for reducing
>>>> the number of stop_machine() issues.
>>>>
>>>> Signed-off-by: Masami Hiramatsu <mhi...@re...>
>>>> Cc: Ananth N Mavinakayanahalli <an...@in...>
>>>> Cc: Ingo Molnar <mi...@el...>
>>>> Cc: Jim Keniston <jke...@us...>
>>>> Cc: Jason Baron <jb...@re...>
>>>> Cc: Mathieu Desnoyers <mat...@ef...>
>>>> ---
>>>>
>>>> arch/x86/kernel/kprobes.c | 37 ++++++++++++++++++++++++++++++-------
>>>> include/linux/kprobes.h | 2 +-
>>>> kernel/kprobes.c | 13 +------------
>>>> 3 files changed, 32 insertions(+), 20 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
>>>> index 345a4b1..63a5c24 100644
>>>> --- a/arch/x86/kernel/kprobes.c
>>>> +++ b/arch/x86/kernel/kprobes.c
>>>> @@ -1385,10 +1385,14 @@ int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
>>>> return 0;
>>>> }
>>>>
>>>> -/* Replace a breakpoint (int3) with a relative jump. */
>>>> -int __kprobes arch_optimize_kprobe(struct optimized_kprobe *op)
>>>> +#define MAX_OPTIMIZE_PROBES 256
>>>
>>> So what kind of interrupt latency does a 256-probes batch generate on the
>>> system ? Are we talking about a few milliseconds, a few seconds ?
>>
>> From my experiment on kvm/4cpu, it took about 3 seconds in average.
>
> That's 3 seconds for multiple calls to stop_machine(). So we can expect
> latencies in the area of few microseconds for each call, right ?

Theoretically yes. But if we register more than 1000 probes at once,
it's hard to do anything except optimizing for a while (more than 10 sec),
because it stops the machine so frequently.

>> With this patch, it went down to 30ms. (x100 faster :))
>
> This is beefing up the latency from few microseconds to 30ms. It sounds like a
> regression rather than a gain to me.
If it is not acceptable, I can add a knob to control how many probes are
optimized/unoptimized at once. Anyway, it is an expected latency (it occurs
only after registering/unregistering probes), and it will be small if we put
only a few probes. (30ms is the worst case.)

And if you want, it can be disabled by sysctl.

Thank you,

--
Masami Hiramatsu
e-mail: mhi...@re...
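
For illustration only, the trade-off discussed in this message can be sketched as a small user-space counting model. It is not kernel code and is not taken from the patch: `stops_needed()` and `MODEL_MAX_BATCH` are hypothetical names, and the only figure borrowed from the thread is the 256-probe limit of MAX_OPTIMIZE_PROBES. The model counts how many machine-wide stops are needed for a given batch size; it says nothing about how long each stop lasts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on the batch size, chosen to match the
 * MAX_OPTIMIZE_PROBES value (256) quoted in the patch. */
#define MODEL_MAX_BATCH 256

/*
 * Number of machine-wide stops needed to patch nr_probes probes when
 * up to batch_size probes are patched per stop (ceiling division).
 * batch_size == 1 models one stop per probe; a larger batch_size
 * models a batched path with fewer, longer stops.
 */
static size_t stops_needed(size_t nr_probes, size_t batch_size)
{
	assert(batch_size > 0 && batch_size <= MODEL_MAX_BATCH);
	return (nr_probes + batch_size - 1) / batch_size;
}
```

With this model, `stops_needed(1000, 1)` gives 1000 stops while `stops_needed(1000, 256)` gives 4. A knob like the one proposed above would correspond to choosing `batch_size` somewhere between those extremes: a smaller batch means more stops, each patching fewer probes.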