From: Josh T. <jo...@jo...> - 2014-03-10 04:21:20
|
This patch series makes it possible to disable HW_BREAKPOINTS, PERF_EVENTS, INSTRUCTION_DECODER, IRQ_WORK, and ANON_INODES. Without this patch series, all of these config options get automatically selected on x86, making them impossible to disable. This is a revival of a previous patch series from Andi Kleen sent in October 2013. I rebased it to apply to the latest kernel, fixed issues raised in response to the original patch (in particular re-collecting size deltas), and added one additional patch to support disabling ANON_INODES. Overall, this patch series saves 162k with allnoconfig. (I used a minimal allnoconfig generated by the allnoconfig from -mm and -next, which includes my patch to disable everything behind EMBEDDED/EXPERT/DEBUG_KERNEL as well.) Here's the updated size data included in the revised commit message for patch 6: text data bss dec hex filename 788816 131952 1214432 2135200 2094a0 vmlinux-allnoconfig-with-perf 666361 93020 1214368 1973749 1e1df5 vmlinux-allnoconfig-no-perf bloat-o-meter summary: add/remove: 1/1040 grow/shrink: 1/25 up/down: 82/-152214 (-152132) For the additional savings from ANON_INODES, see the last patch's commit message. Here's the exact diff between the two configurations: ~/src/linux$ diff -U0 dotconfig-perf-events dotconfig-no-perf-events --- dotconfig-perf-events 2014-03-09 18:58:28.280363186 -0700 +++ dotconfig-no-perf-events 2014-03-09 18:32:40.192726592 -0700 @@ -8 +7,0 @@ -CONFIG_INSTRUCTION_DECODER=y @@ -39 +37,0 @@ -CONFIG_IRQ_WORK=y @@ -115 +112,0 @@ -CONFIG_ANON_INODES=y @@ -139,2 +136 @@ -CONFIG_PERF_EVENTS=y -# CONFIG_DEBUG_PERF_USE_VMALLOC is not set +# CONFIG_PERF_EVENTS is not set @@ -148 +143,0 @@ -CONFIG_OPROFILE_NMI_TIMER=y @@ -164,2 +158,0 @@ -CONFIG_HAVE_HW_BREAKPOINT=y -CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y @@ -211,0 +205 @@ +# CONFIG_HW_BREAKPOINTS is not set Responding to the concerns raised with the previous version of the patch: - This version accurately measures and documents the size difference specifically attributable to disabling these configuration options, without including anything that could already be disabled either way. Also, I gathered the more relevant data from allnoconfig, rather than from defconfig, since a configuration disabling these options is almost certainly closer to a minimal configuration rather than a defconfig. - This version marks the new CONFIG_HW_BREAKPOINTS option as EXPERT, to ensure that only people prepared to deal with the ABI implications will disable it. This should address the concerns raised about potentially breaking applications or ABI; the kernel already has numerous other options to disable syscalls and other userspace interfaces, but they're generally hidden behind EXPERT, and now this option is as well. - It might be nice in the future to support configurations with CONFIG_HW_BREAKPOINTS=y and !CONFIG_PERF_EVENTS. However, considering that the kernel currently doesn't support *either* !CONFIG_HW_BREAKPOINTS or !CONFIG_PERF_EVENTS, this seems like an improvement, and it provides a major reduction in kernel size. For the embedded use case, I'd like to disable both. Future patches could refactor perf to provide a subset of infrastructure usable to enable CONFIG_HW_BREAKPOINTS, or otherwise allow even more flexible configurations. Andi Kleen (6): perf, x86, amd: Move __get_ibs_caps into common amd CPU file perf, x86: Make perf amd ibs code depend on PERF_EVENTS and SUP_AMD x86, ptrace: Ifdef HW_BREAKPOINTS code in ptrace trace: Make UPROBES depend on PERF_EVENTS x86, kgdb: Support compiling without hardware break points x86: Allow disabling HW_BREAKPOINTS, PERF_EVENTS, INSTRUCTION_DECODER, IRQ_WORK Josh Triplett (1): x86: Stop selecting ANON_INODES arch/x86/Kconfig | 13 +++++++++---- arch/x86/include/asm/perf_event.h | 2 ++ arch/x86/kernel/Makefile | 3 ++- arch/x86/kernel/cpu/Makefile | 5 ++++- arch/x86/kernel/cpu/amd.c | 27 +++++++++++++++++++++++++++ arch/x86/kernel/cpu/perf_event_amd_ibs.c | 23 ----------------------- arch/x86/kernel/kgdb.c | 12 ++++++++++++ arch/x86/kernel/ptrace.c | 30 ++++++++++++++++++++++++++++++ arch/x86/kvm/Kconfig | 1 + arch/x86/oprofile/op_model_amd.c | 4 +++- kernel/trace/Kconfig | 1 + 11 files changed, 91 insertions(+), 30 deletions(-) -- 1.9.0 |
From: Josh T. <jo...@jo...> - 2014-03-10 04:22:10
|
From: Andi Kleen <ak...@li...> __get_ibs_caps is used by both oprofile and perf events. This causes oprofile to be dependent on perf. __get_ibs_caps is actually only a few lines, so we can easily move that to the standard AMD CPU initialization file. It will be always compiled in this way, but it's far better than requiring perf always to be compiled in for this. Cc: bp...@su... Signed-off-by: Andi Kleen <ak...@li...> --- arch/x86/include/asm/perf_event.h | 2 ++ arch/x86/kernel/cpu/amd.c | 27 +++++++++++++++++++++++++++ arch/x86/kernel/cpu/perf_event_amd_ibs.c | 23 ----------------------- arch/x86/oprofile/op_model_amd.c | 4 +++- 4 files changed, 32 insertions(+), 24 deletions(-) diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index 8249df4..866b0f3 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -198,6 +198,8 @@ struct x86_pmu_capability { #define IBS_OP_MAX_CNT_EXT 0x007FFFFFULL /* not a register bit mask */ #define IBS_RIP_INVALID (1ULL<<38) +extern u32 __get_ibs_caps(void); + #ifdef CONFIG_X86_LOCAL_APIC extern u32 get_ibs_caps(void); #else diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index c67ffa6..528af43 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -15,6 +15,8 @@ # include <asm/cacheflush.h> #endif +#include <asm/perf_event.h> + #include "cpu.h" static inline int rdmsrl_amd_safe(unsigned msr, unsigned long long *p) @@ -911,3 +913,28 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum) return false; } + + +/* IBS - apic initialization, for perf and oprofile */ + +u32 __get_ibs_caps(void) +{ + u32 caps; + unsigned int max_level; + + if (!boot_cpu_has(X86_FEATURE_IBS)) + return 0; + + /* check IBS cpuid feature flags */ + max_level = cpuid_eax(0x80000000); + if (max_level < IBS_CPUID_FEATURES) + return IBS_CAPS_DEFAULT; + + caps = cpuid_eax(IBS_CPUID_FEATURES); + if (!(caps & IBS_CAPS_AVAIL)) + /* cpuid flags not valid */ + return IBS_CAPS_DEFAULT; + + return caps; +} +EXPORT_SYMBOL(__get_ibs_caps); diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c index 4b8e4d3..ce2cd75 100644 --- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c +++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c @@ -665,29 +665,6 @@ static __init int perf_event_ibs_init(void) { return 0; } #endif -/* IBS - apic initialization, for perf and oprofile */ - -static __init u32 __get_ibs_caps(void) -{ - u32 caps; - unsigned int max_level; - - if (!boot_cpu_has(X86_FEATURE_IBS)) - return 0; - - /* check IBS cpuid feature flags */ - max_level = cpuid_eax(0x80000000); - if (max_level < IBS_CPUID_FEATURES) - return IBS_CAPS_DEFAULT; - - caps = cpuid_eax(IBS_CPUID_FEATURES); - if (!(caps & IBS_CAPS_AVAIL)) - /* cpuid flags not valid */ - return IBS_CAPS_DEFAULT; - - return caps; -} - u32 get_ibs_caps(void) { return ibs_caps; diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c index 50d86c0..964171f 100644 --- a/arch/x86/oprofile/op_model_amd.c +++ b/arch/x86/oprofile/op_model_amd.c @@ -26,6 +26,8 @@ #include <asm/processor.h> #include <asm/cpufeature.h> +#include <asm/perf_event.h> /* for __get_ibs_caps */ + #include "op_x86_model.h" #include "op_counter.h" @@ -446,7 +448,7 @@ static void op_amd_stop(struct op_msrs const * const msrs) static void init_ibs(void) { - ibs_caps = get_ibs_caps(); + ibs_caps = __get_ibs_caps(); if (!ibs_caps) return; -- 1.9.0 |
From: Josh T. <jo...@jo...> - 2014-03-10 04:22:10
|
From: Andi Kleen <ak...@li...> Add some ifdefs to support compiling without CONFIG_HW_BREAKPOINTS. HW_BREAKPOINTS pulls in all of perf, and presumably there are some use cases where people want to debug without perf. The standard software breakpoints still work of course. Cc: jas...@wi... Signed-off-by: Andi Kleen <ak...@li...> --- arch/x86/kernel/kgdb.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c index 7ec1d5f..fb85ea7 100644 --- a/arch/x86/kernel/kgdb.c +++ b/arch/x86/kernel/kgdb.c @@ -192,6 +192,8 @@ void sleeping_thread_to_gdb_regs(unsigned long *gdb_regs, struct task_struct *p) gdb_regs[GDB_SP] = p->thread.sp; } +#ifdef CONFIG_HW_BREAKPOINTS + static struct hw_breakpoint { unsigned enabled; unsigned long addr; @@ -418,6 +420,8 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs) } } +#endif + #ifdef CONFIG_SMP /** * kgdb_roundup_cpus - Get other CPUs into a holding pattern @@ -638,6 +642,8 @@ out: return retval; } +#ifdef CONFIG_HW_BREAKPOINTS + static void kgdb_hw_overflow_handler(struct perf_event *event, struct perf_sample_data *data, struct pt_regs *regs) { @@ -688,6 +694,8 @@ void kgdb_arch_late(void) } } +#endif + /** * kgdb_arch_exit - Perform any architecture specific uninitalization. * @@ -696,6 +704,7 @@ void kgdb_arch_late(void) */ void kgdb_arch_exit(void) { +#ifdef CONFIG_HW_BREAKPOINTS int i; for (i = 0; i < 4; i++) { if (breakinfo[i].pev) { @@ -703,6 +712,7 @@ void kgdb_arch_exit(void) breakinfo[i].pev = NULL; } } +#endif unregister_nmi_handler(NMI_UNKNOWN, "kgdb"); unregister_nmi_handler(NMI_LOCAL, "kgdb"); unregister_die_notifier(&kgdb_notifier); @@ -806,9 +816,11 @@ struct kgdb_arch arch_kgdb_ops = { /* Breakpoint instruction: */ .gdb_bpt_instr = { 0xcc }, .flags = KGDB_HW_BREAKPOINT, +#ifdef CONFIG_HW_BREAKPOINTS .set_hw_breakpoint = kgdb_set_hw_break, .remove_hw_breakpoint = kgdb_remove_hw_break, .disable_hw_break = kgdb_disable_hw_debug, .remove_all_hw_break = kgdb_remove_all_hw_break, .correct_hw_break = kgdb_correct_hw_break, +#endif }; -- 1.9.0 |
From: Josh T. <jo...@jo...> - 2014-03-10 04:22:53
|
From: Andi Kleen <ak...@li...> Make HW_BREAKPOINTS a config option. HW_BREAKPOINTS depends on perf. This allows disabling PERF_EVENTS for systems that don't need it (e.g. anything not used for development) This then allows disabling IRQ_WORK and INSTRUCTION_DECODER. This change reduces the size of allnoconfig by 161k, a full 7.6% reduction: text data bss dec hex filename 788816 131952 1214432 2135200 2094a0 vmlinux-allnoconfig-with-perf 666361 93020 1214368 1973749 1e1df5 vmlinux-allnoconfig-no-perf bloat-o-meter summary: add/remove: 1/1040 grow/shrink: 1/25 up/down: 82/-152214 (-152132) Suggested-by: Ingo Molnar <mi...@ke...> Signed-off-by: Andi Kleen <ak...@li...> [josh: Rebased; HW_BREAKPOINTS marked as EXPERT; commit message updated with new statistics.] Signed-off-by: Josh Triplett <jo...@jo...> --- arch/x86/Kconfig | 12 +++++++++--- arch/x86/kernel/Makefile | 3 ++- arch/x86/kvm/Kconfig | 1 + 3 files changed, 12 insertions(+), 4 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 0af5250..12e3386 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -70,9 +70,6 @@ config X86 select HAVE_KERNEL_XZ select HAVE_KERNEL_LZO select HAVE_KERNEL_LZ4 - select HAVE_HW_BREAKPOINT - select HAVE_MIXED_BREAKPOINTS_REGS - select PERF_EVENTS select HAVE_PERF_EVENTS_NMI select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP @@ -587,6 +584,15 @@ config SCHED_OMIT_FRAME_POINTER If in doubt, say "Y". +config HW_BREAKPOINTS + bool "Enable hardware breakpoints support" if EXPERT + default y + select HAVE_HW_BREAKPOINT + select HAVE_MIXED_BREAKPOINTS_REGS + ---help--- + Enable support for x86 hardware breakpoints for debuggers + and perf. This will implicitly enable perf-events. + menuconfig HYPERVISOR_GUEST bool "Linux guest support" ---help--- diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index cb648c8..ecbf6fb 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -32,7 +32,7 @@ obj-$(CONFIG_X86_64) += vsyscall_emu_64.o obj-$(CONFIG_SYSFS) += ksysfs.o obj-y += bootflag.o e820.o obj-y += pci-dma.o quirks.o topology.o kdebugfs.o -obj-y += alternative.o i8253.o pci-nommu.o hw_breakpoint.o +obj-y += alternative.o i8253.o pci-nommu.o obj-y += tsc.o tsc_msr.o io_delay.o rtc.o obj-y += pci-iommu_table.o obj-y += resource.o @@ -76,6 +76,7 @@ obj-$(CONFIG_DOUBLEFAULT) += doublefault.o obj-$(CONFIG_KGDB) += kgdb.o obj-$(CONFIG_VM86) += vm86_32.o obj-$(CONFIG_EARLY_PRINTK) += early_printk.o +obj-$(CONFIG_HW_BREAKPOINTS) += hw_breakpoint.o obj-$(CONFIG_HPET_TIMER) += hpet.o obj-$(CONFIG_APB_TIMER) += apb_timer.o diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 287e4c8..d753291 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -36,6 +36,7 @@ config KVM select TASKSTATS select TASK_DELAY_ACCT select PERF_EVENTS + select HW_BREAKPOINTS select HAVE_KVM_MSI select HAVE_KVM_CPU_RELAX_INTERCEPT select KVM_VFIO -- 1.9.0 |
From: Josh T. <jo...@jo...> - 2014-03-10 04:23:52
|
From: Andi Kleen <ak...@li...> Now that oprofile doesn't need the AMD IBS perf code anymore, we can make that code dependent on PERF_EVENTS and AMD CPU support, instead of unconditionally compiling it in. Cc: bp...@su... Signed-off-by: Andi Kleen <ak...@li...> --- arch/x86/kernel/cpu/Makefile | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile index 7fd54f0..df9f9a3 100644 --- a/arch/x86/kernel/cpu/Makefile +++ b/arch/x86/kernel/cpu/Makefile @@ -37,6 +37,9 @@ endif obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_p6.o perf_event_knc.o perf_event_p4.o obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_lbr.o perf_event_intel_ds.o perf_event_intel.o obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_uncore.o perf_event_intel_rapl.o +ifdef CONFIG_X86_LOCAL_APIC +obj-$(CONFIG_CPU_SUP_AMD) += perf_event_amd_ibs.o +endif endif @@ -44,7 +47,7 @@ obj-$(CONFIG_X86_MCE) += mcheck/ obj-$(CONFIG_MTRR) += mtrr/ obj-$(CONFIG_MICROCODE) += microcode/ -obj-$(CONFIG_X86_LOCAL_APIC) += perfctr-watchdog.o perf_event_amd_ibs.o +obj-$(CONFIG_X86_LOCAL_APIC) += perfctr-watchdog.o obj-$(CONFIG_HYPERVISOR_GUEST) += vmware.o hypervisor.o mshyperv.o -- 1.9.0 |
From: Josh T. <jo...@jo...> - 2014-03-10 04:23:58
|
Now that PERF_EVENTS is no longer mandatory, ANON_INODES need not be either. Everything that needs ANON_INODES selects it. bloat-o-meter summary: add/remove: 0/10 grow/shrink: 0/0 up/down: 0/-613 (-613) function old new delta anon_inode_mnt 4 - -4 anon_inode_inode 4 - -4 __initcall_anon_inode_init5 4 - -4 anon_inode_fs_type 28 - -28 anon_inodefs_dname 40 - -40 anon_inodefs_dentry_operations 64 - -64 anon_inode_init 79 - -79 anon_inode_getfd 91 - -91 anon_inodefs_mount 93 - -93 anon_inode_getfile 206 - -206 Signed-off-by: Josh Triplett <jo...@jo...> --- arch/x86/Kconfig | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 12e3386..10c51ff 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -74,7 +74,6 @@ config X86 select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP select HAVE_DEBUG_KMEMLEAK - select ANON_INODES select HAVE_ALIGNED_STRUCT_PAGE if SLUB select HAVE_CMPXCHG_LOCAL select HAVE_CMPXCHG_DOUBLE -- 1.9.0 |
From: Josh T. <jo...@jo...> - 2014-03-10 04:25:17
|
From: Andi Kleen <ak...@li...> Add ifdefs on CONFIG_HW_BREAKPOINTS to the hardware break point code in x86 ptrace.c. This prepares this file for being able to disable HW_BREAKPOINTS. The only debug register that needs to be handled without break points is DR6, so do that in a simple separate function without breakpoints. Cc: fwe...@gm... Signed-off-by: Andi Kleen <ak...@li...> --- arch/x86/kernel/ptrace.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 7461f50..dda433f 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -561,6 +561,8 @@ static int genregs_set(struct task_struct *target, return ret; } +#ifdef CONFIG_HW_BREAKPOINTS + static void ptrace_triggered(struct perf_event *bp, struct perf_sample_data *data, struct pt_regs *regs) @@ -778,6 +780,34 @@ static int ptrace_set_debugreg(struct task_struct *tsk, int n, return rc; } +#else + +/* Without breakpoints only handle DR6 */ +static unsigned long ptrace_get_debugreg(struct task_struct *tsk, int n) +{ + struct thread_struct *thread = &tsk->thread; + + if (n == 6) + return thread->debugreg6; + return 0; +} + +/* Without breakpoints only handle DR6 */ +static int ptrace_set_debugreg(struct task_struct *tsk, int n, + unsigned long val) +{ + struct thread_struct *thread = &tsk->thread; + int rc = -EIO; + + if (n == 6) { + thread->debugreg6 = val; + rc = 0; + } + return rc; +} + +#endif + /* * These access the current or another (stopped) task's io permission * bitmap for debugging or core dump. -- 1.9.0 |
From: Josh T. <jo...@jo...> - 2014-03-10 04:25:17
|
From: Andi Kleen <ak...@li...> UPROBES need the perf events code, so add a dependency from PERF_EVENTS to UPROBES. Cc: ro...@go... Signed-off-by: Andi Kleen <ak...@li...> --- kernel/trace/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index 015f85a..8639819 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -424,6 +424,7 @@ config UPROBE_EVENT bool "Enable uprobes-based dynamic events" depends on ARCH_SUPPORTS_UPROBES depends on MMU + depends on PERF_EVENTS select UPROBES select PROBE_EVENTS select TRACING -- 1.9.0 |
From: Andi K. <ak...@li...> - 2014-03-10 04:53:45
|
All changes look good to me. Thanks for picking that up. Reviewed-by: Andi Kleen <ak...@li...> -- ak...@li... -- Speaking for myself only |
From: Ingo M. <mi...@ke...> - 2014-03-10 07:44:05
|
* Josh Triplett <jo...@jo...> wrote: > This patch series makes it possible to disable HW_BREAKPOINTS, > PERF_EVENTS, INSTRUCTION_DECODER, IRQ_WORK, and ANON_INODES. > Without this patch series, all of these config options get > automatically selected on x86, making them impossible to disable. > > This is a revival of a previous patch series from Andi Kleen sent in > October 2013. [...] So my main problem with those patches was that if HW_BREAKPOINTS is disabled then GDB 'hbreak' isn't simply disabled but fails in various non-obvious ways, taking down the rest of GDB with it, so it's in essence an ABI and tool breaker which is not very useful. If all of ptrace is disabled then it perhaps behaves more cleanly, but we didn't allow the disabling of ptrace in the past, and I'm not sure we should start with it today. So it's still a NAK for the time being. Thanks, Ingo |
From: Josh T. <jo...@jo...> - 2014-03-10 08:42:45
|
On Mon, Mar 10, 2014 at 08:43:54AM +0100, Ingo Molnar wrote: > > * Josh Triplett <jo...@jo...> wrote: > > > This patch series makes it possible to disable HW_BREAKPOINTS, > > PERF_EVENTS, INSTRUCTION_DECODER, IRQ_WORK, and ANON_INODES. > > Without this patch series, all of these config options get > > automatically selected on x86, making them impossible to disable. > > > > This is a revival of a previous patch series from Andi Kleen sent in > > October 2013. [...] > > So my main problem with those patches was that if HW_BREAKPOINTS is > disabled then GDB 'hbreak' isn't simply disabled but fails in various > non-obvious ways, taking down the rest of GDB with it, so it's in > essence an ABI and tool breaker which is not very useful. It's hidden behind EXPERT for a reason, and non-hardware breakpoints still work just fine. This is the kind of thing enabled on an embedded system, where you're not going to be running GDB at all, let alone using "hbreak". Given that other options depending on EXPERT let you disable things as critical as futexes or ELF binary support... Would it help to have some more detailed help text for HW_BREAKPOINTS, explaining that gdb's "hbreak" and similar will utterly fail if this option is disabled? Alternatively, would it suffice to make ptrace_set_debugreg always return an error when !CONFIG_HW_BREAKPOINTS? Would that produce error behavior you'd prefer? > If all of ptrace is disabled then it perhaps behaves more cleanly, That's on my list for the future. :) - Josh Triplett |
From: Ingo M. <mi...@ke...> - 2014-03-11 10:29:21
|
* Josh Triplett <jo...@jo...> wrote: > On Mon, Mar 10, 2014 at 08:43:54AM +0100, Ingo Molnar wrote: > > > > * Josh Triplett <jo...@jo...> wrote: > > > > > This patch series makes it possible to disable HW_BREAKPOINTS, > > > PERF_EVENTS, INSTRUCTION_DECODER, IRQ_WORK, and ANON_INODES. > > > Without this patch series, all of these config options get > > > automatically selected on x86, making them impossible to > > > disable. > > > > > > This is a revival of a previous patch series from Andi Kleen > > > sent in October 2013. [...] > > > > So my main problem with those patches was that if HW_BREAKPOINTS > > is disabled then GDB 'hbreak' isn't simply disabled but fails in > > various non-obvious ways, taking down the rest of GDB with it, so > > it's in essence an ABI and tool breaker which is not very useful. > > It's hidden behind EXPERT for a reason, [...] With many distros enabling CONFIG_EXPERT=y that's a distinction without much meaning. > [...] and non-hardware breakpoints still work just fine. [...] A lot of other stuff will work just fine as well. Yet my point was that 'hbreak' breaks in ugly ways. > [...] This is the kind of thing enabled on an embedded system, where > you're not going to be running GDB at all, let alone using "hbreak". So that is why I suggested making ptrace configurable. Perhaps. > Given that other options depending on EXPERT let you disable things > as critical as futexes or ELF binary support... Both of which will break in pretty clear ways. The granularity of how we can disable ABIs largely depends on how well actual user-space code handles the failures, and what I'm saying here is that the disabling of hardware breakpoints is too finegrained. Disabling ptrace for embedded OTOH makes sense and gives us even more savings and security advantage. Thanks, Ingo |
From: Oleg N. <ol...@re...> - 2014-03-10 14:22:29
|
On 03/09, Josh Triplett wrote: > > UPROBES need the perf events code, Actually it doesn't. But yes, the kernel/events directory is only built if CONFIG_PERF_EVENTS. See http://marc.info/?t=139201796100001 > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig > index 015f85a..8639819 100644 > --- a/kernel/trace/Kconfig > +++ b/kernel/trace/Kconfig > @@ -424,6 +424,7 @@ config UPROBE_EVENT > bool "Enable uprobes-based dynamic events" > depends on ARCH_SUPPORTS_UPROBES > depends on MMU > + depends on PERF_EVENTS Can't we delay this change? This conflicts with [PATCH v7 01/15] uprobes: Kconfig dependency fix http://marc.info/?l=linux-kernel&m=139422332915701 We need to untangle UPROBES and PERF, we can change the Makefile's or move uprobe.c to kernel/. Oleg. |
From: Josh T. <jo...@jo...> - 2014-03-10 15:03:31
|
On Mon, Mar 10, 2014 at 03:21:29PM +0100, Oleg Nesterov wrote: > On 03/09, Josh Triplett wrote: > > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig > > index 015f85a..8639819 100644 > > --- a/kernel/trace/Kconfig > > +++ b/kernel/trace/Kconfig > > @@ -424,6 +424,7 @@ config UPROBE_EVENT > > bool "Enable uprobes-based dynamic events" > > depends on ARCH_SUPPORTS_UPROBES > > depends on MMU > > + depends on PERF_EVENTS > > Can't we delay this change? This conflicts with > > [PATCH v7 01/15] uprobes: Kconfig dependency fix > http://marc.info/?l=linux-kernel&m=139422332915701 That doesn't conflict; they both make exactly the same change, and they should trivially merge. - Josh Triplett |
From: Oleg N. <ol...@re...> - 2014-03-10 14:36:23
|
On 03/09, Josh Triplett wrote: > > +#ifdef CONFIG_HW_BREAKPOINTS > + > static void ptrace_triggered(struct perf_event *bp, > struct perf_sample_data *data, > struct pt_regs *regs) > @@ -778,6 +780,34 @@ static int ptrace_set_debugreg(struct task_struct *tsk, int n, > return rc; > } > > +#else > + > +/* Without breakpoints only handle DR6 */ > +static unsigned long ptrace_get_debugreg(struct task_struct *tsk, int n) > +{ > + struct thread_struct *thread = &tsk->thread; > + > + if (n == 6) > + return thread->debugreg6; > + return 0; > +} > + > +/* Without breakpoints only handle DR6 */ > +static int ptrace_set_debugreg(struct task_struct *tsk, int n, > + unsigned long val) > +{ > + struct thread_struct *thread = &tsk->thread; > + int rc = -EIO; > + > + if (n == 6) { > + thread->debugreg6 = val; > + rc = 0; > + } > + return rc; > +} > + > +#endif OK, but then perhaps this patch should also make thread_struct->ptrace_bps[] depend on CONFIG_HW_BREAKPOINTS ? This needs more changes though. In particular, it seems it makes sense to move flush_ptrace_hw_breakpoint() to x86/kernel/ptrace.c, so that it can live in the same ifdef'ed section. copy_threads() needs a cleanup anyway... But I won't insist if you prefer to do this later. Oleg. |
From: Oleg N. <ol...@re...> - 2014-03-10 14:44:01
|
On 03/10, Oleg Nesterov wrote: > > On 03/09, Josh Triplett wrote: > > > > +static unsigned long ptrace_get_debugreg(struct task_struct *tsk, int n) > > +{ > > + struct thread_struct *thread = &tsk->thread; > > + > > + if (n == 6) > > + return thread->debugreg6; > > + return 0; > > +} > > + > > +/* Without breakpoints only handle DR6 */ > > +static int ptrace_set_debugreg(struct task_struct *tsk, int n, > > + unsigned long val) > > +{ > > + struct thread_struct *thread = &tsk->thread; > > + int rc = -EIO; > > + > > + if (n == 6) { > > + thread->debugreg6 = val; > > + rc = 0; > > + } > > + return rc; > > +} > > + > > +#endif > > OK, but then perhaps this patch should also make thread_struct->ptrace_bps[] > depend on CONFIG_HW_BREAKPOINTS ? and ->ptrace_dr7 at least. And in fact, why do we need ptrace_get/set_debugreg(DR6) without HW_BREAKPOINTS? Perhaps get/set should just return the error? > But I won't insist if you prefer to do this later. Yes. Oleg. |
From: Oleg N. <ol...@re...> - 2014-03-10 14:51:27
|
On 03/10, Ingo Molnar wrote: > > So my main problem with those patches was that if HW_BREAKPOINTS is > disabled then GDB 'hbreak' isn't simply disabled but fails in various > non-obvious ways, but the kernel should return an error. And this can happen anyway, say, reserve_bp_slot() can fail because all slots are already used by perf. And iirc, gdb always switches to the single-stepping if hbreak fails. Oleg. |
From: Ingo M. <mi...@ke...> - 2014-03-11 10:30:30
|
* Oleg Nesterov <ol...@re...> wrote: > On 03/10, Ingo Molnar wrote: > > > > So my main problem with those patches was that if HW_BREAKPOINTS is > > disabled then GDB 'hbreak' isn't simply disabled but fails in various > > non-obvious ways, > > but the kernel should return an error. And this can happen anyway, say, > reserve_bp_slot() can fail because all slots are already used by perf. > > And iirc, gdb always switches to the single-stepping if hbreak fails. When the old patches were submitted I tried it with modern GDB and it wasn't a good experience. Thanks, Ingo |
From: Josh T. <jo...@jo...> - 2014-03-10 15:15:51
|
On Mon, Mar 10, 2014 at 03:42:55PM +0100, Oleg Nesterov wrote: > On 03/10, Oleg Nesterov wrote: > > > > On 03/09, Josh Triplett wrote: > > > > > > +static unsigned long ptrace_get_debugreg(struct task_struct *tsk, int n) > > > +{ > > > + struct thread_struct *thread = &tsk->thread; > > > + > > > + if (n == 6) > > > + return thread->debugreg6; > > > + return 0; > > > +} > > > + > > > +/* Without breakpoints only handle DR6 */ > > > +static int ptrace_set_debugreg(struct task_struct *tsk, int n, > > > + unsigned long val) > > > +{ > > > + struct thread_struct *thread = &tsk->thread; > > > + int rc = -EIO; > > > + > > > + if (n == 6) { > > > + thread->debugreg6 = val; > > > + rc = 0; > > > + } > > > + return rc; > > > +} > > > + > > > +#endif > > > > OK, but then perhaps this patch should also make thread_struct->ptrace_bps[] > > depend on CONFIG_HW_BREAKPOINTS ? > > and ->ptrace_dr7 at least. Those do seem like nice optimizations to reduce the size of thread_struct. > And in fact, why do we need ptrace_get/set_debugreg(DR6) without HW_BREAKPOINTS? Good question. Andi? > Perhaps get/set should just return the error? Perhaps. > > But I won't insist if you prefer to do this later. > > Yes. The optimization of thread_struct I would prefer to defer until after this patch series, yes. But if there's a better way to write ptrace_{g,s}et_debugreg when !CONFIG_HW_BREAKPOINTS, I'll happily change that here. - Josh Triplett |
From: Frederic W. <fwe...@gm...> - 2014-03-10 16:14:23
|
On Sun, Mar 09, 2014 at 08:12:53PM -0700, Josh Triplett wrote: > From: Andi Kleen <ak...@li...> > > Make HW_BREAKPOINTS a config option. HW_BREAKPOINTS depends on perf. > This allows disabling PERF_EVENTS for systems that don't need it (e.g. > anything not used for development) This then allows disabling IRQ_WORK > and INSTRUCTION_DECODER. > > This change reduces the size of allnoconfig by 161k, a full 7.6% > reduction: > text data bss dec hex filename > 788816 131952 1214432 2135200 2094a0 vmlinux-allnoconfig-with-perf > 666361 93020 1214368 1973749 1e1df5 vmlinux-allnoconfig-no-perf > > bloat-o-meter summary: > add/remove: 1/1040 grow/shrink: 1/25 up/down: 82/-152214 (-152132) My review hasn't changed much since last time (https://lkml.org/lkml/2013/10/4/483), so I'm mostly copy pasting it here so that we can discuss if you want: I'm personally not against the idea of allowing hw breakpoint support to be disabled but hpa did not like much the idea. I must say he had valid concerns though, see the thread of this patchset: https://lkml.org/lkml/2011/7/14/163 IMHO, a better solution would be to make the hw breakpoint subsystem work without using perf. I mean, perf should use the hw breakpoint, but hw breakpoint shouldn't use perf, (in fact that's what we agreed on with hpa IIRC). But right now there is a kind of circular dependency between both that makes perf always built-in in x86. I'm all for solving that by making hw breakpoint independant from perf, but I think Ingo had mixed feelings about doing it that way. And he had valid concerns on that as well. So we are a bit stuck :) Also some comments below: > > Suggested-by: Ingo Molnar <mi...@ke...> > Signed-off-by: Andi Kleen <ak...@li...> > [josh: Rebased; HW_BREAKPOINTS marked as EXPERT; commit message updated > with new statistics.] > Signed-off-by: Josh Triplett <jo...@jo...> > --- > arch/x86/Kconfig | 12 +++++++++--- > arch/x86/kernel/Makefile | 3 ++- > arch/x86/kvm/Kconfig | 1 + > 3 files changed, 12 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > index 0af5250..12e3386 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -70,9 +70,6 @@ config X86 > select HAVE_KERNEL_XZ > select HAVE_KERNEL_LZO > select HAVE_KERNEL_LZ4 > - select HAVE_HW_BREAKPOINT > - select HAVE_MIXED_BREAKPOINTS_REGS > - select PERF_EVENTS > select HAVE_PERF_EVENTS_NMI > select HAVE_PERF_REGS > select HAVE_PERF_USER_STACK_DUMP > @@ -587,6 +584,15 @@ config SCHED_OMIT_FRAME_POINTER > > If in doubt, say "Y". > > +config HW_BREAKPOINTS > + bool "Enable hardware breakpoints support" if EXPERT > + default y > + select HAVE_HW_BREAKPOINT > + select HAVE_MIXED_BREAKPOINTS_REGS > + ---help--- > + Enable support for x86 hardware breakpoints for debuggers > + and perf. This will implicitly enable perf-events. > + HW_BREAKPOINTS must depend on HAVE_HW_BREAKPOINT, not the other way round. HAVE_* Kconfig symbols must only express constant backend support, not a variable user choice... ie, that's what I did in this patchset https://lkml.org/lkml/2011/7/14/163 Thanks. |
From: Josh T. <jo...@jo...> - 2014-03-10 17:21:02
|
On Mon, Mar 10, 2014 at 05:14:11PM +0100, Frederic Weisbecker wrote: > On Sun, Mar 09, 2014 at 08:12:53PM -0700, Josh Triplett wrote: > > From: Andi Kleen <ak...@li...> > > > > Make HW_BREAKPOINTS a config option. HW_BREAKPOINTS depends on perf. > > This allows disabling PERF_EVENTS for systems that don't need it (e.g. > > anything not used for development) This then allows disabling IRQ_WORK > > and INSTRUCTION_DECODER. > > > > This change reduces the size of allnoconfig by 161k, a full 7.6% > > reduction: > > text data bss dec hex filename > > 788816 131952 1214432 2135200 2094a0 vmlinux-allnoconfig-with-perf > > 666361 93020 1214368 1973749 1e1df5 vmlinux-allnoconfig-no-perf > > > > bloat-o-meter summary: > > add/remove: 1/1040 grow/shrink: 1/25 up/down: 82/-152214 (-152132) > > My review hasn't changed much since last time (https://lkml.org/lkml/2013/10/4/483), > so I'm mostly copy pasting it here so that we can discuss if you want: > > I'm personally not against the idea of allowing hw breakpoint support > to be disabled but hpa did not like much the idea. I must say he had > valid concerns though, see the thread of this patchset: https://lkml.org/lkml/2011/7/14/163 > > IMHO, a better solution would be to make the hw breakpoint subsystem work > without using perf. It looks like hpa's concern was primarily that it should be possible to have CONFIG_HW_BREAKPOINTS=y without CONFIG_PERF_EVENTS. I agree, but since right now you can't have *either* of them disabled, allowing both to be disabled seems like an improvement. > > +config HW_BREAKPOINTS > > + bool "Enable hardware breakpoints support" if EXPERT > > + default y > > + select HAVE_HW_BREAKPOINT > > + select HAVE_MIXED_BREAKPOINTS_REGS > > + ---help--- > > + Enable support for x86 hardware breakpoints for debuggers > > + and perf. This will implicitly enable perf-events. > > + > > HW_BREAKPOINTS must depend on HAVE_HW_BREAKPOINT, not the other way round. > HAVE_* Kconfig symbols must only express constant backend support, not a variable user choice... > > ie, that's what I did in this patchset > https://lkml.org/lkml/2011/7/14/163 Fair enough. - Josh Triplett |
From: Josh T. <jo...@jo...> - 2014-03-10 20:50:40
|
On Mon, Mar 10, 2014 at 05:14:11PM +0100, Frederic Weisbecker wrote: > On Sun, Mar 09, 2014 at 08:12:53PM -0700, Josh Triplett wrote: > > --- a/arch/x86/Kconfig > > +++ b/arch/x86/Kconfig > > @@ -70,9 +70,6 @@ config X86 > > select HAVE_KERNEL_XZ > > select HAVE_KERNEL_LZO > > select HAVE_KERNEL_LZ4 > > - select HAVE_HW_BREAKPOINT > > - select HAVE_MIXED_BREAKPOINTS_REGS > > - select PERF_EVENTS > > select HAVE_PERF_EVENTS_NMI > > select HAVE_PERF_REGS > > select HAVE_PERF_USER_STACK_DUMP > > @@ -587,6 +584,15 @@ config SCHED_OMIT_FRAME_POINTER > > > > If in doubt, say "Y". > > > > +config HW_BREAKPOINTS > > + bool "Enable hardware breakpoints support" if EXPERT > > + default y > > + select HAVE_HW_BREAKPOINT > > + select HAVE_MIXED_BREAKPOINTS_REGS > > + ---help--- > > + Enable support for x86 hardware breakpoints for debuggers > > + and perf. This will implicitly enable perf-events. > > + > > HW_BREAKPOINTS must depend on HAVE_HW_BREAKPOINT, not the other way round. > HAVE_* Kconfig symbols must only express constant backend support, not a variable user choice... > > ie, that's what I did in this patchset > https://lkml.org/lkml/2011/7/14/163 So, the converse of that is that makefiles and code should never depend on HAVE_* symbols; they should only be used in Kconfig as dependencies. Right now, piles of things use HAVE_HW_BREAKPOINT to select things to build. So, the biggest chunk of this change is introducing a new dummy symbol HW_BREAKPOINTS depending on HAVE_HW_BREAKPOINT and HAVE_MIXED_BREAKPOINTS_REGS, and changing almost all the references to HAVE_HW_BREAKPOINT to HW_BREAKPOINTS. That's effectively patch 2/6 of your series. Would you consider reviving that patch, or should I pull that patch into this series instead? - Josh Triplett |
From: Andi K. <ak...@li...> - 2014-03-10 17:41:34
|
> > And in fact, why do we need ptrace_get/set_debugreg(DR6) without HW_BREAKPOINTS? > > Good question. Andi? dr6 controls single stepping which is useful even without hw/breakpoints. Basically gdb will still work for 95+% of all use cases, everyone not using watch points. -Andi |
From: Andi K. <ak...@li...> - 2014-03-10 18:18:16
|
On Mon, Mar 10, 2014 at 10:41:06AM -0700, Andi Kleen wrote: > > > And in fact, why do we need ptrace_get/set_debugreg(DR6) without HW_BREAKPOINTS? > > > > Good question. Andi? > > dr6 controls single stepping which is useful even without hw/breakpoints. > > Basically gdb will still work for 95+% of all use cases, everyone > not using watch points. Actually thinking about it more gdb should be using PTRACE_SINGLESTEP, not direct access to DR6. So removing it is likely ok (but should verify) -Andi |