From: Robert R. <rob...@am...> - 2011-11-21 16:01:23
|
CC'ing Maynard and Suravee, please look esp. at my comments about ops->cpu_type = "s390x/basic_mode_sampling_v1"; and how userland should determine min/max value for the sampling period. On 17.11.11 20:37:23, Andreas Krebbel wrote: > From: Andreas Krebbel <kr...@li...> > Existing > user space tools will complain about an unknown cpu type. In order to > be compatible with existing user space tools the `force_timer_cputype' > module parameter has been added. Setting the parameter to 1 will > force the module to return `timer' as cpu_type. The module will still > try to use hardware sampling if available and the hwsampling virtual > filesystem will be also be available for configuration. So this has a > different effect than using the generic oprofile module parameter > `timer=1'. Please use oprofile.cpu_type=timer here and also update kernel-parameters.txt, e.g.: oprofile.cpu_type= Force an oprofile cpu type This might be useful if you have an older oprofile userland or if you want common events. Format: { arch_perfmon } arch_perfmon: [X86] Force use of architectural perfmon on Intel CPUs instead of the CPU specific event set. timer: [X86] Force use of architectural NMI timer mode (see also oprofile.timer for generic hr timer mode) [s390] Force legacy basic mode sampling (report cpu_type "timer") There are already changes, to avoid conflicts, please use: git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile.git core > The following new files are provided as part of the OProfile virtual > filesystem: > > /dev/oprofile/0/enabled > /dev/oprofile/0/event > /dev/oprofile/0/count > /dev/oprofile/0/unit_mask > /dev/oprofile/0/kernel > /dev/oprofile/0/user > > In the counter file system only the values of 'event', 'count', > 'kernel', and 'user' are evaluated by the kernel module. Everything > else must contain fixed values. > > The 'event' value is used to disable (0) or enable (1) hardware > sampling. > > The 'count' value specifies the hardware sampling rate as it is passed > to the CPU measurement facility. > > The 'kernel' and 'user' flags can now be used to filter for samples > when using hardware sampling. The meaning of the flags is inverted to have 0 as defaults, see also below. > > Signed-off-by: Andreas Krebbel <kr...@li...> > --- > arch/s390/oprofile/hwsampler.c | 7 + > arch/s390/oprofile/init.c | 240 +++++++++++++++++++++++++++++++++++++++- > arch/s390/oprofile/op_counter.h | 22 +++ > 3 files changed, 263 insertions(+), 6 deletions(-) > > --- a/arch/s390/oprofile/hwsampler.c > +++ b/arch/s390/oprofile/hwsampler.c > @@ -22,6 +22,7 @@ > #include <asm/irq.h> > > #include "hwsampler.h" > +#include "op_counter.h" > > #define MAX_NUM_SDB 511 > #define MIN_NUM_SDB 1 > @@ -896,6 +897,8 @@ static void add_samples_to_oprofile(unsi > if (sample_data_ptr->P == 1) { > /* userspace sample */ > unsigned int pid = sample_data_ptr->prim_asn; > + if (!counter_config.user) > + goto skip_sample; > rcu_read_lock(); > tsk = pid_task(find_vpid(pid), PIDTYPE_PID); > if (tsk) > @@ -903,6 +906,8 @@ static void add_samples_to_oprofile(unsi > rcu_read_unlock(); > } else { > /* kernelspace sample */ > + if (!counter_config.kernel) > + goto skip_sample; As said, counter_config.kernel means not to profile the kernel. Same for .../user. > regs = task_pt_regs(current); > } > > @@ -910,7 +915,7 @@ static void add_samples_to_oprofile(unsi > oprofile_add_ext_hw_sample(sample_data_ptr->ia, regs, 0, > !sample_data_ptr->P, tsk); > mutex_unlock(&hws_sem); > - > + skip_sample: > sample_data_ptr++; > } > } > --- a/arch/s390/oprofile/init.c > +++ b/arch/s390/oprofile/init.c > @@ -2,7 +2,7 @@ > * arch/s390/oprofile/init.c > * > * S390 Version > - * Copyright (C) 2003 IBM Deutschland Entwicklung GmbH, IBM Corporation > + * Copyright (C) 2003-2011 IBM Deutschland Entwicklung GmbH, IBM Corporation > * Author(s): Thomas Spatzier (ts...@de...) > * Author(s): Mahesh Salgaonkar (ma...@li...) > * Author(s): Heinz Graalfs (gr...@li...) > @@ -14,6 +14,7 @@ > #include <linux/init.h> > #include <linux/errno.h> > #include <linux/fs.h> > +#include <linux/module.h> > > #include "../../../drivers/oprofile/oprof.h" > > @@ -22,6 +23,7 @@ extern void s390_backtrace(struct pt_reg > #ifdef CONFIG_64BIT > > #include "hwsampler.h" > +#include "op_counter.h" > > #define DEFAULT_INTERVAL 4127518 > > @@ -38,8 +40,15 @@ static unsigned long oprofile_sdb_blocks > static int hwsampler_file; > static int hwsampler_running; /* start_mutex must be held to change */ > > +static int force_timer_cputype = 0; > +module_param(force_timer_cputype, int, 0); > +MODULE_PARM_DESC(force_timer_cputype, "1 forces the oprofile kernel module " > + "to return \"timer\" as cpu_type."); oprofile.cpu_type=timer > + > static struct oprofile_operations timer_ops; > > +struct op_counter_config counter_config; > + > static int oprofile_hwsampler_start(void) > { > int retval; > @@ -72,6 +81,8 @@ static void oprofile_hwsampler_stop(void > return; > } > > +/* /dev/oprofile/hwsampling/hwsampler file ops. */ > + > static ssize_t hwsampler_read(struct file *file, char __user *buf, > size_t count, loff_t *offset) > { > @@ -109,21 +120,205 @@ static const struct file_operations hwsa > .write = hwsampler_write, This function should only allow the values 0 or 1 to be able to extend this. Currently other values may be passed. > }; > > +/* > + * /dev/oprofile/hwsampling/hw_interval file ops. > + * Check that the value is within the hardware range. > + */ > + > +static ssize_t hw_interval_read(struct file *file, char __user *buf, > + size_t count, loff_t *offset) > +{ > + return oprofilefs_ulong_to_user(oprofile_hw_interval, buf, > + count, offset); > +} > + > +static ssize_t hw_interval_write(struct file *file, char const __user *buf, > + size_t count, loff_t *offset) > +{ > + unsigned long val; > + int retval; > + > + if (*offset) > + return -EINVAL; > + retval = oprofilefs_ulong_from_user(&val, buf, count); > + if (retval) > + return retval; > + if (val < oprofile_min_interval || val > oprofile_max_interval) > + return -EINVAL; Instead of returning -EINVAL you should set the min/max value here and assign it to oprofile_hw_interval. Userland then should read the value back and fail if they are different. Doing so we no longer need /dev/oprofile/hwsampling/. > + oprofile_hw_interval = val; > + > + return count; > +} > +static ssize_t hwsampler_kernel_read(struct file *file, char __user *buf, > + size_t count, loff_t *offset) > +{ > + return oprofilefs_ulong_to_user(counter_config.kernel, > + buf, count, offset); Must return 0 for !hwsampler_file. > +} > + > +static ssize_t hwsampler_kernel_write(struct file *file, char const __user *buf, > + size_t count, loff_t *offset) > +{ > + unsigned long val; > + int retval; > + > + if (*offset) > + return -EINVAL; > + > + retval = oprofilefs_ulong_from_user(&val, buf, count); > + if (retval) > + return retval; > + > + /* No filtering with timer based sampling. */ > + if (!hwsampler_file && val == 0) > + return -EINVAL; > + > + counter_config.kernel = val; Only allow 0 and 1 here. > + > + return count; > +} > + > +static const struct file_operations kernel_fops = { > + .read = hwsampler_kernel_read, > + .write = hwsampler_kernel_write, > +}; > + > +/* > + * /dev/oprofile/0/user file ops. > + * This file can only be set to 0 for hardware sampling mode. > + */ > + > +static ssize_t hwsampler_user_read(struct file *file, char __user *buf, > + size_t count, loff_t *offset) > +{ > + return oprofilefs_ulong_to_user(counter_config.user, > + buf, count, offset); Should return 0 for !hwsampler_file. > +} > + > +static ssize_t hwsampler_user_write(struct file *file, char const __user *buf, > + size_t count, loff_t *offset) > +{ > + unsigned long val; > + int retval; > + > + if (*offset) > + return -EINVAL; > + > + retval = oprofilefs_ulong_from_user(&val, buf, count); > + if (retval) > + return retval; > + > + /* No filtering with timer based sampling. */ > + if (!hwsampler_file && val == 0) > + return -EINVAL; > + > + counter_config.user = val; Same here, allow only 0 and 1 values. > + > + return count; > +} > + > +static const struct file_operations user_fops = { > + .read = hwsampler_user_read, > + .write = hwsampler_user_write, > +}; > + > + > static int oprofile_create_hwsampling_files(struct super_block *sb, > struct dentry *root) > { > - struct dentry *hw_dir; > + struct dentry *hw_dir, *dir; > > /* reinitialize default values */ > hwsampler_file = 1; > + counter_config.kernel = 1; > + counter_config.user = 1; Both parameters usually mean 'exclude_user/exclude_kernel' to have the defaults 0. Thus, bits must be inverted in the code. [...] > + ops->cpu_type = "s390x/basic_mode_sampling_v1"; We want to keep the cpu_type naming scheme of oprofile consistent which is s390/<model> "s390" is the arch string in Linux. Please choose a model name. We want to support multiple features with the interface and thus can't use feature names here. This would break then. Userland should check for available system features by checking for files in oprofilefs which wouldn't exist if the feature is missing. So if s390/<model> is reported you could rely on the existence of basic mode sampling if /dev/oprofile/0 exists. -Robert -- Advanced Micro Devices, Inc. Operating System Research Center |