From: Harry M. <hj...@ta...> - 2005-12-16 03:10:58
The section on unloading the kernel module is unclear:
http://www.cs.utk.edu/~vose/c-stuff/oprofile.html#unloadable

====
4.2. Unloading the kernel module

Note: This section applies to 2.4 kernels only, OProfile in 2.5 can be unloaded safely.

The kernel module can be unloaded, but is designed to take very little memory when profiling is not underway. There is no need to unload the module between profiler runs.

lsmod and similar utilities will still show the module's use count as -1. However, this is not to be relied on - the module will become unloadable some short time after stopping profiling.

Note that by default module unloading is disabled when used on SMP systems. This is because of a small chance of a module unload race crashing the kernel. As the race is very small, it is allowed to re-enable the module unload by specifying the "allow_unload" parameter to the module :

    modprobe oprofile allow_unload=1

This option can be DANGEROUS and should only be used on non-production systems.
===

Does this mean that on kernels >2.5, the module can be unloaded safely even if it IS an SMP system? Or that regardless of kernel version, SMP systems will not allow a module unload?

I'm running oprofile on a SMP opteron kernel 2.6.11, with the PAPI patch and cannot unload the oprofile module. Because of that, I cannot run the PAPI-enabled HPCToolkit without rebooting.

I guess the only way to unload it is to load it with the above line and risk a crash? Or is there other magic that would force the unload?

Thanks!
Harry
From: John L. <le...@mo...> - 2005-12-16 03:20:51
On Thu, Dec 15, 2005 at 07:03:54PM -0800, Harry Mangalam wrote:

> Does this mean that on kernels >2.5, the module can be unloaded safely even
> if it IS an SMP system?

Yes. Use opcontrol --deinit (or unmount oprofilefs and unload directly).

john
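For anyone following along, the two routes named in this reply could be sketched roughly as below. This is only a sketch under assumptions: it assumes oprofilefs is mounted at /dev/oprofile (check /proc/mounts on your own system), that the daemon has already been stopped, and the function names and the DRY_RUN switch are made up for illustration.

```shell
#!/bin/sh
# Sketch only: two ways to unload the oprofile module on a 2.6 kernel.
# Assumption: oprofilefs is mounted at /dev/oprofile; set OPROFILE_MNT otherwise.
# Set DRY_RUN=1 to print the commands instead of running them (hypothetical switch).

OPROFILE_MNT="${OPROFILE_MNT:-/dev/oprofile}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Route 1: let opcontrol do the unloading.
unload_via_opcontrol() {
    run opcontrol --deinit
}

# Route 2: unmount oprofilefs, then remove the module by hand.
# Assumes the daemon is no longer running; needs root when not in dry-run mode.
unload_directly() {
    run umount "$OPROFILE_MNT" && run rmmod oprofile
}
```

With DRY_RUN=1 set, calling unload_directly only prints the umount and rmmod commands, which is a harmless way to see what would be run before doing it for real as root.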
From: Harry M. <hj...@ta...> - 2005-12-20 18:08:13
FAQ contribution:

This is probably longer and less specific than what you wanted, but I had to write up this little HOWTO stanza for our own group, so you're welcome to use whatever part of it you'd like for inclusion or clarification in your own docs. The end of the Oprofile section has a bit that discusses the conflict between Oprofile and the PAPI/HPCToolkit approach. If I've misrepresented oprofile, please let me know - I'm but a simple user.

FAQ entry for the usage & interaction of PAPI & Oprofile

Many Linux machines will be set up to use both oprofile (now available in the 2.6 kernel source as a module [CONFIG_OPROFILE=m]) and tools which require the PAPI API to do performance profiling (such as the HPCToolkit's hpcrun [http://www.hipersoft.rice.edu/hpctoolkit/]). The latter requires a kernel source patch and recompile to enable the PAPI API under Linux, as well as the compilation of the 'perfctr' kernel module. I believe that other SW that uses the PAPI API under Linux (such as U. Oregon's sophisticated Tuning and Analysis Utilities, tau [http://www.cs.uoregon.edu/research/tau/home.php]) also uses the perfctr module.

Many distribution kernels come with the oprofile module enabled; none that I'm aware of come with the PAPI patches applied and usable (a shame - it's very useful to developers).

Both software approaches are quite useful and yield complementary (& some overlapping) information, and both have distinct advantages. However, the two approaches cannot be used one after the other without some caution. Since both kernel modules access some of the same resources (the CPU's hardware performance counters), one must be unloaded before the other is used.

In using oprofile, the web site is a good place to start - the documentation and especially the examples are extremely useful. [http://oprofile.sourceforge.net/docs]

NB: The Ubuntu distro that I use has no usable root login by default, so all root commands are prefaced with 'sudo' to indicate a root-requiring command.
On those systems with root users, you could do this as root or even enable a root shell on an Ubuntu-like distro with 'sudo bash'.

Oprofile first requires the module loading:

    $ sudo modprobe oprofile

Second, configure the 'oprofiled' daemon before starting it collecting info. This is a different approach from the HPCToolkit and allows oprofile to analyze not only the application under investigation but the entire system while it is being profiled, including the kernel itself. The HPCToolkit is specific to particular applications and as such does not require a running daemon.

    $ sudo opcontrol --vmlinux=/path/to/vmlinux

NOTE that this is the UNCOMPRESSED Linux ELF executable, not the typical compressed vmlinuz boot image that is installed in the /boot dir.

Or, when you don't have a vmlinux or don't want to profile the kernel:

    $ sudo opcontrol --no-vmlinux

In the case of my machine:

    $ sudo opcontrol --vmlinux=/usr/src/linux-2.6.11/vmlinux

This machine is a dual Opteron. If I wanted to profile each CPU separately, I would invoke it with:

    $ sudo opcontrol --separate=cpu --vmlinux=/usr/src/linux-2.6.11/vmlinux

to report profiling on both CPUs. Note that once per-CPU separation is enabled, you have to explicitly shut it off for succeeding runs where you want the results pooled for both CPUs:
    $ sudo opcontrol --separate=none --vmlinux=/usr/src/linux-2.6.11/vmlinux

Next, start the profiling with:

    $ sudo opcontrol --start

When ready to collect info, do a 'sudo ls' to init the timeout on the sudo command so later ones don't ask for passwords. Then, for an application (an executable called ncbo in the following example), and assuming that it has been compiled with the '-g' flag:

    # first reset the counters:
    $ sudo opcontrol --reset
    # execute the command
    $ /home/hjm/nco/bin/ncbo -h -O --op_typ='-' -p /home/hjm/nco_bm \
        ipcc_dly_T85.nc ipcc_dly_T85_00.nc /home/hjm/nco_bm/ipcc.diff.nc
    # this command runs for > 60s, important as it's a statistical profiler
    # when the program ends, dump the collected statistics
    $ opreport --exclude-dependent --demangle=smart --symbols > \
        oprofile.report.full.ncbo

The above stanza is meant to be run as a shell script or moused into a shell window so there is minimal delay from resetting the counters to running the program to generating the output. This ensures that the profiling data is specific to the application that is running.

The output is a human-readable text file that will give you the time spent in each function. The poll_idle time is the time that the CPU(s) spent doing NOTHING, i.e. idling. For a lightly loaded dual-CPU machine, you would expect to obtain about 50% in poll_idle running a single serial job.

Cleaning up after Oprofile.

Since Oprofile runs as a daemon, it adds a very small amount of CPU and memory overhead to a running system. To remove that overhead, you have to explicitly kill the daemon:

    $ sudo opcontrol --shutdown

This next part is not well-documented and only causes a problem if you want to run a PAPI-based profiler such as hpcrun. You MUST remove the oprofile module and this cannot be done via the usual 'rmmod oprofile' approach.
There is a specific command to do it:

    $ sudo opcontrol --deinit

If the oprofile module is loaded and you try to run 'hpcrun' (even to get a list of available options), you'll get an unhelpful error like this:

    $ hpcrun -L
    (pid 27342): PAPI library initialization failure - expected version 50397184, dynamic library was version -3. Aborting.

This is diagnostic (I believe) that the oprofile module is still loaded and that the perfctr and oprofile modules are fighting over the CPU.

Using hpcrun:
=============

Don't forget that in order for hpcrun to work, the perfctr module has to be modprobe-loaded AND /dev/perfctr has to be chmod to 644.

Using the HPCToolkit: first make sure that the oprofile module is not loaded:

    $ lsmod |grep oprofile

should return nothing. If it gives you an indication that the oprofile module IS loaded, unload it with the command:

    $ sudo opcontrol --deinit

then load the perfctr module to allow the PAPI API access to the hardware counters.

    $ sudo modprobe perfctr

After this, it is relatively straightforward. Anything you want to profile, just run it behind the 'hpcrun' command:

    $ hpcrun (options) -- /home/hjm/nco/bin/ncbo -h -O --op_typ='-' -p /home/hjm/nco_bm \
        ipcc_dly_T85.nc ipcc_dly_T85_00.nc /home/hjm/nco_bm/ipcc.diff.nc

The (options) are typically a set of hardware counters you want to access during the run.
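The checks described above (oprofile must not be loaded, perfctr must be) can be wrapped in a small shell function so they are not forgotten before an hpcrun session. What follows is only a sketch: the function names module_loaded and hpc_preflight are made up for illustration, and it assumes the module list can be read from /proc/modules, where the module name is the first field of each line.

```shell
#!/bin/sh
# Sketch only: pre-flight check before running hpcrun.
# module_loaded and hpc_preflight are hypothetical helper names.

# module_loaded NAME [FILE]
# Succeeds if NAME is the first field of some line in FILE
# (default: /proc/modules). An exact match avoids false hits on
# modules whose names merely contain NAME.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# hpc_preflight [FILE]
# Prints "ok" if oprofile is absent and perfctr is present,
# otherwise prints what to do and returns 1.
hpc_preflight() {
    if module_loaded oprofile "$1"; then
        echo "oprofile is loaded: run 'sudo opcontrol --deinit' first"
        return 1
    fi
    if ! module_loaded perfctr "$1"; then
        echo "perfctr is not loaded: run 'sudo modprobe perfctr' first"
        return 1
    fi
    echo "ok"
}
```

Running 'hpc_preflight && hpcrun (options) -- <command to profile>' would then refuse to start hpcrun while the two modules are in the wrong state. It does not check the permissions on /dev/perfctr.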
On an Opteron, the available options can be got by running:

    $ hpcrun -L |grep Yes
    PAPI_L2_DCM   Yes  Level 2 data cache misses ()
    PAPI_L2_ICM   Yes  Level 2 instruction cache misses ()
    PAPI_FPU_IDL  Yes  Cycles floating point units are idle ()
    PAPI_TLB_DM   Yes  Data translation lookaside buffer misses ()
    PAPI_TLB_IM   Yes  Instruction translation lookaside buffer misses ()
    PAPI_L1_LDM   Yes  Level 1 load misses ()
    PAPI_L1_STM   Yes  Level 1 store misses ()
    PAPI_L2_LDM   Yes  Level 2 load misses ()
    PAPI_L2_STM   Yes  Level 2 store misses ()
    PAPI_STL_ICY  Yes  Cycles with no instruction issue ()
    PAPI_HW_INT   Yes  Hardware interrupts ()
    PAPI_BR_TKN   Yes  Conditional branch instructions taken ()
    PAPI_BR_MSP   Yes  Conditional branch instructions mispredicted ()
    PAPI_TOT_INS  Yes  Instructions completed ()
    PAPI_FP_INS   Yes  Floating point instructions ()
    PAPI_BR_INS   Yes  Branch instructions ()
    PAPI_VEC_INS  Yes  Vector/SIMD instructions ()
    PAPI_RES_STL  Yes  Cycles stalled on any resource ()
    PAPI_TOT_CYC  Yes  Total cycles ()
    PAPI_L2_DCH   Yes  Level 2 data cache hits ()
    PAPI_L1_DCA   Yes  Level 1 data cache accesses ()
    PAPI_L2_DCR   Yes  Level 2 data cache reads ()
    PAPI_L2_DCW   Yes  Level 2 data cache writes ()
    PAPI_L2_ICH   Yes  Level 2 instruction cache hits ()
    PAPI_L1_ICA   Yes  Level 1 instruction cache accesses ()
    PAPI_L1_ICR   Yes  Level 1 instruction cache reads ()
    PAPI_FML_INS  Yes  Floating point multiply instructions ()
    PAPI_FAD_INS  Yes  Floating point add instructions ()
    PAPI_FP_OPS   Yes  Floating point operations ()

These options can be requested by inserting them into the (options) space, for example, as:

    $ hpcrun -e PAPI_TOT_CYC:32767 -e PAPI_FP_OPS:32767 -e PAPI_FP_INS:32767 \
        -e PAPI_HW_INT:32767 -e PAPI_L2_DCM:32767 -- <command to profile>

[don't forget the '--' separator between the hpcrun command chain and the application]

hpcrun will profile EVERYTHING that results from the <command to profile>, so if it's a shell command, it will profile every subcommand in the shell, giving each its own
output file in the form of:

    <app_name>.PAPI_TOT_CYC.clay.ess.uci.edu.10137.0

The output files you're interested in can be processed into something usable with 'hpcquick', a perl script that calls some other HPC tools to generate the XML DB (in its own subdirectory) that the java browser 'hpcviewer' needs.

    #           src location   hpcrun output file to process
    #           vvvvvvv        vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
    hpcquick -I src/nco     -P ncwa.PAPI_TOT_CYC.clay.ess.uci.edu.10137.0

    # view the results via java hpcviewer
    hpcviewer
    # and open the './hpcquick.dbxxx/hpcquick.hpcviewer' file.

This will open a java-based source and data browser that can show you where your application is spending time.
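Typing out a long run of '-e EVENT:PERIOD' options for hpcrun, as in the example earlier in this FAQ, gets tedious. A small helper can build the list from one sampling period and a set of event names. This is a sketch only: event_args is a made-up name, and it assumes you want the same period for every event, as in the 32767 example.

```shell
#!/bin/sh
# Sketch only: build "-e EVENT:PERIOD" options for hpcrun.
# event_args is a hypothetical helper, not part of the HPCToolkit.

# event_args PERIOD EVENT...
# Prints one "-e EVENT:PERIOD" option per event, separated by spaces.
event_args() {
    period=$1
    shift
    args=""
    for ev in "$@"; do
        # Add a separating space only once args is non-empty.
        args="${args:+$args }-e $ev:$period"
    done
    printf '%s\n' "$args"
}
```

It could then be used as 'hpcrun $(event_args 32767 PAPI_TOT_CYC PAPI_FP_OPS) -- <command to profile>'. The command substitution is left unquoted on purpose so that the shell splits it back into separate options; that is safe here because event names contain no spaces.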