From: Jason Y. <jas...@am...> - 2008-01-29 21:49:24
Hi,

This is the first attempt to extend OProfile to support Instruction Based Sampling (IBS), which is available on AMD Family 10h processors. The specification of IBS is described in section 2.17.2 of "BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h Processors". IBS provides a wide range of precise information about the instruction fetch and execution phases. The document "Instruction-Based Sampling: A New Performance Analysis Technique for AMD Family 10h Processors" explains and demonstrates the uses of IBS in detail.

The patches are made against the head of CVS. They require a separate kernel patch to work correctly on Family 10h processors with existing kernels.

Design Outline
================

= Terms =

EBS: Event based sampling
IBS: Instruction based sampling

= opcontrol changes =

Three new options are added to the opcontrol script. "--ibs-fetch=#count" and "--ibs-op=#count" enable IBS fetch and IBS op sampling respectively and specify the max count for each. "--ibs-fetch=0" and "--ibs-op=0" disable IBS. "--no-event" is added to clear the current event selection in daemonrc.

A profile session can be taken with IBS and EBS enabled simultaneously. They can also be used independently of each other.

= Driver interface changes =

Two directories, ibs_fetch and ibs_uops, are added to oprofilefs, allowing the MSRs to be controlled through the oprofile.ko module. Both directories contain the files "enable" and "max_count". The "enable" file enables or disables the functionality controlled by the directory containing it. The "max_count" file specifies the maximum count value of the periodic op/fetch counter (bits 15:0 of MSR 0xC001_1030 and 0xC001_1033).

In addition to the files mentioned above, the "ibs_fetch" directory contains a "ran_enable" file. It corresponds to bit 57 of MSR 0xC001_1030. When it is enabled, bits 3:0 of the fetch counter are randomized when IBS fetch is set to start the fetch counter.
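As a rough illustration of the register fields behind "max_count" and "ran_enable", here is a minimal Python sketch. It models only the two fields named above (bits 15:0 and bit 57). The function names are hypothetical and are not part of the patch. How the 16-bit value relates to the actual sampling interval, and where the enable bits live, is defined by the BKDG and is not modeled here.

```python
# Sketch only: models the two control-register fields named in this post.
# All other bits of these MSRs (including the enable bits) are left untouched.

IBS_FETCH_CTL_MSR = 0xC0011030  # MSR 0xC001_1030, used by ibs_fetch
IBS_OP_CTL_MSR = 0xC0011033     # MSR 0xC001_1033, used by ibs_uops

MAX_COUNT_MASK = 0xFFFF         # bits 15:0, written through "max_count"
FETCH_RAND_EN_BIT = 57          # bit 57 of the fetch MSR, "ran_enable"
MSR_WIDTH_MASK = (1 << 64) - 1  # MSRs are 64 bits wide


def with_max_count(msr_value: int, max_count: int) -> int:
    """Return msr_value with bits 15:0 replaced by max_count (hypothetical helper)."""
    if not 0 <= max_count <= MAX_COUNT_MASK:
        raise ValueError("max_count must fit in 16 bits")
    return ((msr_value & ~MAX_COUNT_MASK) | max_count) & MSR_WIDTH_MASK


def with_fetch_randomization(msr_value: int, enabled: bool) -> int:
    """Return msr_value with bit 57 set or cleared (hypothetical helper)."""
    bit = 1 << FETCH_RAND_EN_BIT
    if enabled:
        return (msr_value | bit) & MSR_WIDTH_MASK
    return msr_value & ~bit & MSR_WIDTH_MASK


def max_count_of(msr_value: int) -> int:
    """Extract the 16-bit max count field from a control value."""
    return msr_value & MAX_COUNT_MASK


if __name__ == "__main__":
    value = with_fetch_randomization(with_max_count(0, 0x1000), True)
    print(hex(value))  # 0x200000000001000
```

The helpers return new values instead of writing to hardware, so the bit arithmetic can be checked without access to the MSRs.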

= Daemon changes =

IBS events must be distinguished from EBS events, and IBS events are not uniform in length when read from the buffer. To handle both, two escape codes, "IBS_FETCH_SAMPLE" and "IBS_OP_SAMPLE", and their handlers are added.

Each IBS sample encapsulates many pieces of data. For example, a single IBS fetch sample contains information about instruction cache L2 TLB miss, instruction cache L1 TLB miss, L1 TLB page size, instruction cache miss, linear address, physical address, etc. To reuse the current code in the daemon, one IBS event is expanded into a number of event based sampling (EBS) events in the escape code handler. The EBS events are then processed by the existing daemon code and written to sample files.

= Reporting tool changes =

The virtual address associated with an IBS fetch may lie in the middle of an instruction. opreport and opannotate are modified to take this into consideration when printing out reports.

= Known issues =

The escape code collision issue between Xenoprofile and the Cell processor also affects the escape code values used by the IBS handlers. Further discussion is needed to settle this issue.

The current implementation of the reporting tool caches the entire output before printing. It actually only needs to save one previous line.

References
================

"BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h Processors"
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/31116.pdf

Drongowski, Paul. "Instruction-Based Sampling: A New Performance Analysis Technique for AMD Family 10h Processors". 2007.
http://developer.amd.com/assets/AMD_IBS_paper_EN.pdf
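
To illustrate the expansion described under "Daemon changes", here is a minimal Python sketch of turning one IBS fetch sample into several EBS-style events. It is not the daemon code. The record layout, the derived event names and the rule of emitting one event per set flag are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class IbsFetchSample:
    """Hypothetical, simplified view of one IBS fetch sample."""
    linear_address: int
    ic_miss: bool       # instruction cache miss
    l1_tlb_miss: bool   # instruction cache L1 TLB miss
    l2_tlb_miss: bool   # instruction cache L2 TLB miss


# Hypothetical derived-event names; the real names are not given in this post.
FETCH_FLAG_EVENTS = (
    ("ic_miss", "IBS_FETCH_IC_MISS"),
    ("l1_tlb_miss", "IBS_FETCH_L1_TLB_MISS"),
    ("l2_tlb_miss", "IBS_FETCH_L2_TLB_MISS"),
)


def expand_fetch_sample(sample: IbsFetchSample) -> list:
    """Expand one IBS fetch sample into (event_name, address) pairs.

    Every sample yields one catch-all event plus one event per flag that
    is set, so the result can be fed to code that expects EBS samples.
    """
    events = [("IBS_FETCH_ALL", sample.linear_address)]
    for attribute, event_name in FETCH_FLAG_EVENTS:
        if getattr(sample, attribute):
            events.append((event_name, sample.linear_address))
    return events


if __name__ == "__main__":
    sample = IbsFetchSample(0x400123, ic_miss=True,
                            l1_tlb_miss=False, l2_tlb_miss=True)
    for name, address in expand_fetch_sample(sample):
        print(name, hex(address))
```

Because every derived event carries the same address, downstream code written for EBS samples can attribute each one to a sample file in the usual way.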