Hi,
What is the most accurate way to measure the trace cache miss rate on a Xeon (dual core) using pfmon? I looked into TC_DELIVER_MODE, but it counts the cycles spent in each operating mode (deliver or build), and it also includes cycle counts for both processors (I am only interested in the one running my code).

Thanks,
- nagy

My CPU info (from /proc/cpuinfo):
model name    : Intel(R) Xeon(TM) CPU 2.40GHz
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pebs bts

Events available:
$ pfmon -l
TC_DELIVER_MODE
BPU_FETCH_REQUEST
ITLB_REFERENCE
MEMORY_CANCEL
MEMORY_COMPLETE
LOAD_PORT_REPLAY
STORE_PORT_REPLAY
MOB_LOAD_REPLAY
PAGE_WALK_TYPE
BSQ_CACHE_REFERENCE
IOQ_ALLOCATION
IOQ_ACTIVE_ENTRIES
FSB_DATA_ACTIVITY
BSQ_ALLOCATION
BSQ_ACTIVE_ENTRIES
SSE_INPUT_ASSIST
PACKED_SP_UOP
PACKED_DP_UOP
SCALAR_SP_UOP
SCALAR_DP_UOP
64BIT_MMX_UOP
128BIT_MMX_UOP
X87_FP_UOP
TC_MISC
GLOBAL_POWER_EVENTS
TC_MS_XFER
UOP_QUEUE_WRITES
RETIRED_MISPRED_BRANCH_TYPE
RETIRED_BRANCH_TYPE
RESOURCE_STALL
WC_BUFFER
B2B_CYCLES
BNR
SNOOP
RESPONSE
FRONT_END_EVENT
EXECUTION_EVENT
REPLAY_EVENT
INSTR_RETIRED
UOPS_RETIRED
UOPS_TYPE
BRANCH_RETIRED
MISPRED_BRANCH_RETIRED
X87_ASSIST
MACHINE_CLEAR
INSTR_COMPLETED