Roofline Model

Posted: Mon Jul 27, 2015 7:49 am
by GrischaJacobs

I would like to make some Roofline measurements on these CPUs (Intel® Xeon® Processor E5-4650, Intel® Xeon® Processor E7-8837, Intel® Xeon® Processor E7-4890 v2, Intel® Xeon® Processor E5-2680 v3) and on the Xeon Phi. A colleague told me that he evaluated PAPI a while ago and that it did not support the uncore events needed to measure L3 cache misses. As I understand it, this is no longer true, right? I need this metric to compute the operational intensity. Looking at papi_avail, I see that PAPI_L3_LDM, PAPI_L3_STM, PAPI_L3_ICM, and PAPI_L3_DCM are not available. Will PAPI_L3_TCM be sufficient?

What is the best way to make these measurements? Does anyone have experience with this?
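
To make the question concrete, here is a rough sketch of the calculation I have in mind. It is only an illustration under assumptions: a 64-byte cache line, and that every L3 miss (for example a PAPI_L3_TCM count) corresponds to one cache line moved from memory. That approximation probably ignores hardware prefetch and write-back traffic, so it may underestimate the real memory traffic. The function names and the example numbers are made up.

```python
# Sketch: operational intensity and Roofline bound from counter values.
# Assumption: each L3 miss transfers exactly one 64-byte cache line.
CACHE_LINE_BYTES = 64


def operational_intensity(flops, l3_misses, line_bytes=CACHE_LINE_BYTES):
    """Return FLOPs per byte of estimated memory traffic."""
    bytes_moved = l3_misses * line_bytes
    if bytes_moved == 0:
        raise ValueError("no L3 misses recorded; intensity is undefined")
    return flops / bytes_moved


def attainable_gflops(intensity, peak_gflops, peak_bw_gbs):
    """Roofline bound: min(peak compute, intensity * peak bandwidth)."""
    return min(peak_gflops, intensity * peak_bw_gbs)


if __name__ == "__main__":
    # Hypothetical counter readings, not measurements.
    oi = operational_intensity(flops=1_000_000, l3_misses=1_000)
    print(f"operational intensity: {oi:.3f} FLOP/byte")
    print(f"bound: {attainable_gflops(oi, 100.0, 50.0):.1f} GFLOP/s")
```

The peak compute and bandwidth values would have to come from the data sheet or from a separate benchmark for each machine.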

regards, Grischa

Output of papi_avail | grep misses (PAPI 5.4.0 installed):
Name         Code        Avail  Deriv  Description
PAPI_L1_DCM  0x80000000  Yes    No     Level 1 data cache misses
PAPI_L1_ICM  0x80000001  Yes    No     Level 1 instruction cache misses
PAPI_L2_DCM  0x80000002  Yes    Yes    Level 2 data cache misses
PAPI_L2_ICM  0x80000003  Yes    No     Level 2 instruction cache misses
PAPI_L3_DCM  0x80000004  No     No     Level 3 data cache misses
PAPI_L3_ICM  0x80000005  No     No     Level 3 instruction cache misses
PAPI_L1_TCM  0x80000006  Yes    Yes    Level 1 cache misses
PAPI_L2_TCM  0x80000007  Yes    No     Level 2 cache misses
PAPI_L3_TCM  0x80000008  Yes    No     Level 3 cache misses
PAPI_L3_LDM  0x8000000e  No     No     Level 3 load misses
PAPI_L3_STM  0x8000000f  No     No     Level 3 store misses
PAPI_TLB_DM  0x80000014  Yes    Yes    Data translation lookaside buffer misses
PAPI_TLB_IM  0x80000015  Yes    No     Instruction translation lookaside buffer misses
PAPI_TLB_TL  0x80000016  No     No     Total translation lookaside buffer misses
PAPI_L1_LDM  0x80000017  Yes    No     Level 1 load misses
PAPI_L1_STM  0x80000018  Yes    No     Level 1 store misses
PAPI_L2_LDM  0x80000019  No     No     Level 2 load misses
PAPI_L2_STM  0x8000001a  Yes    No     Level 2 store misses
PAPI_BTAC_M  0x8000001b  No     No     Branch target address cache misses
PAPI_PRF_DM  0x8000001c  No     No     Data prefetch cache misses
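
Since I need to repeat this check on several machines, a small script along these lines could pick out the available events from the listing above. It is a sketch that assumes each line has the five whitespace-separated fields shown (name, code, availability, derived flag, description). The helper names are my own and the sample is just the L3 lines copied from the output.

```python
# Sketch: find which preset events a papi_avail listing reports as available.
SAMPLE = """\
PAPI_L3_DCM 0x80000004 No No Level 3 data cache misses
PAPI_L3_ICM 0x80000005 No No Level 3 instruction cache misses
PAPI_L3_TCM 0x80000008 Yes No Level 3 cache misses
PAPI_L3_LDM 0x8000000e No No Level 3 load misses
PAPI_L3_STM 0x8000000f No No Level 3 store misses
"""


def parse_papi_avail(text):
    """Parse listing lines into dicts; lines that do not match are skipped."""
    events = []
    for line in text.splitlines():
        parts = line.split(None, 4)  # the description may contain spaces
        if len(parts) < 5 or not parts[0].startswith("PAPI_"):
            continue
        name, code, avail, derived, desc = parts
        events.append({
            "name": name,
            "code": int(code, 16),
            "available": avail == "Yes",
            "derived": derived == "Yes",
            "description": desc.strip(),
        })
    return events


def available_events(events, prefix):
    """Names of available events whose name starts with the given prefix."""
    return [e["name"] for e in events
            if e["available"] and e["name"].startswith(prefix)]


if __name__ == "__main__":
    print(available_events(parse_papi_avail(SAMPLE), "PAPI_L3_"))
```

On the sample lines this leaves PAPI_L3_TCM as the only available L3 event, which matches what I see in the listing.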