
Access and Read

Posted: Tue Jun 03, 2014 4:42 pm
by volvicer
Hi, I have two questions about PAPI:

1.) What is the difference between a cache read and a cache access? How does each relate to a miss?
2.) I sample PAPI on one socket with 4 cores, and each core gathers L3 cache misses. Finally, I sum them up into one value. Is this the right way to do it, or do I get the same value four times?

Best regards

Re: Access and Read

Posted: Thu Jun 05, 2014 8:18 pm
by jagode00
What architecture are you running on?

Re: Access and Read

Posted: Thu Jul 03, 2014 9:22 am
by James Ralph
1. That depends on the architecture you are running on. You can see which hardware events we map the preset event to in papi_events.csv, and then look up their definitions in the vendor documentation.

2. Usually L3 is shared at the socket level. If so, one core measuring L3 misses will be sufficient.


Re: Access and Read

Posted: Tue Feb 17, 2015 11:24 am
by volvicer
Hello, I am running my tests on an Intel Xeon E5 using the Sandy Bridge architecture.

One question about the csv file. It contains the mappings of the preset events to the native events. However, my question concerns the native events. For example, the event perf::PERF_COUNT_HW_CACHE_L1D supports the masks :READ for "read access" and :ACCESS for "hit access". What is the difference between the two? The same question applies to perf::PERF_COUNT_HW_CACHE_LL.


Re: Access and Read

Posted: Wed Feb 18, 2015 12:46 pm
by jagode00

An ACCESS can be either a “read” or a “write”.
More specifically, the event enumeration for perf::PERF_COUNT_HW_CACHE_L1D is a 2-dimensional space:
{ load, store, prefetch } x { accesses, misses }
Users have to pass one value from each dimension and, depending on whether the hardware supports that type and combination, the kernel provides a counter.

For example, L1 DCache Loads (reads) should be equivalent to: PERF_COUNT_HW_CACHE_L1D:READ:ACCESS
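
To make the 2-dimensional space concrete, here is a minimal Python sketch of how the Linux perf_event interface packs one cache, one operation, and one result into a single config value. The numeric constants are those given in linux/perf_event.h (see perf_event_open(2)), where load, store, and prefetch are named READ, WRITE, and PREFETCH. The helper names cache_config and event_name are hypothetical, and the event strings simply follow the naming used in this thread; whether a given combination actually yields a counter still depends on the hardware and kernel.

```python
# Constants as defined in linux/perf_event.h (documented in perf_event_open(2)).
CACHE_IDS = {"L1D": 0, "L1I": 1, "LL": 2}            # which cache
CACHE_OPS = {"READ": 0, "WRITE": 1, "PREFETCH": 2}   # load, store, prefetch
CACHE_RESULTS = {"ACCESS": 0, "MISS": 1}             # accesses, misses


def cache_config(cache, op, result):
    """Pack the three choices into the config value used with PERF_TYPE_HW_CACHE.

    Layout: cache id | (op id << 8) | (result id << 16).
    (Hypothetical helper name, for illustration only.)
    """
    return CACHE_IDS[cache] | (CACHE_OPS[op] << 8) | (CACHE_RESULTS[result] << 16)


def event_name(cache, op, result):
    """Build an event string in the style used in this thread (hypothetical helper)."""
    return f"perf::PERF_COUNT_HW_CACHE_{cache}:{op}:{result}"


# Enumerate the whole { READ, WRITE, PREFETCH } x { ACCESS, MISS } space for L1D.
for op in CACHE_OPS:
    for result in CACHE_RESULTS:
        name = event_name("L1D", op, result)
        print(f"{name} -> config {cache_config('L1D', op, result):#07x}")
```

In this encoding :READ selects the operation while :ACCESS and :MISS select the result, so they are not alternatives to each other: READ:ACCESS counts loads that reference the cache, and READ:MISS counts the loads that missed.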