This release includes several enhancements for Intel MIC (Xeon Phi) architectures. Offload code is now supported alongside the previously released support for native code; see http://icl.cs.utk.edu/trac/papi/browser/INSTALL.txt for details. We've also enhanced support for host-side power reading from MIC and added a utility to aid in plotting the results.
Intel finally admitted that Ivy Bridge supports Floating Point measurement at least as well as Sandy Bridge and added Floating Point events to the official event table. PAPI 5.3 supports them too. See the PAPI topic: "Counting Floating Point on Sandy Bridge and Ivy Bridge" at http://icl.cs.utk.edu/projects/papi/wiki/PAPITopics:SandyFlops for details.
The linux-rapl component had a problem with dynamic range: the length of time you could measure was a function of the (random) starting value of the counter. The component has been rewritten to ensure access to the full 32 bits of dynamic range, and a test, rapl_wraparound, has been provided to estimate how long you can measure a naive gemm. The cautionary note is that you no longer get an error message on overflow, so you need to check your timings and results for reasonableness. See the PAPI topic: "AccessingRAPL" at http://icl.cs.utk.edu/projects/papi/wiki/PAPITopics:RAPL_Access for more details.
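Since overflow is no longer reported, a rough back-of-the-envelope check can help decide whether a measurement is plausible. The sketch below is not part of PAPI and does not use the rapl_wraparound test; the function names are hypothetical, and the energy unit of 2^-16 joules per count is an assumed value for illustration only (the real unit is processor-specific and should be read from your hardware). It estimates how long a 32-bit energy counter can run at a given average power before wrapping, and shows the modular subtraction that recovers a delta across a single wrap.

```python
COUNTER_BITS = 32                    # dynamic range stated in the release note
COUNTER_SPAN = 2 ** COUNTER_BITS     # number of distinct counter values

# ASSUMPTION: joules represented by one counter increment. The value
# 2**-16 is only an illustrative default; check your own processor.
ASSUMED_JOULES_PER_COUNT = 2.0 ** -16


def seconds_until_wrap(avg_power_watts, joules_per_count=ASSUMED_JOULES_PER_COUNT):
    """Estimate how long a counter starting at zero runs before it wraps."""
    if avg_power_watts <= 0:
        raise ValueError("average power must be positive")
    total_joules = COUNTER_SPAN * joules_per_count
    return total_joules / avg_power_watts


def wrapped_delta(start_count, end_count):
    """Counts elapsed between two raw readings, assuming at most one wrap."""
    return (end_count - start_count) % COUNTER_SPAN


def may_have_wrapped(elapsed_seconds, avg_power_watts,
                     joules_per_count=ASSUMED_JOULES_PER_COUNT):
    """True if the run was long enough that the counter could have wrapped."""
    return elapsed_seconds >= seconds_until_wrap(avg_power_watts, joules_per_count)


if __name__ == "__main__":
    # Hypothetical run: 100 W average draw, 20 minutes of wall-clock time.
    print("estimated seconds until wrap:", seconds_until_wrap(100.0))
    print("could a 1200 s run have wrapped?", may_have_wrapped(1200.0, 100.0))
```

If the elapsed time of a run approaches or exceeds the estimate, treat the reported energy with suspicion and consider measuring in shorter intervals.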
We made some major changes in the way we handle ctests. First, many of the ctests were built based on outdated configure switches in the makefiles. We rewrote the tests to determine at runtime whether or not they can run. This may result in more tests executing on your systems than in the past. Enjoy! Next, recognizing the value of our test suite as example code, we restructured the way make install-all works. This option now creates location-independent makefiles, so you can clone your own copy of the tests directory and modify the tests for your own purposes.
There have been several other bug fixes and enhancements:
- the Intel Haswell event table now supports PAPI_L1_ICM
- AMD Bulldozer now supports Core select masks
- the CUDA component now properly reports the number of native events
- the command_line utility no longer skips the last event in a list
- icc builds no longer add an extraneous -openmp flag
The PAPI 5.3.0 tarball can be downloaded here:
http://icl.cs.utk.edu/projects/papi/dow ... 3.0.tar.gz
for the PAPI team