(The following discussion does not apply to newer quad-core and higher Opteron processors)
The AMD Opteron is the first chip series from AMD that can measure and report floating point operations. Two native events measure floating point activity. One measures speculative operations that enter the FP units; the other measures operations that retire from the FP units.
The retired event generates precise event counts that scale with the amount of work done. However, it measures data movement as well as floating point operations, so its counts are consistently higher than the expected theoretical counts, often by a factor of 2 or more.
The speculative event can be configured to count only the operations typically of interest. Because these counts are speculative, they tend to exceed the expected theoretical counts, often by widely varying amounts, especially on complex production codes.
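One way to quantify how far a measured count drifts from theory is to run a kernel whose operation count is known and divide the measured value by the expected one. The following is a minimal sketch in plain C. The function names are hypothetical and not part of PAPI, and the measured value is assumed to come from whichever counter is being evaluated.

```c
/* Expected floating point operations for a dense n x n matrix multiply:
 * n*n*n multiply-add pairs, i.e. 2 * n^3 operations. */
double matmul_expected_flops(long n)
{
    return 2.0 * (double)n * (double)n * (double)n;
}

/* Ratio of a measured counter value to the expected operation count.
 * A result near 1.0 means the counter tracks theory; a result of 2.0
 * means the counter reports twice the expected work.
 * Returns 0.0 when the expected count is not positive. */
double overcount_factor(long long measured, double expected)
{
    if (expected <= 0.0)
        return 0.0;
    return (double)measured / expected;
}
```

For example, a hypothetical measured count of 4,000,000 for a 100 x 100 multiply (2,000,000 expected operations) gives a factor of 2.0.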
PAPI provides 2 preset events to count floating point operations:
- PAPI_FP_INS counts instructions passing through the floating point unit;
- PAPI_FP_OPS is intended to count something closer to theoretical floating point operations.
To minimize the overlap and maximize the usefulness of these two events on AMD Opteron, we have made the following choices:
- PAPI_FP_INS always counts retired floating point operations. This value will be precise and accurate, but will include FP loads and stores as well as computations.
- PAPI_FP_OPS counts speculative computation operations by default, but can be customized as discussed below.
As an alternative to counting speculative computations, PAPI_FP_OPS can be configured to count retired operations corrected for data movement. Unfortunately, the correction factors themselves are speculative, and can lead to undercounting errors similar in magnitude to those seen in the pure speculative counts.
Two methods are provided to allow customization of PAPI_FP_OPS:
1) The PAPI_OPTERON_FP_xxx defines.
Set these in the CFLAGS variable of Makefile.linux-perfctr-opteron.
-DPAPI_OPTERON_FP_RETIRED
-DPAPI_OPTERON_FP_SSE_SP
-DPAPI_OPTERON_FP_SSE_DP
-DPAPI_OPTERON_FP_SPECULATIVE
The default value is equivalent to:
-DPAPI_OPTERON_FP_SPECULATIVE
2) The PAPI_OPTERON_FP environment variable.
Set this to one of the following, and it will change the
behavior of PAPI_FP_OPS.
RETIRED: count all retired FP instructions
SSE_SP: correct retired counts optimized for single precision
SSE_DP: correct retired counts optimized for double precision
SPECULATIVE: count speculative computations (default)
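The mapping from the PAPI_OPTERON_FP environment variable to a counting mode can be sketched in a few lines of C. This is an illustration only, not PAPI's actual implementation: the enum and function names are hypothetical, and treating an unrecognized value the same as an unset variable is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical mode names for illustration; not PAPI identifiers. */
typedef enum {
    FP_MODE_SPECULATIVE,  /* count speculative computations (default) */
    FP_MODE_RETIRED,      /* count all retired FP instructions */
    FP_MODE_SSE_SP,       /* retired counts corrected for single precision */
    FP_MODE_SSE_DP        /* retired counts corrected for double precision */
} fp_ops_mode;

/* Map a PAPI_OPTERON_FP value to a mode. NULL means the variable is unset. */
fp_ops_mode parse_fp_mode(const char *value)
{
    if (value == NULL)
        return FP_MODE_SPECULATIVE;
    if (strcmp(value, "RETIRED") == 0)
        return FP_MODE_RETIRED;
    if (strcmp(value, "SSE_SP") == 0)
        return FP_MODE_SSE_SP;
    if (strcmp(value, "SSE_DP") == 0)
        return FP_MODE_SSE_DP;
    /* Assumption: anything else, including "SPECULATIVE", selects the default. */
    return FP_MODE_SPECULATIVE;
}

/* Read the mode from the process environment. */
fp_ops_mode fp_mode_from_environment(void)
{
    return parse_fp_mode(getenv("PAPI_OPTERON_FP"));
}
```

Keeping the string parsing separate from the getenv() call lets the mapping be checked without modifying the environment.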