I have started testing with OpenSpeedShop, using PAPI for hardware counters, and am finding inconsistencies between
the older dual core (275) and the newer quad core (2380) Opterons used as compute nodes in one of our clusters.
Going back to the PAPI tests, I see that, for example, ctests/flops gives very different counts on the two.
The source for this test states that the number of flops I should be seeing on Intel compatible architectures is 2*(INDEX^3).
With INDEX=1000, I get Total flpins: 2000000000 on the quad core systems, which matches the expected value, but I get Total flpins: 6000000000 on the dual core systems, which is three times too high.
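
To make the size of the discrepancy explicit, here is a quick sanity check of the arithmetic. It is only a sketch: the 2*(INDEX^3) formula is the one quoted from the test source, the two counts are the ones reported above, and the function name expected_flops is made up for illustration and is not part of PAPI.

```python
def expected_flops(index: int) -> int:
    """Expected flop count according to the formula quoted from the test source."""
    return 2 * index ** 3


INDEX = 1000

# Counts reported by ctests/flops on the two node types (from the runs above).
observed = {
    "quad core (2380)": 2_000_000_000,
    "dual core (275)": 6_000_000_000,
}

expected = expected_flops(INDEX)
for system, count in observed.items():
    ratio = count / expected
    print(f"{system}: observed {count}, expected {expected}, ratio {ratio:.1f}")
```

The ratio comes out as 1.0 for the quad core nodes and 3.0 for the dual core nodes, so the dual core count is off by a constant factor rather than by noise.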
I have tried setting the environment variable PAPI_OPTERON_FP to see if this would make a difference, but to no avail.
Any suggestions for how to make the two types of Opteron systems consistent?