I am experimenting with PAPI and hardware counters on Power8. I think there is an issue with the PM_FLOP computation (used by PAPI_flop).

PM_FLOP is the sum of the following counters: PM_VSU{i}_{j}FLOP, where i is 1 or 2 (there are 2 VSUs per core) and j is 1, 2, 4 or 8.

These two numbers (PM_FLOP and the sum of those counters) match. Nevertheless, if we analyze the matrix-hl.c test, a problem shows up.

At the end of the test there is an error check:

    if ( event[0] == PAPI_FP_INS ) {
        /* Compare measured FLOPS to expected value */
        tmp = 2 * ( long long ) ( NROWS1 ) * ( long long ) ( NCOLS2 ) *
              ( long long ) ( NCOLS1 );
        printf("%llu \n",tmp);
        if ( abs( ( int ) values[0] - ( int ) tmp ) > ( float ) tmp * 0.05 ) {
            /* Maybe we are counting FMAs? */
            tmp = tmp / 2;
            if ( abs( ( int ) values[0] - ( int ) tmp ) >
                 ( float ) tmp * 0.05 ) {
                printf( "\n" TAB1, "Expected operation count: ", 2 * tmp );
                printf( TAB1, "Or possibly (using FMA): ", tmp );
                printf( TAB1, "Instead I got: ", values[0] );
                test_fail( __FILE__, __LINE__,
                           "Unexpected FLOP count (check vector operations)",
                           1 );
            }
        }
    }
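
To make the logic of that check easier to reason about, here is a minimal standalone sketch of the same 5% tolerance test. The helper names `within_tolerance` and `flop_count_plausible` are hypothetical and not part of PAPI. The sketch also uses `long long` arithmetic throughout, since the casts to `int` in the test would truncate for larger matrices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper (not part of PAPI): true when `measured` lies
 * within `rel_tol` (0.05 means 5%) of `expected`. */
bool within_tolerance(long long measured, long long expected, double rel_tol)
{
    assert(expected > 0 && rel_tol >= 0.0);
    return (double)llabs(measured - expected) <= (double)expected * rel_tol;
}

/* Mirrors the two-step logic of the test: accept the full operation
 * count, or half of it in case each FMA is counted as one operation. */
bool flop_count_plausible(long long measured, long long expected_ops)
{
    return within_tolerance(measured, expected_ops, 0.05)
        || within_tolerance(measured, expected_ops / 2, 0.05);
}
```

Calling `flop_count_plausible(values[0], tmp)` with the expected count computed as in the test would then replace the nested `if` statements.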

The test reports no error. Nevertheless, if I remove the first branch (the `event[0] == PAPI_FP_INS` check) and compile the test with -O3, for both float and double, I get:

    Expected operation count: 11812500
    Or possibly (using FMA): 5906250
    Instead I got: 3003761
    matrix-hl.c - DOUBLE FAILED

    Expected operation count: 11812500
    Or possibly (using FMA): 5906250
    Instead I got: 1552507
    matrix-hl.c - FLOAT FAILED

At present I think the computation of PM_FLOP is wrong. As I understand it, each PM_VSU{i}_{j}FLOP counter does not count flops but completed instructions (mnemonics). Consequently, each PM_VSU{i}_{j}FLOP should be corrected by a factor: x1 for PM_VSU{i}_1FLOP, x2 for PM_VSU{i}_2FLOP, x4 for PM_VSU{i}_4FLOP and x8 for PM_VSU{i}_8FLOP.
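
As an illustration of the proposed correction, here is a small sketch that contrasts the plain sum with the weighted sum. The struct `vsu_flop_counts` and both function names are hypothetical; the fields are assumed to hold raw counter readings already summed over both VSUs.

```c
#include <assert.h>

/* Hypothetical container for raw counter readings, summed over the VSUs. */
struct vsu_flop_counts {
    long long flop1;  /* PM_VSU{i}_1FLOP */
    long long flop2;  /* PM_VSU{i}_2FLOP */
    long long flop4;  /* PM_VSU{i}_4FLOP */
    long long flop8;  /* PM_VSU{i}_8FLOP */
};

/* Plain sum: each completed instruction counted once. */
long long unweighted_flops(const struct vsu_flop_counts *c)
{
    assert(c != 0);
    return c->flop1 + c->flop2 + c->flop4 + c->flop8;
}

/* Proposed correction: weight each counter by the number of flops
 * its instructions perform (x1, x2, x4, x8). */
long long weighted_flops(const struct vsu_flop_counts *c)
{
    assert(c != 0);
    return 1 * c->flop1 + 2 * c->flop2 + 4 * c->flop4 + 8 * c->flop8;
}
```

With readings of 10, 5, 3 and 1 for the four counters, the plain sum gives 19 while the weighted sum gives 40. The two agree only when all the work lands in the 1FLOP counter.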

In fact your test works because you compile with -O0; consequently the generated assembly contains only scalar operations, measured by PM_VSU{i}_1FLOP, where one scalar instruction is one flop. I ran some tests on dgemm, basic vector addition and FMA, and they confirm my correction.
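
The -O3 counts reported earlier in this post can be sanity-checked against this interpretation. The sketch below assumes that the compiler turns the inner loop into vector FMA instructions on 128-bit registers (2 lanes for double, 4 for float) and that vectorization is perfect. I have not verified either assumption against the generated assembly, and the function names are hypothetical.

```c
#include <assert.h>

/* Expected number of completed instructions if `fma_ops` scalar FMAs
 * are packed `lanes` per vector instruction (assumes perfect
 * vectorization, which real code only approximates). */
double expected_vector_instructions(long long fma_ops, int lanes)
{
    assert(fma_ops >= 0 && lanes > 0);
    return (double)fma_ops / lanes;
}

/* Ratio of the measured counter value to that expected count;
 * a value close to 1.0 supports the "instructions, not flops" reading. */
double measured_ratio(long long measured, long long fma_ops, int lanes)
{
    return (double)measured / expected_vector_instructions(fma_ops, lanes);
}
```

With the 5906250 FMAs from the test output, `measured_ratio(3003761, 5906250, 2)` is about 1.02 for double and `measured_ratio(1552507, 5906250, 4)` is about 1.05 for float, so both measurements sit close to the expected instruction count rather than the flop count.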

If you have access to a Power8 machine, could you verify this?

Best,

Tim