Open discussion of PAPI.


Postby cpartie » Mon Apr 28, 2014 5:11 pm


I have a Nehalem CPU and would like to count the FLOPs that my code executes. The code is a nested for loop that performs only double-precision operations:
      #define INDEX 10
      /* PAPI_start_counters() takes an int array, so no cast is needed */
      int Events[2] = {PAPI_SP_OPS, PAPI_DP_OPS};
      long long values[2];

      /* Initialize the matrix here */

      if (PAPI_start_counters(Events, 2) != PAPI_OK)
         printf("ERROR at init.");

      /* Halve every element of the matrix (one division per element) */
      for ( j = 0; j < INDEX; j++ )
         for ( k = 0; k < INDEX; k++ )
            mresult[k][j] = mresult[k][j]/2;

      if (PAPI_stop_counters(values, 2) != PAPI_OK)
         printf("ERROR at end.");

      printf( "\n \n single precision: %lld double precision: %lld \n \n", values[0], values[1] );

When I compile it with the -O2 flag, I get the following:

single precision: 0 double precision: 100

which is what I expected.
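To spell out why 100 is the expected count: the loop nest touches each of the INDEX x INDEX = 10 x 10 elements once and performs one double-precision division per element. A minimal sketch of that arithmetic, independent of PAPI (the helper name count_dp_divisions is hypothetical and only for illustration):

```c
#include <stddef.h>

#define INDEX 10  /* same loop bound as in the code above */

/* Hypothetical helper, not part of PAPI: runs the same loop nest as the
   post and counts one double-precision division per matrix element. */
long long count_dp_divisions(double m[INDEX][INDEX])
{
    long long ops = 0;
    for (size_t j = 0; j < INDEX; j++) {
        for (size_t k = 0; k < INDEX; k++) {
            m[k][j] = m[k][j] / 2;  /* one double-precision division */
            ops++;
        }
    }
    return ops;  /* equals INDEX * INDEX */
}
```

This only counts source-level operations; it says nothing about how the hardware counters behind PAPI_SP_OPS and PAPI_DP_OPS classify the instructions the compiler actually emits.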

When I compile with the -O3 flag, I get the following:
single precision: 150 double precision: 100

I know I should only be looking at the PAPI_DP_OPS value, but I am curious to know exactly why PAPI_SP_OPS is incremented when the -O3 flag vectorizes the for loop.