PAPI results vary

Open discussion of PAPI.

Postby Jingcha » Thu Apr 26, 2012 5:50 pm

Hello Everyone
I am playing around with PAPI to do some performance measurements, specifically TOT_CYC, TOT_INS, and L2_DCM.
However, I am not quite sure why the results vary. I tried 4 scenarios and compared the results:
Scenario 1:
Code:
PAPI_start()
for i = 0 to 500
    call func()
end for
PAPI_stop()

Results: TOT_CYC = 23330897, TOT_INS = 79542, L2_DCM = 779

Scenario 2:
Code:
for i = 0 to 500
    PAPI_start()
    call func()
    PAPI_stop()
end for
sum_the_values()

Results: TOT_CYC = 443303, TOT_INS = 146898, L2_DCM = 830

Scenario 3:
Code:
void func()
{
    PAPI_start()
    do_something()
    PAPI_stop()
}
int main()
{
    for i = 0 to 500
        call func()
    end for
    sum_the_results()
}

Results: TOT_CYC = 171056, TOT_INS = 128893, L2_DCM = 445


Questions:
1. Scenario 1 seems to consume a LOT of cycles, yet its TOT_INS is much lower than in Scenario 2 or 3.
2. I noticed that in Scenarios 2 and 3, the 1st iteration alone takes an abnormally large number of cycles and instructions. Is there a specific reason? For example, in Scenario 3 the 1st iteration took approx. 13,000 CYC and 4,200 INS, whereas the remaining iterations were fairly constant at around 450 CYC and 250 INS.
3. What is the overhead of the PAPI_start() and PAPI_stop() calls? Also, does PAPI_stop() count the cycles and instructions that it itself uses?
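
One way to look at question 2 is to keep the per-iteration readings and average them with the first iteration(s) left out, so the expensive first pass does not dominate the sum. The following is only a minimal sketch in plain C: steady_state_mean() is a hypothetical helper, not a PAPI function, and it assumes the per-iteration counter values have already been stored in an array.

```c
#include <stddef.h>

/* Hypothetical helper (not part of PAPI): mean of per-iteration counter
 * readings after skipping the first `warmup` iterations.
 * Returns 0.0 when there is nothing left to average. */
double steady_state_mean(const long long *samples, size_t n, size_t warmup)
{
    if (samples == NULL || warmup >= n) {
        return 0.0;
    }
    long long sum = 0;
    for (size_t i = warmup; i < n; i++) {
        sum += samples[i];
    }
    return (double)sum / (double)(n - warmup);
}
```

With readings shaped like the ones quoted for Scenario 3 (one iteration near 13,000 CYC, the rest near 450), calling it with warmup = 1 gives the steady-state figure, while warmup = 0 shows how far a single first iteration pulls the average.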

thanks,
J.Joba
Jingcha
 
Posts: 2
Joined: Thu Apr 26, 2012 5:30 pm

Re: PAPI results vary

Postby andreasscc » Fri May 04, 2012 2:56 am

I can't say anything about the specific differences in your code. Maybe compiler optimization?

Speaking about benchmarking in general: my results vary too, especially for the level 2 cache. When I benchmark under Linux with perf, the variation comes from background tasks. Even the simplest background task creates a lot of "noise". I also use LVM (I installed Fedora and forgot to uncheck LVM), and this is also CPU intensive.
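
A common way to cope with this kind of noise is to repeat the whole measurement several times and report the median instead of a single run, since one run disturbed by a background task then barely moves the result. A minimal sketch in plain C, assuming the counter value of each run has already been collected into an array; median_count() and cmp_ll() are hypothetical helpers, not part of PAPI or perf.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: ascending order of long long values. */
static int cmp_ll(const void *a, const void *b)
{
    long long x = *(const long long *)a;
    long long y = *(const long long *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: median of n counter readings from repeated runs.
 * Sorts a private copy, so the caller's array is left untouched.
 * For an even n it returns the lower of the two middle values.
 * Returns -1 on empty input or allocation failure. */
long long median_count(const long long *runs, size_t n)
{
    if (runs == NULL || n == 0) {
        return -1;
    }
    long long *copy = malloc(n * sizeof *copy);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, runs, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_ll);
    long long median = copy[(n - 1) / 2];
    free(copy);
    return median;
}
```

For made-up readings such as 800, 790, 5200, 805, 795, the median stays at 800 even though one run was heavily disturbed, whereas the mean would be pulled far upwards.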

Regards,
Andreas
andreasscc
 
Posts: 5
Joined: Tue Apr 03, 2012 10:43 am

Re: PAPI results vary

Postby Jingcha » Fri May 04, 2012 6:28 pm

Well, I compiled the code with -O0.
Secondly, why would compiler optimization cause such a huge difference in the values just from moving the PAPI calls around?
Jingcha
 
Posts: 2
Joined: Thu Apr 26, 2012 5:30 pm

Re: PAPI results vary

Postby Fantome » Tue Jun 25, 2013 7:46 am

When I work with PAPI, I compile with -O0 too, but the results also vary if I change this flag to -O1, -O2 or -O3.
So I suppose the compiler changes the order of instructions, and this can affect the results we get.
Maybe load instructions and branch instructions also affect these results a lot, and the way the compiler chooses them depending on the flags is the reason for these differences?
Fantome
 
Posts: 4
Joined: Tue Jun 25, 2013 4:50 am
Location: Europe (France)

Re: PAPI results vary

Postby Fantome » Thu Jun 27, 2013 5:07 am

Now I understand why results vary with the gcc optimization flags: if the compiler notices that a result is not used, then depending on the optimization flags it can choose not to compute it! So it is normal for results to vary if this is the case in our programs.
Nevertheless, this is easy to remedy: one only needs to print the result value, for example, and the compiler has to compute it.
What I don't know yet is how much the results can vary across the gcc optimization flags (-O0, -O1, ...) when all computations are actually performed.
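
Besides printing, another way to keep a result "used" is to store it in a volatile variable, which avoids I/O inside the measured region. This is only a sketch in plain C: sum_squares() is a made-up stand-in for the measured work, and benchmark_sink and run_and_keep() are hypothetical names. The compiler may still simplify how the value is computed, but it cannot drop the store.

```c
/* Made-up stand-in for the measured work: sum of squares 1..n. */
long long sum_squares(int n)
{
    long long total = 0;
    for (int i = 1; i <= n; i++) {
        total += (long long)i * i;
    }
    return total;
}

/* A write to a volatile object is an observable side effect, so the
 * compiler has to perform the store and must produce the stored value. */
volatile long long benchmark_sink;

/* Runs the work and keeps its result alive through the volatile sink. */
long long run_and_keep(int n)
{
    long long result = sum_squares(n);
    benchmark_sink = result;
    return result;
}
```

Placing the call to run_and_keep() between the start and stop of the counters would then measure work whose result is always stored, whatever the optimization level.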
Fantome
 
Posts: 4
Joined: Tue Jun 25, 2013 4:50 am
Location: Europe (France)

