Performance of Linux' new perf_counters compared to perfctr?

Post by cracauer » Thu Sep 10, 2009 6:19 pm

Back in the day, when I compared how long perfmon and perfctr took to give me some basic readings at runtime (not setup time), I found perfctr to be much faster than perfmon.

The actual numbers escape me right now (I can dig them out if anybody needs them), but the difference was more than an order of magnitude. More importantly, perfctr allowed me to read the registers faster than a gettimeofday call, which allowed me to just replace a whole bunch of time-only monitoring code with perfctr. This is through PAPI, BTW. Perfmon was more like an ultra-heavy system call.
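
A comparison like the one above boils down to measuring the average cost of a single read call. Here is a minimal sketch of such a measurement, assuming Python's `time.time()` as a stand-in for gettimeofday; the helper name `per_call_cost_ns` is made up for illustration, and interpreter overhead means the absolute numbers are not comparable to C-level readings.

```python
import time


def per_call_cost_ns(fn, iterations=100_000):
    """Return the average wall-clock cost of one call to fn, in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn()
    elapsed = time.perf_counter_ns() - start
    # The loop overhead is included, so treat this as an upper bound.
    return elapsed / iterations


if __name__ == "__main__":
    # time.time() is only a stand-in for gettimeofday (assumption).
    print(f"time.time(): {per_call_cost_ns(time.time):.0f} ns per call")
```

Any other zero-argument read function can be passed in the same way to compare two mechanisms under identical loop overhead.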

Has anybody measured how the new 2.6.30 perf_counters compare? I asked on the perfctr list but (understandably) they didn't have that info ready.

Re: Performance of Linux' new perf_counters compared to perfctr?

Post by cjashfor » Thu Sep 10, 2009 8:48 pm

Using the standard mechanism provided by perf_counters for accessing counts from any thread, it's going to be about as heavy-weight as perfmon2. However, perf_counters does have a high-speed way of reading counters from a self-monitoring thread. This is done by reading the low-order bits directly from the counter register, and the high-order bits (which are maintained by the kernel's bookkeeping of counter overflow interrupts) via mmap'd memory.

Currently, this high-speed access is not used by the PAPI substrate for perf_counters (aka PCL), though this could be added as an optimization at a later time.
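
The self-monitoring read described above can be sketched as a small simulation. This is not the kernel's actual page layout: `CounterPage`, its field names, and `COUNTER_WIDTH` are hypothetical stand-ins, and the retry on a changed sequence number is an assumption about how a reader would guard against an overflow interrupt landing between the two reads.

```python
from dataclasses import dataclass

COUNTER_WIDTH = 40  # assumed width of the hardware counter, in bits


@dataclass
class CounterPage:
    """Hypothetical stand-in for the kernel-maintained mmap'd page."""
    seq: int = 0        # bumped whenever the kernel updates the page
    high_bits: int = 0  # overflow count kept by the kernel


def read_self(page, read_hw_counter):
    """Combine the kernel's high-order bits with the register's low-order bits."""
    mask = (1 << COUNTER_WIDTH) - 1
    while True:
        seq = page.seq
        high = page.high_bits
        low = read_hw_counter() & mask
        # If the page changed while we were reading, the two halves may not
        # belong together, so read both again.
        if page.seq == seq:
            return (high << COUNTER_WIDTH) | low


if __name__ == "__main__":
    page = CounterPage(seq=0, high_bits=3)
    print(read_self(page, lambda: 1234))
```

No system call is involved on this path, which is where the speed comes from: one memory read plus one register read per counter.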

Re: Performance of Linux' new perf_counters compared to perfctr?

Post by cracauer » Fri Sep 11, 2009 12:11 pm

Thanks, cjashfor. Kind of disappointing; I was hoping to stay at perfctr levels.

When perf_counters reads the bits in a self-monitoring thread, does the kernel take care of moving the contents of these registers between CPUs as the thread migrates?

Re: Performance of Linux' new perf_counters compared to perfctr?

Post by cjashfor » Fri Sep 11, 2009 1:26 pm

Yes, maintaining counts per thread is one thing perf_counters does. It can also do multiplexing of events onto counters, so that if you have more events than counters, you can get decent approximations of counts.
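
The approximation from multiplexing is usually a simple extrapolation. Here is a minimal sketch, assuming the kernel reports how long the event was enabled and how long it actually occupied a hardware counter; the function and parameter names are illustrative, not part of any API.

```python
def scale_multiplexed(raw_count, time_enabled, time_running):
    """Estimate the full count of an event that was only counted part of the time.

    raw_count    -- events observed while the event held a hardware counter
    time_enabled -- total time the event was enabled
    time_running -- time the event was actually scheduled on a counter
    """
    if time_running == 0:
        # The event never got a counter, so there is nothing to extrapolate from.
        return 0.0
    return raw_count * time_enabled / time_running


if __name__ == "__main__":
    # An event that ran for half of the enabled time gets its count doubled.
    print(scale_multiplexed(1000, time_enabled=10, time_running=5))
```

The estimate assumes the event rate was roughly uniform over the enabled period, which is why the result is an approximation rather than an exact count.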

As for performance, I would bet that perf_counters is as fast as perfctr when self-monitoring using the mmap'd memory mechanism. I don't know how perfctr could be faster on remote monitoring (or perhaps perfctr doesn't support remote monitoring?). When doing remote monitoring, you somehow need to get the counter values from other threads, and that requires at least some sort of IPI mechanism, which is pretty heavy-weight.

Re: Performance of Linux' new perf_counters compared to perfctr?

Post by cracauer » Wed Oct 14, 2009 1:22 pm

Thanks again.

In case I wasn't clear, I do self-monitoring only. My application has checkpoint markers that accumulate a record of what's happening within sections marked by symbols.

Has anything changed WRT the reading of counters using perf_counters' partially mmap'ed interface? I have the CVS version of PAPI here, but the answer isn't immediately obvious from it.
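
A checkpoint scheme of this kind could look roughly like the following. This is only a guess at the general shape, not the poster's actual code: `SectionProfiler` and its methods are hypothetical names, and a nanosecond clock stands in for the hardware counter read.

```python
import time
from collections import defaultdict


class SectionProfiler:
    """Attribute counter deltas to the section named at the previous checkpoint."""

    def __init__(self, read_counter=time.perf_counter_ns):
        self._read = read_counter      # any cheap, monotonically increasing reading
        self.totals = defaultdict(int)  # symbol -> accumulated delta
        self._current = None
        self._last = None

    def checkpoint(self, symbol):
        """Close the running section and start a new one named by symbol."""
        now = self._read()
        if self._current is not None:
            self.totals[self._current] += now - self._last
        self._current = symbol
        self._last = now

    def stop(self):
        """Close the running section without starting another."""
        self.checkpoint(None)


if __name__ == "__main__":
    prof = SectionProfiler()
    prof.checkpoint("parse")
    prof.checkpoint("eval")
    prof.stop()
    print(dict(prof.totals))
```

Each checkpoint costs exactly one counter read, so the per-read cost of the underlying interface directly determines how fine-grained the sections can be.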

Re: Performance of Linux' new perf_counters compared to perfctr?

Post by Dan Terpstra » Wed Oct 14, 2009 2:02 pm

No. We're still using the standard interface. Since you're interested in timings, you could use the PAPI cost utility (papi/src/utils/papi_cost) to do some quick measurements. I just tested our Nehalem machine running 2.6.31 with perfctr and perf_counters. Start/Stop for 2 counters is roughly 5,300 cycles for perfctr and roughly 22,000 cycles for perf_counters. Similarly, reading 2 counters is a mere 137 cycles with perfctr and about 5,600 cycles with perf_counters.
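
To put those figures side by side, the arithmetic can be sketched as below. The cycle counts are the ones quoted above; the 2.0 GHz clock in the usage lines is purely an assumed value for illustration, not the clock rate of the machine that was tested.

```python
# Cycle counts for 2 counters, as quoted in the post above.
COSTS = {
    "start/stop": {"perfctr": 5300, "perf_counters": 22000},
    "read": {"perfctr": 137, "perf_counters": 5600},
}


def slowdown(operation):
    """Return how many times more cycles perf_counters needs than perfctr."""
    cost = COSTS[operation]
    return cost["perf_counters"] / cost["perfctr"]


def cycles_to_ns(cycles, clock_ghz):
    """Convert a cycle count to nanoseconds at the given clock rate."""
    return cycles / clock_ghz


if __name__ == "__main__":
    assumed_clock_ghz = 2.0  # assumption for illustration only
    for operation in COSTS:
        print(f"{operation}: {slowdown(operation):.1f}x slower with perf_counters")
    print(f"read via perf_counters: {cycles_to_ns(5600, assumed_clock_ghz):.0f} ns")
```

The read path shows the far larger gap of the two, which matters most for code that reads counters at every checkpoint.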

