PAPI Frequently Asked Questions
last edited: 9/3/02
This is the PAPI FAQ or Frequently Asked Questions List. The user should look here for
information on PAPI that is not otherwise cataloged elsewhere on the site.
1. What are the mailing lists and how do I subscribe?
There are currently two mailing lists. ptools-perfapi@ptools.org is a group for
general announcements, questions and miscellaneous topics. perfapi-devel@ptools.org is a
discussion group for the developers of PAPI; it also receives all CVS update messages
(which can be a significant amount of mail!).
To subscribe to either of the above groups, or to have your address removed from a
list, send a message containing one of the lines:
subscribe listname
unsubscribe listname
in the body of your message to majordomo@ptools.org, replacing listname with either
ptools-perfapi or perfapi-devel.
2. How do I stop PAPI_overflow, PAPI_profil or PAPI_sprofil?
After calling PAPI_stop, call the same function again with the handler or buffer set to
NULL and the threshold set to 0.
3. Why do PAPI_overflow, PAPI_profil and PAPI_sprofil work strangely with a
small threshold?
On most systems, overflow must be emulated in software by PAPI. Only on the UltraSparc
III and IRIX does the operating system support true interrupt on overflow. Therefore the
user is advised on most platforms to make sure the overflow value is no more than 1/1000th
the clock rate. The emulation handler in PAPI runs every millisecond, so the goal
of the tool designer should be to pick a value that will overflow frequently but not too
frequently. Not following these guidelines could result either in overflows never
occurring or in overflows occurring on every interrupt, which produces a flat profile.
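To put a number on the 1/1000th figure: because the emulation handler runs once per millisecond, 1/1000th of the clock rate is the number of cycles that elapse between two handler runs. The sketch below only does that arithmetic. The helper names are hypothetical and not part of PAPI, and the 500 MHz clock mentioned afterwards is an assumed example value.

```c
/* Cycles that elapse between two runs of a 1 ms emulation handler,
   i.e. 1/1000th of the clock rate given in Hz.
   Hypothetical helper, not a PAPI function. */
static unsigned long long cycles_per_tick(unsigned long long clock_hz)
{
    return clock_hz / 1000ULL;
}

/* Nonzero if `threshold` is no more than 1/1000th of the clock rate,
   which is the guideline as stated in this FAQ.
   Hypothetical helper, not a PAPI function. */
static int threshold_within_guideline(unsigned long long threshold,
                                      unsigned long long clock_hz)
{
    return threshold <= cycles_per_tick(clock_hz);
}
```

For an assumed 500 MHz processor, cycles_per_tick(500000000ULL) gives 500000, so a threshold of 100000 falls within the stated guideline while 1000000 does not.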
4. Why can't I get my Fortran programs to compile with PAPI on a Cray T3E?
The Fortran header file you include has to be preprocessed before the Fortran file can
use it. To have cpp process the file before it is sent to the compiler, add the
-F flag. For example:
f90 -F test.F -o test
5. How do I encode a native event?
Unless otherwise stated in the README file for your platform, the encoding is as
follows:
event = ((reg_code & 0xffffff) << 8 | (reg_num & 0xff))
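The formula above can be wrapped in a small C function. This is a minimal sketch; the name encode_native_event is hypothetical and not part of the PAPI API, and the README for your platform may specify a different encoding.

```c
/* Pack a register code (low 24 bits kept) and a register number
   (low 8 bits kept) into one native event code, following the
   formula given in this FAQ. Hypothetical helper, not a PAPI call. */
static unsigned int encode_native_event(unsigned int reg_code,
                                        unsigned int reg_num)
{
    return ((reg_code & 0xffffffu) << 8) | (reg_num & 0xffu);
}
```

For example, encode_native_event(0x43, 1) yields 0x4301: the register code occupies the upper 24 bits and the register number the lowest 8 bits. Any bits outside those widths are masked off rather than reported as an error.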
6. What's wrong with PAPI_LST_INS (hex code 0x43) on my Pentium?
According to the Intel documentation, the counts from this event do not intuitively
match its description. Older releases of PAPI made this preset available in the
Intel ports, but current releases do not. It does appear to work on the AMD Athlon.
7. Does PAPI support unbound or non-kernel threads?
Yes, but the counts will reflect the total events for the process. Measurements done in
other threads will all get the same values, namely those counts for the total process. For
non-bound threads, it is not necessary to call PAPI_thread_init. In most scenarios,
however, such as SMP or OpenMP compiler directives, bound threads will be the default.
Those using Pthreads should take care to set the contention scope of each thread to
PTHREAD_SCOPE_SYSTEM, unless the system is known to have a non-hybrid thread
library implementation, like Linux.
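For Pthreads users, requesting system scope might look like the sketch below. It uses only standard POSIX calls; the helper create_bound_thread and the placeholder worker function are hypothetical names introduced for illustration, and no PAPI calls are shown.

```c
#include <pthread.h>

/* Placeholder thread body (hypothetical). Per-thread measurement
   code would go here. */
static void *worker(void *arg)
{
    return arg;
}

/* Start `fn` in a new thread whose contention scope is
   PTHREAD_SCOPE_SYSTEM, i.e. a bound thread.
   Returns 0 on success or a pthread error number. */
static int create_bound_thread(pthread_t *tid,
                               void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);

    /* The attribute object is no longer needed once the thread exists. */
    pthread_attr_destroy(&attr);
    return rc;
}
```

A caller would use create_bound_thread(&tid, worker, NULL) followed by pthread_join(tid, NULL). On some systems the program must be linked against the thread library (for example with -lpthread).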
8. The numbers are funky for event 0xabc on platform XYZ, help me!
This is not a question, but I'll help you. We, the PAPI developers, cannot be experts on
the thousands of events found across all supported platforms. However, if you are using a
PAPI preset, the first thing to do is to look up the corresponding native event code using
the test case 'avail'. Then the best bet is to always go to the vendor's technical
documentation site and check the processor reference manual. If you're convinced
everything is kosher, then please feel free to send a message to the mailing list and one
of the members may be able to help you.
9. My program runs fine with 1 or 2 counters, but when I add more I get a -8,
PAPI_ECNFLCT error code. The error text says, "Event exists, but cannot be counted
due to hardware resource limitations". What does this mean?
Many systems have only a few hardware performance counter registers, so you can
measure only a few metrics at once. Some platforms may support counter multiplexing, which
gives the user the illusion of a larger number of registers by time sharing the
performance registers. On the R10K series, the IRIX kernel supports multiplexing, allowing
up to 32 events to be counted at once. Don't take fine-grained measurements when
multiplexing, unless you know what you're doing.
10. What's multiplexing?
See Question 9.
11. My Sun box doesn't have libcpc.h?
You didn't check the Platform Matrix at http://icl.cs.utk.edu/papi/software/platforms.ssi.
The hardware counters on SunOS with UltraSparc are only available on SunOS 5.8 and above.
That's Solaris 2.8 for you SVR4 people.
12. What about a port to IA-64?
Not yet, but we are working on one. If you have an IA-64 box and want to test the
IA-64 port, send email to ptools-perfapi@ptools.org.
13. What is needed to use PAPI?
See the latest Platform page at http://icl.cs.utk.edu/papi/software/platforms.ssi.
If you have a question that you think should be added here, send it to ptools-perfapi@ptools.org.
14. What tools are available for PAPI?
See the latest Reference page at http://icl.cs.utk.edu/projects/documents/.
If you have a tool to be posted, send it to the mailing list.
15. Why is there more than one patch for Linux?
There are numerous patches designed to provide access to the Intel CPU performance
counters. As PAPI began, we used the original Beowulf patch (perf) by David Hendriks.
However, as PAPI progressed, we needed some additional features, which I graciously added.
This patch used a system call approach and has proven to be exceedingly stable. Yes, no
crashes reported. I knew that there was a better way to design a performance
counter kernel patch, one that used mmap() to provide direct access to the virtual counts.
Mikael Pettersson provided me with exactly that in the form of the perfctr patch. It is
also very, very stable. It can be found at http://www.docs.uu.se/~mikpe/linux/perfctr.
If you're starting with PAPI for the first time, we recommend the perfctr patch as
included in the PAPI source distribution.
16. How does PAPI handle threads?
Currently, PAPI supports thread-level measurements only with kernel or bound threads.
Each thread must manipulate and read its own counters. When a thread is created, it
inherits no PAPI information from the calling thread. Support for this kind of
functionality is being implemented.
17. How does PAPI handle fork/exec?
When a process is created, it inherits no PAPI information from its parent process.
Support for this kind of functionality is being implemented.
18. What events does PAPI track?
PAPI only tracks 'hardware events', the occurrence of signals onboard the
microprocessor. It does not count system calls, software interrupts or other software
events. The user should remember that by default, PAPI only measures events that occur in
User Space.
19. Why am I still getting PAPI_ECNFLCT when using multiplexing?
PAPI currently uses one hardware counter for Total Cycles in the multiplexing code. If
you are trying to multiplex a derived event on hardware with only two physical counters
then you will get the PAPI_ECNFLCT error. This happens on the Intel Pentiums for example.
This will be fixed in a future release on systems that have a high resolution virtual timer.