PAPI Frequently Asked Questions

last edited: 9/3/02

This is the PAPI FAQ or Frequently Asked Questions List. The user should look here for information on PAPI that is not otherwise cataloged elsewhere on the site.

1.  What are the mailing lists and how do I subscribe?

There are currently two mailing lists: ptools-perfapi@ptools.org, a group for general announcements, questions and miscellaneous topics, and perfapi-devel@ptools.org, a discussion group for the developers of PAPI. The latter also receives all CVS update messages, which can be a significant amount of mail!

To subscribe to either of the above groups, or to have your address removed from a list, send a message containing one of the lines:

subscribe listname
unsubscribe listname

in the body of your message to majordomo@ptools.org, replacing listname with either ptools-perfapi or perfapi-devel.

2.  How do I stop PAPI_overflow, PAPI_profil or PAPI_sprofil?

After calling PAPI_stop, call the same function again with the handler or buffer set to NULL and the threshold set to 0.

3.  Why do PAPI_overflow, PAPI_profil and PAPI_sprofil work strangely with a small threshold?

On most systems, PAPI must emulate overflow in software. Only on the UltraSparc III and IRIX does the operating system support a true interrupt on overflow. On most platforms the user is therefore advised to make sure the overflow value is no more than 1/1000th the clock rate. The emulation handler in PAPI runs every millisecond, so the goal of the tool designer should be to pick a value that will overflow frequently, but not too frequently. Not following these guidelines could result in either overflows never occurring, or overflows occurring on every interrupt and thus producing a flat profile.

4.  Why can't I get my Fortran programs to compile with PAPI on a Cray T3E?

The Fortran header file you include has to be preprocessed before the Fortran file can use it. To have cpp process the file before it is passed to the compiler, add the -F flag. For example:

f90 -F test.F -o test

5. How do I encode a native event?

Unless otherwise stated in the README file for your platform, the encoding is as follows:

event = ((reg_code & 0xffffff) << 8 | (reg_num & 0xff))

6. What's wrong with PAPI_LST_INS (hex code 0x43) on my Pentium?

According to the Intel documentation, the counts from this event do not intuitively match its description. Older releases of PAPI had this preset available in the Intel ports, but it is no longer provided. It does appear to work on the AMD Athlon.

7. Does PAPI support unbound or non-kernel threads?

Yes, but the counts will reflect the total events for the process. Measurements taken in different threads will all return the same values, namely the counts for the whole process. For non-bound threads, it is not necessary to call PAPI_thread_init. In most scenarios, however, such as SMP or OpenMP compiler directives, bound threads will be the default. Those using Pthreads should take care to set the contention scope attribute of each thread to PTHREAD_SCOPE_SYSTEM, unless the system is known to have a non-hybrid thread library implementation, as Linux does.

8. The numbers are funky for event 0xabc on platform XYZ, help me!

This is not a question, but I'll help you. We, the PAPI developers, cannot be experts on the thousands of events found across all supported platforms. However, if you are using a PAPI preset, the first thing to do is to look up the corresponding native event code using the test case 'avail'. The best bet is then to go to the vendor's technical documentation site and check the processor reference manual. If you're convinced everything is kosher, please feel free to send a message to the mailing list, and one of the members may be able to help you.

9. My program runs fine with 1 or 2 counters, but when I add more I get a -8, PAPI_ECNFLCT error code. The error text says, "Event exists, but cannot be counted due to hardware resource limitations". What does this mean?

Many systems have only a few hardware performance counter registers, so you can only measure a few metrics at once. Some platforms support counter multiplexing, which gives the user the illusion of a larger number of registers by time-sharing the performance registers. On the R10K series, the IRIX kernel supports multiplexing, allowing up to 32 events to be counted at once. Don't take fine-grained measurements when multiplexing unless you know what you're doing.

10. What's multiplexing?

See Question 9.

11. My Sun box doesn't have libcpc.h?

You didn't check the Platform Matrix at http://icl.cs.utk.edu/papi/software/platforms.ssi. The hardware counters on SunOS with UltraSparc are only available on SunOS 5.8 and above. That's Solaris 2.8 for you SVR4 people.

12. What about a port to IA-64?

Not yet, but we are working on one. If you have an IA-64 box and want to test the IA-64 port, send email to ptools-perfapi@ptools.org.

13. What is needed to use PAPI?

See the latest Platform page at http://icl.cs.utk.edu/papi/software/platforms.ssi.

If you have a question that you think should be added here, send it to ptools-perfapi@ptools.org.

14. What tools are available for PAPI?

See the latest Reference page at http://icl.cs.utk.edu/projects/documents/. If you have a tool to be posted, send it to the mailing list.

15. Why is there more than one patch for Linux?

There are numerous patches designed to provide access to the Intel CPU performance counters. When PAPI began, we used the original Beowulf patch (perf) by David Hendriks. However, as PAPI progressed, we needed some additional features, which I graciously added. This patch used a system call approach and has proven to be exceedingly stable. Yes, no crashes reported. I knew that there was a better way to design a performance counter kernel patch, one that used mmap() to provide direct access to the virtual counts. Mikael Pettersson provided me with exactly that in the form of the perfctr patch. It is also very, very stable. It can be found at http://www.docs.uu.se/~mikpe/linux/perfctr. If you're starting with PAPI for the first time, we recommend the perfctr patch as included in the PAPI source distribution.

16. How does PAPI handle threads?

Currently, PAPI only supports thread level measurements with kernel or bound threads. Each thread must manipulate and read its own counters. When a thread is created, it inherits no PAPI information from the calling thread. Support for this kind of functionality is being implemented.

17. How does PAPI handle fork/exec?

When a process is created, it inherits no PAPI information from the calling process. Support for this kind of functionality is being implemented.

18. What events does PAPI track?

PAPI only tracks 'hardware events', the occurrence of signals onboard the microprocessor. It does not count system calls, software interrupts or other software events. The user should remember that by default, PAPI only measures events that occur in User Space.

19. Why am I still getting PAPI_ECNFLCT when using multiplexing?

PAPI currently uses one hardware counter for Total Cycles in the multiplexing code. If you are trying to multiplex a derived event on hardware with only two physical counters then you will get the PAPI_ECNFLCT error. This happens on the Intel Pentiums for example. This will be fixed in a future release on systems that have a high resolution virtual timer.


Contact PAPI: papi@cs.utk.edu
Innovative Computing Laboratory, Computer Science Department, University of Tennessee
2001 R&D Winner