


PAPI is an acronym for Performance Application Programming Interface. The PAPI Project is being developed at the University of Tennessee’s Innovative Computing Laboratory in the Computer Science Department. This project was created to design, standardize, and implement a portable and efficient API (Application Programming Interface) to access the hardware performance counters found on most modern microprocessors.


Hardware counters exist on every major processor family today, including the Intel Pentium, Core, and IA-64, the AMD Opteron, and the IBM POWER series. These counters give performance tool developers a basis for tool development, and they give application developers valuable information about sections of their code that can be improved. However, only a few APIs allow access to these counters, and many of them are poorly documented, unstable, or unavailable. In addition, performance metrics may have different definitions and different programming interfaces on different platforms.

These considerations motivated the development of the PAPI Project. Some goals of the PAPI Project are as follows:

To provide a solid foundation for cross-platform performance analysis tools

To present a set of standard definitions for performance metrics on all platforms

To provide a standardized API among users, vendors, and academics

To be easy to use, well documented, and freely available


The figure below shows the internal design of the PAPI architecture. In this figure, we can see the two layers of the architecture:

The Portable Layer consists of the API (low-level and high-level) and machine-independent support functions.

The Machine Specific Layer defines and exports a machine-independent interface to machine-dependent functions and data structures. These functions are defined in the substrate layer, which uses kernel extensions, operating system calls, or assembly language to access the hardware performance counters. PAPI uses the most efficient and flexible of the three, depending on what is available.
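
The two-layer split described above can be sketched in C as a table of function pointers that a platform backend fills in and that portable code calls through. This is only an illustration of the general pattern, not PAPI's actual internals: the names `substrate_ops`, `fake_substrate`, `portable_init`, and `portable_measure` are hypothetical, and the "fake" backend returns made-up values instead of reading real hardware counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical machine-specific interface: each platform backend
 * would supply its own implementations of these functions. */
typedef struct {
    const char *name;
    int (*start)(void);
    int (*stop)(void);
    long long (*read_counter)(int index);
} substrate_ops;

/* Toy backend standing in for kernel extensions, system calls, or
 * assembly. It returns fabricated values, not real counter readings. */
static int fake_start(void) { return 0; }
static int fake_stop(void) { return 0; }
static long long fake_read_counter(int index) { return 1000LL * (index + 1); }

const substrate_ops fake_substrate = {
    "fake", fake_start, fake_stop, fake_read_counter
};

/* Portable layer: knows only the interface, never the backend details. */
static const substrate_ops *active_substrate = NULL;

/* Select a backend. Returns 0 on success, -1 if the table is incomplete. */
int portable_init(const substrate_ops *ops)
{
    if (ops == NULL || ops->start == NULL ||
        ops->stop == NULL || ops->read_counter == NULL) {
        return -1;
    }
    active_substrate = ops;
    return 0;
}

/* Start counting, read one counter, and stop.
 * Returns 0 on success, -1 if no backend is selected or start fails. */
int portable_measure(int index, long long *out)
{
    assert(out != NULL);
    if (active_substrate == NULL) {
        return -1;
    }
    if (active_substrate->start() != 0) {
        return -1;
    }
    *out = active_substrate->read_counter(index);
    return active_substrate->stop();
}
```

A caller would invoke `portable_init(&fake_substrate)` once and then `portable_measure(0, &value)`. Swapping in a different backend would change only the table passed to `portable_init`, which is the property the layered design aims for.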

PAPI strives to provide a uniform environment across platforms. However, this is not always possible. Where the hardware does not support features such as overflows and multiplexing, PAPI implements them in software where possible. Also, processors do not all support the same metrics, so the events you can monitor depend on the processor in use. The interface therefore remains constant, but how it is implemented can vary. Throughout this guide, implementation decisions are documented where they can make a difference to the user, for example in overhead costs and sampling.
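
To make the cost of a software fallback concrete, here is a minimal sketch of how a time-sliced (multiplexed) counter is commonly turned into an estimate for the full run. It assumes the event was counted for only part of the measurement interval and scales the raw count by the ratio of total time to active time. The function name `scale_multiplexed_count` and its parameters are hypothetical, and this is not claimed to be the exact computation PAPI performs.

```c
#include <assert.h>

/* Estimate the count an event would have reached had it been counted
 * for the whole interval, given that it was active for only part of it.
 *
 *   raw_count  events observed while the counter was active
 *   active_ns  time the counter was actually counting this event
 *   total_ns   total length of the measurement interval
 *
 * Returns the rounded estimate, or -1 if no estimate can be made
 * (negative count, or the event was never scheduled). The arithmetic
 * uses double, so very large counts lose some precision. */
long long scale_multiplexed_count(long long raw_count,
                                  long long active_ns,
                                  long long total_ns)
{
    /* Active time exceeding total time indicates a caller bug. */
    assert(active_ns <= total_ns);

    if (raw_count < 0 || active_ns <= 0) {
        return -1;
    }
    double estimate = (double)raw_count * (double)total_ns / (double)active_ns;
    return (long long)(estimate + 0.5);
}
```

For example, an event that counted 500 occurrences while active for 250 ns of a 1000 ns interval is estimated at 2000 occurrences. The result is an extrapolation, so it is only as good as the assumption that the event rate was similar while the counter was not active.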