

Statistical profiling involves periodically interrupting a running program and examining the program counter (PC) at the time of each interrupt. If this is done over a reasonable number of interrupt intervals, the resulting program counter distribution is statistically representative of the program's execution profile with respect to the interrupting event. Performance tools like UNIX prof sample the program address with respect to time and hash the value into a histogram. At program completion, the histogram is analyzed and associated with the symbolic information contained in the executable. GNU prof, in conjunction with the -p option of the GCC compiler, performs exactly this analysis using process time as the interrupt trigger. PAPI aims to generalize this functionality so that a histogram can be generated using any countable hardware event as the basis for the interrupt signal.
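The sample-and-bin step described above can be sketched in plain C without any PAPI calls. This is only an illustration of the idea, not PAPI's implementation: the type `pc_hist_t`, the functions `pc_hist_record` and `pc_hist_hottest`, and the fixed number of bytes per bin are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical histogram covering [base, base + nbins * bin_bytes). */
typedef struct {
    uintptr_t base;       /* lowest address covered */
    size_t bin_bytes;     /* bytes of code per bin (must be non-zero) */
    size_t nbins;         /* number of entries in bins */
    unsigned int *bins;   /* caller-provided, zero-initialized */
} pc_hist_t;

/* Record one sampled PC. Returns 1 if it fell inside the range, 0 otherwise. */
static int pc_hist_record(pc_hist_t *h, uintptr_t pc)
{
    if (pc < h->base)
        return 0;
    size_t idx = (size_t)(pc - h->base) / h->bin_bytes;
    if (idx >= h->nbins)
        return 0;
    h->bins[idx]++;
    return 1;
}

/* Index of the bin holding the most samples (the first one on ties). */
static size_t pc_hist_hottest(const pc_hist_t *h)
{
    size_t best = 0;
    for (size_t i = 1; i < h->nbins; i++)
        if (h->bins[i] > h->bins[best])
            best = i;
    return best;
}
```

A sampling handler would call `pc_hist_record` once per interrupt. After the run, the hottest bin index can be converted back to an address range (`base + idx * bin_bytes`) and matched against the executable's symbol table.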


A PC histogram can be generated on any countable event by calling either of the following low-level functions:

C:
    PAPI_profil(buf, bufsiz, offset, scale, EventSet, EventCode, threshold, flags)
    PAPI_sprofil(prof, profcnt, EventSet, EventCode, threshold, flags)

Fortran:
    PAPI_profil(buf, bufsiz, offset, scale, EventSet, EventCode, threshold, flags, check)


  • buf -- pointer to the profile buffer array.
  • bufsiz -- number of entries in *buf.
  • offset -- starting value of the lowest memory address to profile.
  • scale -- scaling factor for bin values.
  • EventSet -- the PAPI EventSet to profile when it is started.
  • EventCode -- code of the Event in the EventSet to profile.
  • threshold -- threshold value for the Event that triggers the handler.
  • flags -- bit pattern to control profiling behavior.

The defined bit values for the flags variable are shown in the table below:

Defined bit              Description
PAPI_PROFIL_POSIX        Default type of profiling.
PAPI_PROFIL_RANDOM       Drop a random 25% of the samples.
PAPI_PROFIL_WEIGHTED     Weight the samples by their value.
PAPI_PROFIL_COMPRESS     Ignore samples if hash buckets get big.
PAPI_PROFIL_BUCKET_16    Save samples in 16-bit hash buckets.
PAPI_PROFIL_BUCKET_32    Save samples in 32-bit hash buckets.
PAPI_PROFIL_BUCKET_64    Save samples in 64-bit hash buckets.
PAPI_PROFIL_FORCE_SW     Force software overflow in profiling.

  • prof -- pointer to a PAPI_sprofil_t structure.
  • profcnt -- number of buffers for hardware profiling (reserved).
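How offset, scale, and threshold interact can be illustrated with a small sketch. It assumes the classic profil(3) convention as implemented in glibc, where scale / 65536 is treated as a fraction applied to half of the byte distance from offset. PAPI's exact mapping may differ, so the PAPI_profil man page remains the authority. The function names `profil_bucket` and `estimated_events` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed profil(3)-style mapping:
 *     bucket = ((pc - offset) / 2) * scale / 65536
 * Returns the bucket index, or (size_t)-1 if pc maps outside the buffer. */
static size_t profil_bucket(uintptr_t pc, uintptr_t offset,
                            unsigned int scale, size_t nbuckets)
{
    if (pc < offset)
        return (size_t)-1;
    uint64_t half = (uint64_t)(pc - offset) / 2;
    uint64_t idx = half * scale / 65536;
    return idx < nbuckets ? (size_t)idx : (size_t)-1;
}

/* Each histogram increment corresponds to one overflow, that is, roughly
 * `threshold` occurrences of the profiled event. */
static unsigned long long estimated_events(unsigned long long bucket_count,
                                           unsigned long long threshold)
{
    return bucket_count * threshold;
}
```

Under this assumed convention, a scale of 65536 gives each bucket two bytes of code, and halving the scale doubles the address range per bucket. A bucket count of 3 with a threshold of 1000000 suggests about three million events attributed to that range.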

PAPI_profil creates a histogram of overflow counts for a specified region of the application code. It uses its first four parameters to build the data structures needed by PAPI_sprofil and then calls PAPI_sprofil to do the work. PAPI_sprofil assumes a pre-initialized PAPI_sprofil_t structure and enables profiling for the EventSet based on its values. Note that the EventSet must be in the stopped state for either call to succeed.

More than one hardware event can be profiled at the same time by making multiple independent calls to these functions for the same EventSet before calling PAPI_start. This is useful for generating simultaneous profiles of two or more related events, for example L1 cache misses and L2 cache misses. Profiling can be turned off for a specific event by calling the function for that event with a threshold of zero.

On success, these functions return PAPI_OK; on error, they return a non-zero error code.

For more code examples, see profile.c, profile_twoevents.c, or sprofile.c in the ctests directory of the PAPI source distribution. For a more extensive description of the parameters in the PAPI_profil call, see the PAPI_profil man page or its HTML counterpart.

In the following code example, PAPI_profil is used to generate a PC histogram:

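    #include <papi.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Minimal error handler assumed for this example: report the code and exit. */
    static void handle_error(int retval)
    {
        fprintf(stderr, "PAPI error %d\n", retval);
        exit(1);
    }

    int main(void)
    {
        int retval;
        int EventSet = PAPI_NULL;
        unsigned long start, end, length;
        const PAPI_exe_info_t *prginfo;
        unsigned short *profbuf;

        /* Initialize the PAPI library */
        retval = PAPI_library_init(PAPI_VER_CURRENT);
        if (retval != PAPI_VER_CURRENT && retval > 0) {
            fprintf(stderr, "PAPI library version mismatch!\n");
            exit(1);
        }
        if (retval < 0)
            handle_error(retval);

        /* Look up the text segment of this executable */
        if ((prginfo = PAPI_get_executable_info()) == NULL)
            handle_error(1);

        /* Field names follow this document; some PAPI releases nest them
           under an address_info member. */
        start = (unsigned long)prginfo->text_start;
        end = (unsigned long)prginfo->text_end;
        length = end - start;

        /* Allocate and zero the histogram buffer */
        profbuf = (unsigned short *)malloc(length * sizeof(unsigned short));
        if (profbuf == NULL)
            handle_error(1);
        memset(profbuf, 0x00, length * sizeof(unsigned short));

        if ((retval = PAPI_create_eventset(&EventSet)) != PAPI_OK)
            handle_error(retval);

        /* Add Total FP Instructions Executed to our EventSet */
        if ((retval = PAPI_add_event(EventSet, PAPI_FP_INS)) != PAPI_OK)
            handle_error(retval);

        /* Profile PAPI_FP_INS with a threshold of 1000000 and 16-bit buckets */
        if ((retval = PAPI_profil(profbuf, length, start, 65536, EventSet,
                                  PAPI_FP_INS, 1000000,
                                  PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16)) != PAPI_OK)
            handle_error(retval);

        /* Start counting */
        if ((retval = PAPI_start(EventSet)) != PAPI_OK)
            handle_error(retval);
    }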