Introduction to PAPI-C

Component PAPI

Introduction

Component PAPI, or PAPI-C, makes the PAPI performance monitoring programming interface available for more than just the hardware performance counters found on the cpu. Performance counters are finding their way into a number of other components of high-performance computing systems, such as network or memory controllers, power or temperature monitors, or even specialized processing units that may appear in future multicore processor implementations.

The primary technical challenge for PAPI is to sever the very tight coupling between the hardware-independent layer of PAPI code and the hardware-specific code necessary to interface with the counters, and to do this without sacrificing performance. Secondarily, once these two code layers have been functionally separated, the hardware-independent, or Framework, layer must be modified to simultaneously support multiple hardware-dependent substrate layers, or Components.

These changes cannot be accomplished without some modification of the PAPI user interface. We have tried to keep these modifications minimal and transparent, and have preserved most backward compatibility for applications and tools that just want access to the cpu counters. We have introduced a small number of new APIs and functionality to support the new abstraction of multiple components. We have also modified the function of some APIs and data structures to support a multi-Component landscape. These changes are tabulated at the bottom of this document, and are discussed below.

This is the inaugural release of PAPI-C. Components (besides the cpu component) currently available include an ACPI component for monitoring temperature where available; a Myrinet MX component; an LM-SENSORS component to monitor a wide variety of system health measurements; and a 'toy' component that monitors network traffic as reported by the Linux/Unix /sbin/ifconfig utility.

API and Abstraction Changes

EventSets

One of the key organizing data structures in PAPI is the EventSet. This serves as a repository for all the events and settings necessary to define a counting regime. EventSets are created, modified, added to, deleted from, and disposed of over the life of a PAPI counting session. In traditional PAPI, multiple EventSets can exist simultaneously, but only one can be active at any time. PAPI-C extends the concept of an EventSet by binding it to a specific numbered Component. This component index identifies the Component the EventSet is paired with. Multiple EventSets can be defined and active simultaneously, but only one EventSet per Component can be active at any time.

We have adopted a late-binding model for associating an EventSet with a Component. No changes are needed in the API call for creating an EventSet; the EventSet is bound to a Component when the first event is added. Any additional events must then belong to the same Component. Occasionally it is desirable to modify settings in an EventSet before an event is added. In this case, a new API, PAPI_assign_eventset_component(), has been introduced to make this binding explicit.
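The late-binding rule can be illustrated with a small toy model. This is only a sketch of the behavior described above, not PAPI code: every name with an es_ or ES_ prefix, the error values, and the fixed event capacity are hypothetical and were chosen for illustration.

```c
/* Toy model of late binding. All names and values here are hypothetical. */
#define ES_OK          0
#define ES_ECNFLCT    -1   /* event belongs to a different component */
#define ES_ENOCMP     -2   /* EventSet is not bound to a component yet */
#define ES_EFULL      -3   /* no room for more events */
#define ES_UNBOUND    -1   /* marker stored in cidx before binding */
#define ES_MAX_EVENTS  8

typedef struct {
    int cidx;                      /* bound component, or ES_UNBOUND */
    int domain;                    /* a per-EventSet setting */
    int n_events;
    int events[ES_MAX_EVENTS];
} es_eventset_t;

/* Creating an EventSet does not bind it to any component. */
static void es_create(es_eventset_t *es)
{
    es->cidx = ES_UNBOUND;
    es->domain = 0;
    es->n_events = 0;
}

/* Explicit binding, needed before settings are changed on an empty set. */
static int es_assign_component(es_eventset_t *es, int cidx)
{
    if (cidx < 0)
        return ES_ENOCMP;
    if (es->cidx != ES_UNBOUND && es->cidx != cidx)
        return ES_ECNFLCT;
    es->cidx = cidx;
    return ES_OK;
}

/* The first added event binds the set; later events must match. */
static int es_add_event(es_eventset_t *es, int event_cidx, int event_id)
{
    int rc;
    if (es->n_events >= ES_MAX_EVENTS)
        return ES_EFULL;
    rc = es_assign_component(es, event_cidx);
    if (rc != ES_OK)
        return rc;
    es->events[es->n_events++] = event_id;
    return ES_OK;
}

/* Settings can only be changed once the set is bound to a component. */
static int es_set_domain(es_eventset_t *es, int domain)
{
    if (es->cidx == ES_UNBOUND)
        return ES_ENOCMP;
    es->domain = domain;
    return ES_OK;
}
```

In this model, calling es_set_domain() on a freshly created set fails until either es_assign_component() or a first es_add_event() has bound it, which mirrors the reason the explicit binding call exists.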

Events

For now, PAPI Preset events are only defined for the cpu component, which by convention is always component 0. Since these event names and codes are available directly in papi.h, they will continue to work with no modifications. Event codes for other components are always mapped to native events available on that component and are bound to the component with a 4-bit component ID field embedded in the event code itself. These codes cannot be determined a priori, since they are opaque IDs used only by PAPI. They must be obtained by a call to PAPI_event_name_to_code(), which will search all available native event tables and return a properly encoded value if the event exists. As described above, the first event added binds an EventSet to a Component; all subsequently added events must belong to the same Component.
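As an illustration of how a 4-bit component field can be embedded in an integer event code, the sketch below packs and unpacks such a field. The bit position (EC_CIDX_SHIFT) and all ec_ names are assumptions made for this example; PAPI treats event codes as opaque, so applications should still obtain them through PAPI_event_name_to_code() rather than constructing them.

```c
#include <stdint.h>

/* Assumed layout for illustration only; real event codes are opaque. */
#define EC_CIDX_BITS   4u
#define EC_CIDX_SHIFT  26u                              /* hypothetical position */
#define EC_CIDX_MAX    ((1u << EC_CIDX_BITS) - 1u)      /* 15: sixteen components */
#define EC_CIDX_MASK   (EC_CIDX_MAX << EC_CIDX_SHIFT)

/* Embed a component index in an event number.
 * Returns 0 on success, -1 if the index does not fit in 4 bits
 * or the event number already uses the component field. */
static int ec_encode(uint32_t event_number, unsigned cidx, uint32_t *code_out)
{
    if (cidx > EC_CIDX_MAX)
        return -1;
    if (event_number & EC_CIDX_MASK)
        return -1;
    *code_out = event_number | ((uint32_t)cidx << EC_CIDX_SHIFT);
    return 0;
}

/* Extract the component index from an encoded event code. */
static unsigned ec_component(uint32_t code)
{
    return (unsigned)((code & EC_CIDX_MASK) >> EC_CIDX_SHIFT);
}

/* Strip the component field, leaving the component-local event number. */
static uint32_t ec_event_number(uint32_t code)
{
    return code & ~EC_CIDX_MASK;
}
```

Unlike the library as described in this document, the sketch rejects an index above 15, since a 4-bit field cannot represent it.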

Component Housekeeping

A number of changes were made to support various housekeeping chores associated with multiple Components. A new API, PAPI_num_components(), was added to provide the number of active components in the current library. Also, PAPI_get_component_info() replaces PAPI_get_substrate_info() and provides detailed information for a given component. As mentioned above, since the cpu component is always assumed to exist, it is always assigned as component 0. In addition, component 0 is always relied on to provide the high-resolution timer functionality behind the following APIs: PAPI_get_real_cyc(), PAPI_get_virt_cyc(), PAPI_get_real_usec(), and PAPI_get_virt_usec(). One call, PAPI_num_hwctrs(), still functions as it did in traditional PAPI to provide the number of physical cpu counters. It has been augmented by the new PAPI_num_cmp_hwctrs(), which provides the number of counters for a specified component.
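The relationship between these housekeeping calls can be sketched with a toy component table. This is not the PAPI implementation: the reg_ names, the structure fields, the error value, and the counter numbers are all invented for illustration. The only behavior taken from this document is that component 0 is the cpu component, that an out-of-range index yields NULL, and that the index-free call is shorthand for component 0.

```c
#include <stddef.h>

#define REG_ENOCMP -1   /* hypothetical "bad component index" value */

/* Hypothetical, much reduced stand-in for a component info structure. */
typedef struct {
    const char *name;
    int num_counters;
} reg_component_info_t;

/* Made-up table; index 0 is the cpu component by convention. */
static const reg_component_info_t reg_components[] = {
    { "cpu",  4 },
    { "acpi", 1 },
    { "net",  2 },
};

/* Number of components compiled into this toy "library". */
static int reg_num_components(void)
{
    return (int)(sizeof reg_components / sizeof reg_components[0]);
}

/* Returns the info record for a valid index, NULL if out of range. */
static const reg_component_info_t *reg_get_component_info(int cidx)
{
    if (cidx < 0 || cidx >= reg_num_components())
        return NULL;
    return &reg_components[cidx];
}

/* Counter count for one component. */
static int reg_num_cmp_hwctrs(int cidx)
{
    const reg_component_info_t *info = reg_get_component_info(cidx);
    return info ? info->num_counters : REG_ENOCMP;
}

/* Backward-compatible form: implicitly refers to component 0. */
static int reg_num_hwctrs(void)
{
    return reg_num_cmp_hwctrs(0);
}
```

A caller would typically loop from 0 to reg_num_components() - 1 and query each index in turn, checking for NULL before dereferencing the result.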

PAPI Options

The bulk of the visible changes in PAPI-C have occurred in the general area of setting and getting option values. Options can be either system-wide or component-specific. This didn't matter in traditional PAPI with only one component. Now it does. In order to preserve backward compatibility with code that only accesses the cpu component, the PAPI_get_opt() and PAPI_set_opt() calls behave as before, with an implicit component index of 0 for those options that are bound to a component. For those options that are component-specific, PAPI_get_cmp_opt() and PAPI_set_cmp_opt() take an additional component index argument. Further, two new convenience functions, PAPI_set_cmp_domain() and PAPI_set_cmp_granularity(), have been added for component-specific setting of these options. More subtly, two of the cases handled by PAPI_set_opt() now have additional information included in the passed data structures. Both the PAPI_DEFDOM and PAPI_DEFGRN cases now require a component index to be provided in the passed data structure, since available domains are component-dependent and may differ widely between cpu domains and, for example, network domains.
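The backward-compatibility pattern described here, where the older call forwards to the component-aware call with an index of 0, can be sketched as follows. This is a toy model under stated assumptions: the opt_ names, the OPT_ENOCMP value, the fixed component count, and the two-field structure are hypothetical. Only the def_cidx field name comes from this document.

```c
#define OPT_OK              0
#define OPT_ENOCMP         -1   /* hypothetical "bad component index" value */
#define OPT_NUM_COMPONENTS  3   /* made-up number of components */

/* Hypothetical reduced option structure; def_cidx selects the component. */
typedef struct {
    int def_cidx;
    int domain;
} opt_domain_option_t;

/* One default domain per component; zero-initialized at start-up. */
static int opt_default_domain[OPT_NUM_COMPONENTS];

/* Component-aware setter: validates the index before storing. */
static int opt_set_cmp_domain(int domain, int cidx)
{
    if (cidx < 0 || cidx >= OPT_NUM_COMPONENTS)
        return OPT_ENOCMP;
    opt_default_domain[cidx] = domain;
    return OPT_OK;
}

/* Legacy setter: keeps its old signature and implies component 0. */
static int opt_set_domain(int domain)
{
    return opt_set_cmp_domain(domain, 0);
}

/* Structure-based setter: the component index travels in def_cidx. */
static int opt_set_defdom(const opt_domain_option_t *opt)
{
    return opt_set_cmp_domain(opt->domain, opt->def_cidx);
}
```

The design choice shown is that existing callers keep compiling and keep addressing the cpu component, while new callers name a component either through an extra argument or through a field in the option structure.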

Building and Linking

Configuring Components

There are very few visible changes in the build environment. As before, cpu components are automatically detected by configure and included in the build. As new components are added, each is supported by a
--with-<cmp>=yes
option on the configure command line. Currently supported component options include:
--with-acpi=yes
--with-mx=yes
--with-net=yes

It is intended that in the future, where possible, component support will be autodetected by configure in a fashion similar to cpu architectures and automatically included in the make.

The make process currently compiles and links the sources for all requested components into a single binary. This process is automatic and transparent once the components are specified in the configure step. It is intended that future releases will make each component independently and allow for dynamic component loading at runtime.

Downloading PAPI-C from CVS

With the release of PAPI 3.7, the roles of PAPI Classic and PAPI-C have been reversed. PAPI-C is now in the HEAD branch of the CVS repository and PAPI Classic is now in the papi-3-7-0 branch. To obtain the source for PAPI-C, do the following:

cvs -d :pserver:anonymous@cvs.eecs.utk.edu:/cvs/homes/papi co papi

Downloading PAPI Classic is just a bit more involved:

cvs -d :pserver:anonymous@cvs.eecs.utk.edu:/cvs/homes/papi co -r papi-3-7-0 papi

Application Changes

Very few changes are needed to run existing PAPI-enabled applications under PAPI-C. The discussion below highlights the changes we found necessary in porting our test applications to the modified API:

  • Any calls to PAPI_get_substrate_info() must be converted to calls to PAPI_get_component_info() with a corresponding change in the type of the returned data structure. Correspondingly, calls to PAPI_get_opt() with the PAPI_SUBSTRATEINFO option should be changed to use the PAPI_COMPONENTINFO option, or to PAPI_get_cmp_opt() with PAPI_COMPONENTINFO and a component index of 0.
  • If an application creates an EventSet and then tries to set the domain or the multiplex options before adding events, the call will fail with an error. The fix is to call PAPI_assign_eventset_component() with the desired component prior to setting options.
  • Calls to PAPI_set_opt() with either PAPI_DEFDOM or PAPI_DEFGRN options must set the def_cidx field in the passed data structure.

Summary of Changes

New APIs:

  • const PAPI_component_info_t *PAPI_get_component_info(int cidx)
    • given a valid index, returns a component info structure as defined in papi.h
    • returns NULL if out of range.
    • Replaces PAPI_get_substrate_info()
  • int PAPI_num_components( void )
  • int PAPI_assign_eventset_component(int EventSet, int cidx)
    • Explicitly bind an eventset to a component before events are added.
    • Occasionally needed prior to manipulating eventset parameters like domain or multiplexing.
  • int PAPI_set_cmp_domain(int domain, int cidx)
  • int PAPI_set_cmp_granularity(int granularity, int cidx)
  • int PAPI_num_cmp_hwctrs(int cidx)
  • int PAPI_get_cmp_opt(int option, PAPI_option_t * ptr, int cidx)
    • Handles options that explicitly require a component index:
      • PAPI_DEF_MPX_USEC (shouldn't this one be system level?)
      • PAPI_MAX_HWCTRS
      • PAPI_MAX_MPX_CTRS
      • PAPI_DEFDOM
      • PAPI_DEFGRN
      • PAPI_SHLIBINFO (shouldn't this one be system level?)
      • PAPI_COMPONENTINFO

Modified API Functionality:

  • int PAPI_enum_event(int *EventCode, int modifier)
    • Parses EventCode for component index.
    • Enumerates only across component specified in EventCode
  • int PAPI_create_eventset(int *EventSet)
    • Eventsets are bound to components. This is ordinarily a late-binding process that occurs when an event is added.
  • int PAPI_set_domain(int domain)
    • Implicitly sets domain of component 0; deprecated - maintained for backward compatibility.
  • int PAPI_set_granularity(int granularity)
    • Implicitly sets granularity of component 0; deprecated - maintained for backward compatibility.
  • int PAPI_set_opt(int option, PAPI_option_t * ptr)
    • The PAPI_DEFDOM and PAPI_DEFGRN options now include a mandatory component index field in the data structure
  • int PAPI_num_hwctrs(void)
    • Implicitly returns number of counters for component 0; deprecated - maintained for backward compatibility.
  • int PAPI_get_opt(int option, PAPI_option_t * ptr)
    • Behaves as before for options that don't require a component index;
    • Implicitly returns values for component 0 for the following options:
      • PAPI_DEF_MPX_USEC
      • PAPI_MAX_HWCTRS
      • PAPI_MAX_MPX_CTRS
      • PAPI_DEFDOM
      • PAPI_DEFGRN
      • PAPI_SHLIBINFO (shouldn't this one be system level?)
      • PAPI_COMPONENTINFO
  • The following 4 APIs always call the timer functions found in the cpu component (component 0):
    • PAPI_get_real_cyc()
    • PAPI_get_virt_cyc()
    • PAPI_get_real_usec()
    • PAPI_get_virt_usec()

Structural Changes:

  • Component 0 is always assumed to be the traditional cpu counter component.
  • Event codes now contain an embedded 4-bit COMPONENT_INDEX field to identify one of 16 components
    • No error checking is done yet to guarantee that there are no more than 16 components.
  • The PAPI_SUBSTRATEINFO case for PAPI_get_opt has been changed to PAPI_COMPONENTINFO
    • multiplex info was moved from the component level to a separate PAPI_mpx_info_t structure included at the hardware info level.
  • The PAPI_domain_option_t and PAPI_granularity_option_t structures now have a component index field, def_cidx, required when setting default domains or granularities.

New Error Messages:

  • PAPI_ENOINIT - PAPI hasn't been initialized yet
  • PAPI_ENOCMP - Component Index isn't set or out of range