Multiple issues with PAPI 4.1.2.1 on Pentium4 Xeon


Postby ce107 » Tue Mar 29, 2011 3:55 pm

Hi - I realize this is an antiquated system architecture and that PAPI has traditionally had trouble with event mappings on P4, but we happen to have a dormant cluster of dual-socket 32-bit Pentium4 Xeons that, until we get the funds to upgrade, will serve as a development/analysis/testing cluster.

I've got two resurrected machines (cam00 and cam01) that are for all intents and purposes identical (same software, almost the very same hardware) running Ubuntu 10.10 (2.6.35-28-generic) so the kernel provides performance counter support for Netburst machines. We would much rather stick to the stock Ubuntu kernels than patch them with perfctr. The perf utility works on both boxes in userland.

Building PAPI 4.1.2.1 went fine, the trouble starts after that:
    a) papi_avail output (papi_avail.out - attached - identical from both boxes) is missing several events that one would expect to be available, based on the output of papi_native_avail (papi_native_avail.out - attached - identical from both boxes). For example, PAPI seems to list all the native events for x87 and SSE FP, but only PAPI_FP_INS is available (not PAPI_FP_OPS).
    b) papi_event_chooser for presets seems to have trouble counting more than one event at a time for a lot of events (I did not try all):

    $ /usr/local/packages/papi/bin/papi_event_chooser PRESET PAPI_TLB_DM
    Event Chooser: Available events which can be added with given events.
    --------------------------------------------------------------------------------
    PAPI Version             : 4.1.2.1
    Vendor string and code   : GenuineIntel (1)
    Model string and code    : Intel(R) Xeon(TM) CPU 2.80GHz (2)
    CPU Revision             : 7.000000
    CPUID Info               : Family: 15  Model: 2  Stepping: 7
    CPU Megahertz            : 2799.945068
    CPU Clock Megahertz      : 2799
    Hdw Threads per core     : 2
    Cores per Socket         : 1
    CPU's per Node           : 4
    Total CPU's              : 4
    Number Hardware Counters : 18
    Max Multiplex Counters   : 512
    --------------------------------------------------------------------------------

        Name        Code    Deriv Description (Note)
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    -------------------------------------------------------------------------
    Total events reported: 0
    event_chooser.c                          PASSED

    The same problem appears when running the utility as root.
    c) While at build-time both systems passed the basic "make test" check, cam01 nowadays gives instead:

    # make test
    cd ctests; make CC="gcc" CC_R="gcc -pthread" CFLAGS="-I.. -g -DSTATIC_PAPI_EVENTS_TABLE -DPEINCLUDE=\"/usr/include/linux/perf_event.h\" -D_REENTRANT -D_GNU_SOURCE -Wall -DUSE_COMPILER_TLS  -I/root/build/papi-4.1.2.1/src/libpfm-3.y/include -DSUBSTRATE_USES_LIBPFM -Wextra -DPAPI_NO_MEMORY_MANAGEMENT" TOPTFLAGS="-O0" SMPCFLGS="" OMPCFLGS="-fopenmp" NOOPT="" LDFLAGS=" " LDL="-ldl" LIBRARY="../libpapi.a" papi_api serial forkexec_tests overflow_tests profile_tests attach multiplex_and_pthreads shared
    make[1]: Entering directory `/root/build/papi-4.1.2.1/src/ctests'
    make[1]: Nothing to be done for `papi_api'.
    make[1]: Nothing to be done for `serial'.
    make[1]: Nothing to be done for `forkexec_tests'.
    make[1]: Nothing to be done for `overflow_tests'.
    make[1]: Nothing to be done for `profile_tests'.
    make[1]: Nothing to be done for `attach'.
    make[1]: Nothing to be done for `multiplex_and_pthreads'.
    make[1]: Nothing to be done for `shared'.
    make[1]: Leaving directory `/root/build/papi-4.1.2.1/src/ctests'
    ctests/zero
    PAPI Error: sys_perf_event_open returned error on event #1.  Unix says, Operation not permitted.
    PAPI_FP_INS is not available.
    PAPI Error: Did not find id -1523775091 in the buffer!.
    PAPI Error: get_count_idx_by_id failed for event num 0, id -1523775091.
    zero.c                                   FAILED
    Line # 70
    Error in PAPI_stop: PAPI_ESBSTR

    Test case 0: start, stop.
    -----------------------------------------------
    Default domain is: 1 (PAPI_DOM_USER)
    Default granularity is: 1 (PAPI_GRN_THR)
    Using 20000000 iterations of c += a*b
    -------------------------------------------------------------------------
    Test type    :               1
    PAPI_FP_INS  :                0
    PAPI_TOT_CYC :     657129996304
    Real usec    :           200697
    Real cycles  :        561948228
    Virt usec    :           200507
    Virt cycles  :        561213495
    -------------------------------------------------------------------------
    Verification: none
    PAPI Error: Did not find id -1523775091 in the buffer!.
    PAPI Error: get_count_idx_by_id failed for event num 0, id -1523775091.
    make: *** [test] Error 1

    cam00 still works fine. The perf utility still works fine on both systems, and some of the PAPI tests and utilities still work on cam01. Even if this problem goes away with a reboot, it is disturbing nonetheless.
    d) Running the full test suite (run_tests.out.cam0[01].[01] - attached) gives quite a few errors. I've had to manually kill the following tests:
      kufrin
      multiplex2
      multiplex3_pthreads
    as they seemed to be stuck in an infinite loop, though I get the distinct impression that if I waited for an hour or more they would fail (this happened with multiplex3_pthreads). Please note that the cam00 test results (run_tests.cam00.0) are somewhat different from the initial cam01 results (run_tests.cam01.0, when the basic test was working) and even more different - as expected - from the recent cam01 results (run_tests.cam01.1).
    e) For some obscure reason the actual Fortran tests exercised on cam00 and cam01 are different - I have not put any energy into figuring out why, given the more basic issues I'm facing.
    f) I want to install PerfExpert from TACC on these systems - the installation went without a problem on Core2 systems. On the P4 systems, running the "sniffer" utility (attached alongside its source code driver.c) during the installation process produces a file lcpi.properties (attached - same output on cam00 and cam01) with derived metrics. Running it gives errors (sniffer.out - attached) as well as warnings related to the inability to track two counters at the same time (that's how I interpret the last two lines of output):
    ERROR: The following events are available but could not be added to the event set:
    PAPI_TOT_INS PAPI_TOT_CYC PAPI_L1_ICA PAPI_TLB_DM PAPI_TLB_IM PAPI_BR_INS PAPI_BR_MSP PAPI_FP_INS

    So this is a show-stopper, even on cam00.
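As a quick sanity check on the ctests/zero output in (c), the reported PAPI_TOT_CYC can be compared with the "Real cycles" value printed a few lines below it. For a single thread counted in the user domain, one would normally expect the counted cycles not to exceed the elapsed real cycles. The sketch below is only an illustration in Python (the helper name parse_counts is hypothetical, not part of PAPI); the numbers are copied from the output above.

```python
import re


def parse_counts(report: str) -> dict:
    """Collect 'NAME : integer' lines from ctests/zero output into a dict."""
    counts = {}
    for line in report.splitlines():
        m = re.match(r"\s*([A-Za-z_ ]+?)\s*:\s*(\d+)\s*$", line)
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts


# Values copied verbatim from the cam01 "make test" output in (c).
ZERO_OUTPUT = """\
Test type    :               1
PAPI_FP_INS  :                0
PAPI_TOT_CYC :     657129996304
Real usec    :           200697
Real cycles  :        561948228
Virt usec    :           200507
Virt cycles  :        561213495
"""

counts = parse_counts(ZERO_OUTPUT)
# Ratio of counted cycles to wall-clock cycles for the same run.
ratio = counts["PAPI_TOT_CYC"] / counts["Real cycles"]
print(f"PAPI_TOT_CYC / Real cycles = {ratio:.0f}x")
```

A ratio in the thousands for a 0.2 second run suggests the PAPI_TOT_CYC value on cam01 is not a real count, which fits with the "Did not find id ... in the buffer" errors printed around it.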

I realize this may be very confusing but any help would be most appreciated.

Constantinos
Attachments
P4trouble.tar.gz
tar file with all files mentioned
(28.4 KiB) Downloaded 24 times

Re: Multiple issues with PAPI 4.1.2.1 on Pentium4 Xeon

Postby vweaver1 » Fri Apr 01, 2011 2:32 pm

As you've noticed, PAPI/perf_events support is not well tested on Pentium4.

Part of this is because perf_event support for Pentium 4 is not very good. It was added in 2.6.35 but various bugs are continuing to be fixed in more recent kernels.

As for PAPI working on one system but not the other: does perf still work on the failing system? Check your "dmesg" output for kernel panics; these can sometimes stop the perf counters from working until a reboot.
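One way to go through the dmesg output is to filter it for lines that look like kernel trouble. This is only a rough sketch in Python: the marker strings in SUSPECT are assumptions about what such messages typically contain, not strings taken from the attached logs, and the function names are made up for illustration.

```python
import re
import subprocess

# Assumed markers for kernel trouble or perf-related messages (not from the logs).
SUSPECT = re.compile(r"Oops|BUG:|Kernel panic|[Pp]erf")


def suspicious_lines(dmesg_text: str) -> list:
    """Return the lines of a dmesg dump that match any assumed marker."""
    return [ln for ln in dmesg_text.splitlines() if SUSPECT.search(ln)]


def read_dmesg() -> str:
    """Run dmesg and return its output, or an empty string if it cannot run."""
    try:
        result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    for line in suspicious_lines(read_dmesg()):
        print(line)
```

Running it on both cam00 and cam01 and comparing the results would show whether only the failing box has such messages.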

Getting PAPI_FP_OPS support working might require setting an environment variable to select which type of fp ops you want.

