Manual Reference Pages  - PAPIEX (1)

NAME

papiex - transparently measure hardware performance events of an application with PAPI

CONTENTS

Synopsis
Description
Options
Examples
See Also
Bugs
Authors
Copyright

SYNOPSIS

papiex [-lihVmanukISqsrwxdM] [-p[prefix]] [-f output dir.] [-L event] [-e event] [--] command [args ...]

papiex -- command args ...

papiex command

DESCRIPTION

papiex is a PAPI-based program for measuring hardware performance events of an application. It supports both PAPI preset events and native events. It also supports multiple threads of execution, including pthreads and OpenMP threads.

The default settings are equivalent to typing:

papiex -u -e PAPI_TOT_CYC -e PAPI_FP_INS

If your processor is braindead and doesn’t support counting of floating point instructions (like the UltraSparc II and the original AMD Athlon) then PAPI_TOT_INS is used instead of PAPI_FP_INS.

OPTIONS

-- This is the standard separator telling papiex to terminate its option processing and pass the rest of the command line to the underlying shell. Use this if your application takes command line arguments.

Example: papiex -- ls -a

-l Print a list of the available PAPI presets and native events.

-L event
  Print a full description of event.

-i Print information about the host processor.

-h Print the usage information.

-V Print the version information of papiex, the PAPI library and the PAPI header file papiex was built against.

-m Enable counter multiplexing to measure more than the physical number of counters available. The number of counters can be discovered with the -i flag.

-a Monitor useful events available on the machine automatically. This implicitly enables multiplexing (see the -m flag).

-n Do NOT create ANY thread or process-specific output files. By default, papiex will create these files for multithreaded and multi-process (MPI) programs.

-u Measure hardware counters in user mode. This is the default counting mode.

-k Measure hardware counters in kernel mode.

-I Measure hardware counters in transient mode. This mode may not be supported on your processor. Some CPUs execute interrupt/TLB miss handlers at an entirely different privilege level. If your processor does not support this level, you will get an error when papiex sets up the counters.

-S Measure hardware counters in supervisor mode.

-q Print information in a less verbose format. This is just the counter value followed by the counter name. The only additional information printed is the timing information and any thread identifiers. It is printed right justified with a width of 16 places. This option is currently not compatible with -r.

-s This option simply dumps the environment variable/value pairs to stdout and then exits.

-r Report getrusage() information. Most of the time this doesn’t work on Linux. This option is currently not compatible with -q.

-x Report memory information for the process. Not all statistics will be available on all Linux kernel versions. Currently reported are peak virtual, peak resident, text, library, heap, stack, shared and locked memory. Numbers are in KB. This option is currently not compatible with -q.

-d Enable debugging output. Warning! This could be quite long.

-M Use mpiP for MPI profiling. Please set the MPIP environment variable to pass arguments to mpiP. For help on MPIP options, refer to: http://www.llnl.gov/CASC/mpip/#Runtime_Configuration

--no-mpi-gather
  Do not gather per-process data with MPI and output on the front end. Instead, output data on each of the nodes.

--no-mpi-prof
  Do not profile MPI.

--no-io-prof
  Do not profile IO.

--no-gcc-prof
  The GCC family has the capability to automatically instrument code through the use of -finstrument-functions on the compile line. papiex automatically detects these instrumentation points and uses them as caliper points. This option disables this behavior.

--no-ld-library-path
  Do not modify the LD_LIBRARY_PATH environment variable under any circumstances.

-p[prefix]
  By default papiex dumps its output in a file/directory named <cmd>.papiex.<host>.<pid>.<instance>. The -p flag causes <prefix> to be prepended to the output name. This is useful for MPI and multithreaded runs. For readability, it is a good idea to have a separator, such as . (dot), at the end of your prefix. Note! You cannot have a space between -p and the prefix!

-f <output directory>
  All output of papiex is created under the supplied output directory. If the directory does not exist, it is created. By default, all output is placed in a file/directory under the working directory.

-e <event>
  Monitor the event as named. The event is a symbol as listed in the output from either the -l or -L flag. You may specify more than one event. If you specify more events than the number of physical counters reported by the -i flag, you must enable multiplexing with -m; otherwise an error will be reported.
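
The multiplexing rule for -e can be sketched as a small shell helper. This is an illustration only: papiex_event_args is a hypothetical function name, and the counter count is passed in by hand (read it from the papiex -i output yourself), because the layout of that output is not described in this page.

```shell
# Hypothetical helper: print papiex option words for a list of events.
# $1 = number of physical counters (taken by hand from `papiex -i`)
# remaining arguments = event names as listed by `papiex -l`
papiex_event_args() {
    counters=$1
    shift
    args=""
    # More events than counters: multiplexing (-m) must be enabled.
    if [ "$#" -gt "$counters" ]; then
        args="-m"
    fi
    for ev in "$@"; do
        args="${args:+$args }-e $ev"
    done
    printf '%s\n' "$args"
}

# Two events on a two-counter machine need no multiplexing.
papiex_event_args 2 PAPI_TOT_CYC PAPI_FP_INS
# Three events on the same machine do, so -m is added.
papiex_event_args 2 PAPI_TOT_CYC PAPI_L1_DCM PAPI_L1_TCM
```

The printed words can be spliced into a command line, for example: papiex $(papiex_event_args 2 PAPI_TOT_CYC PAPI_FP_INS) /bin/ls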

EXAMPLES

The simplest use of papiex, on a single-threaded, single-process program, is:

papiex /bin/ls

In the above case, the measurements of the default events, PAPI_TOT_CYC and PAPI_FP_INS, would be written to stderr. To monitor specific events explicitly, one would do:

papiex -e PAPI_L1_DCM -e PAPI_L1_TCM /bin/ls
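
When the measured program takes options of its own, the -- separator keeps papiex from parsing them. A minimal sketch follows, assuming you want a script to keep working on machines without papiex; the wrapper name measure is hypothetical and not part of papiex.

```shell
# Hypothetical wrapper: run a command under papiex when it is installed,
# otherwise run the command directly.
measure() {
    if command -v papiex >/dev/null 2>&1; then
        # "--" ends papiex option processing, so options such as -a
        # below reach the measured command instead of papiex.
        papiex -e PAPI_L1_DCM -e PAPI_L1_TCM -- "$@"
    else
        echo "papiex not found; running without measurement" >&2
        "$@"
    fi
}

measure ls -a
```

The fallback branch is a convenience of this sketch only; drop it if a missing papiex should be treated as an error.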

For multithreaded programs, you would simply invoke papiex as above; the multiple threads are automatically handled. The output is written into a directory named <cmd>.papiex.<host>.<pid>.<instance>. In case you just want a high-level summary with no per-thread output files, you would do:

papiex -n -e PAPI_FP_OPS my-threaded-prog

Note the -n flag, which instructs papiex not to create any thread- or process-specific output files. Multiple task programs using MPI are automatically handled by papiex, as in:

mpirun -np 4 papiex -f /tmp ./pop

In the above example, the output data files are stored in /tmp/pop.papiex.<host>.<pid>.<instance>. If you also want mpiP to profile the MPI calls, do:

mpirun -np 4 papiex -M ./pop
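
Options for mpiP are passed through the MPIP environment variable, as described under -M. The sketch below sets it before the run. The -t (report threshold) and -k (callsite stack depth) options belong to mpiP, not papiex, and the values shown are placeholders; run_pop_with_mpip is a hypothetical function name.

```shell
# mpiP runtime options go into the MPIP environment variable.
# -t and -k are mpiP options; the values here are placeholders.
MPIP="-t 10.0 -k 2"
export MPIP

# Hypothetical wrapper around the invocation shown above; ./pop stands in
# for your own MPI program.
run_pop_with_mpip() {
    mpirun -np 4 papiex -M ./pop
}

# Call run_pop_with_mpip on a machine that has mpirun, papiex and ./pop.
```

Whether mpirun forwards the variable to remote ranks depends on the MPI launcher, so check its documentation if the mpiP settings appear to be ignored.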

You can also give a prefix to the output path. For example:

papiex -pmystats_ my-threaded-prog

The command above will create a directory ./mystats_my-threaded-prog.papiex.<host>.<pid>.<instance>. For multithreaded and/or MPI programs, papiex creates per-thread/per-task and global statistics summaries across threads/tasks, which are stored in this directory.
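
Because the host, pid and instance parts of the output name are not known in advance, a script usually has to find the output by pattern. The following is a sketch with a hypothetical helper name, papiex_outputs. It assumes that <cmd> is the base name of the program (as in the ./pop example above) and that a -p prefix is applied inside the -f directory when both are given.

```shell
# Hypothetical helper: list papiex output entries for one command.
# $1 = directory given to -f (use . for the working directory)
# $2 = command as it was run, e.g. ./pop
# $3 = optional prefix given to -p
papiex_outputs() {
    dir=$1
    cmd=$(basename "$2")
    prefix=${3:-}
    for entry in "$dir/$prefix$cmd".papiex.*; do
        # An unmatched pattern stays literal, so keep only real entries.
        if [ -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}

# Output of: mpirun -np 4 papiex -f /tmp ./pop
papiex_outputs /tmp ./pop
# Output of a run with the prefix mystats_ in the working directory
papiex_outputs . my-threaded-prog mystats_
```

Both calls print nothing when no matching output exists.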

For convenience, the -a flag is provided. It automatically monitors the useful events available on the machine. To allow multiple events to be monitored with a limited number of counters, multiplexing (-m) is implicitly enabled. For example:

papiex -a my-long-program

SEE ALSO

mpiex(1), hpcex(1), PAPI(3), fork(2), getrusage(2), ld.so(8)

BUGS

If you measure an application or process that makes use of the library preloading mechanism AND you disable the following of fork() calls with the -n flag, the child processes will most likely die a horrible death.

Additional bugs should be reported to the OSPAT Mailing List at <ospat-devel@cs.utk.edu>.

AUTHORS

papiex was written by Philip J. Mucci and Tushar Mohan.

COPYRIGHT

This software is COMPLETELY OPEN SOURCE. If you incorporate any portion of this software, I would appreciate an acknowledgement in the appropriate places. Should you find PapiEx useful, please consider making a contribution in the form of hardware, software or plain old cash.


PAPIEX (1) May, 2004