PAPI and PTHREADS

Post by papi-newbie » Mon Apr 26, 2010 6:09 pm

Hello, I have a few questions about using PAPI with Pthreads.

Does anyone know of any good references on how PAPI and Pthreads work together, aside from the example programs in the PAPI source code and the user guide?

I am trying to use PAPI to determine the IPC of parallel programs/applications.

1) When PAPI determines the IPC, does it only track the program itself, or is the kernel included? If the kernel is not included, is it possible to track kernel activity (system calls) as well?

2) What happens if threads cannot be pinned to processors and there is a lot of thread migration? Is this an issue?

3) I am using heavily pipelined parallel programs. Is this an issue when trying to measure something like IPC? Can't I just clear the PAPI counters on all processors before the program starts, then read them once the program stops executing? (That way kernel interactions, other threads, etc. would be included. If this is possible, I assume accuracy is sacrificed.)

I am not that interested in highly accurate results; I just want a ballpark number for my benchmarks (I am only focusing on IPC at the moment).

Thanks!
papi-newbie
 

Re: PAPI and PTHREADS

Post by Dan Terpstra » Tue Apr 27, 2010 11:39 am

1) When PAPI determines the IPC, does it only track the program itself, or is the kernel included? If the kernel is not included, is it possible to track kernel activity (system calls) as well?

By default PAPI only counts activity in user space, but you can use the PAPI_set_domain function to specify counting in the kernel too. In this case, you would specify PAPI_DOM_USER | PAPI_DOM_KERNEL as the argument.

2) What happens if threads cannot be pinned to processors and there is a lot of thread migration? Is this an issue?

If you start and stop PAPI within each thread, thread migration shouldn't be a problem. At a context switch, the counter values for the thread being switched out are saved, and the values for the thread being switched in are restored. Each thread will keep track of its own activity.

3) I am using heavily pipelined parallel programs. Is this an issue when trying to measure something like IPC? Can't I just clear the PAPI counters on all processors before the program starts, then read them once the program stops executing? (That way kernel interactions, other threads, etc. would be included. If this is possible, I assume accuracy is sacrificed.)

As long as each thread starts and stops PAPI independently, each thread will maintain its own counts. It will be up to you to sum up the counts for all these threads at the end. You may not be able to use the high level PAPI_ipc call to do this. Instead, use low level calls to collect instructions retired (PAPI_TOT_INS) and cycles (PAPI_TOT_CYC) and then divide to get IPC.
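
The summing and dividing step might look something like the sketch below. It is plain C with no PAPI calls: the thread_counts struct and the aggregate_ipc function are hypothetical names for this example, and it assumes each thread has already stored the PAPI_TOT_INS and PAPI_TOT_CYC values it got back from its own PAPI_stop call into its own slot of an array.

```c
#include <stddef.h>

/* Hypothetical per-thread record, filled in by each thread from the
 * values returned by its own PAPI_stop call. */
typedef struct {
    long long instructions; /* PAPI_TOT_INS for this thread */
    long long cycles;       /* PAPI_TOT_CYC for this thread */
} thread_counts;

/* Sums the counts over all threads, then divides once.
 * Returns -1.0 when no cycles were recorded, instead of letting a
 * 0/0 floating-point division produce NaN. */
double aggregate_ipc(const thread_counts *per_thread, size_t n)
{
    long long total_ins = 0;
    long long total_cyc = 0;

    for (size_t i = 0; i < n; i++) {
        total_ins += per_thread[i].instructions;
        total_cyc += per_thread[i].cycles;
    }

    if (total_cyc <= 0)
        return -1.0;

    return (double)total_ins / (double)total_cyc;
}
```

Note that this gives total instructions over total thread cycles, which is not the same as averaging the per-thread IPC values; which of the two is the right number depends on what the benchmark is meant to show.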
Dan Terpstra
 

Re: PAPI and PTHREADS

Post by papi-newbie » Tue Apr 27, 2010 9:37 pm

OK, when I set the domain to PAPI_DOM_USER | PAPI_DOM_KERNEL, the values become NaN and it ceases to work. (I'm using the low-level PAPI calls to determine IPC.)

Any ideas?
I'm using PAPI 3.7.0
on: Intel(R) Xeon(R) CPU E5504 @ 2.00GHz
papi-newbie
 

Re: PAPI and PTHREADS

Post by Dan Terpstra » Tue Apr 27, 2010 10:01 pm

Does the ctests/second.c test work on your system? That test illustrates the use of PAPI_set_domain to measure kernel space. If it works you may be able to figure out from it how to count kernel space. Caveat emptor: if it doesn't work, this feature may not be supported on your architecture.
Dan Terpstra
 

Re: PAPI and PTHREADS

Post by papi-newbie » Wed Apr 28, 2010 9:21 pm

That test failed. PAPI 3.7.2 doesn't seem to make a difference. It's weird, because if I change the number of threads, say to 2 or 4 instead of 8 (I have an 8-core machine), I get actual values and not NaNs.

The program runs for 30 seconds, so I don't think it's overflowing. Maybe I should not even trust the results for 2 or 4 threads.

Not sure what's going on.


Thanks for your help!
papi-newbie
 

