1) When PAPI tries to determine the IPC, does it just keep track of the program itself, or is the kernel included? If the kernel is not included, is it possible to keep track of the kernel as well (system calls)?
By default PAPI only counts activity in user space, but you can use the PAPI_set_domain function to specify counting in the kernel too. In this case, you would specify PAPI_DOM_USER | PAPI_DOM_KERNEL as the argument.
2) What happens if threads cannot be pinned to processors and there is a lot of thread migration? Is this an issue?
If you start and stop PAPI within each thread, thread migration shouldn't be a problem. At a context switch, the counter values for the thread being switched out are saved, and the values for the thread being switched in are restored. Each thread will keep track of its own activity.
3) I am using heavily pipelined parallel programs. Is this an issue when trying to determine events such as IPC? Can't I just create a situation where the PAPI counters are cleared on all processors prior to program execution, then read once the program stops executing? (This way I can have kernel interactions included, as well as other threads, etc. If this is possible, I assume accuracy is sacrificed.)
As long as each thread starts and stops PAPI independently, each thread will maintain its own counts. It will be up to you to sum up the counts for all these threads at the end. You may not be able to use the high-level PAPI_ipc call to do this. Instead, use low-level calls to collect instructions retired (PAPI_TOT_INS) and cycles (PAPI_TOT_CYC) in each thread, sum them across threads, and then divide total instructions by total cycles to get IPC.
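The summing and dividing step might look something like the sketch below. It assumes each thread has already stored the two values it collected for PAPI_TOT_INS and PAPI_TOT_CYC; the thread_counts struct and both function names are hypothetical helpers made up for illustration, not part of PAPI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the two counts one thread collected:
   instructions retired (PAPI_TOT_INS) and cycles (PAPI_TOT_CYC). */
typedef struct {
    long long instructions;
    long long cycles;
} thread_counts;

/* Add up the per-thread counts into a single total. */
thread_counts sum_counts(const thread_counts *per_thread, size_t n)
{
    thread_counts total = {0, 0};
    assert(per_thread != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        total.instructions += per_thread[i].instructions;
        total.cycles += per_thread[i].cycles;
    }
    return total;
}

/* IPC = instructions / cycles. Returns 0.0 if no cycles were counted,
   so the caller never divides by zero. */
double ipc_from_counts(thread_counts c)
{
    if (c.cycles <= 0)
        return 0.0;
    return (double)c.instructions / (double)c.cycles;
}
```

Note that summing cycles over all threads gives instructions per thread-cycle. If you want a per-thread figure instead, call ipc_from_counts on each thread's entry before summing.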