I have a single-threaded program that uses the high-level PAPI interface to count the number of assembly instructions executed in certain code sections. I am basically using only two functions: PAPI_start_counters and PAPI_stop_counters. I was wondering whether it is possible to run several instances of the same program at the same time on a multicore (AMD) CPU and still obtain correct measurements, or whether the implicit initialization (or something else) in these two high-level functions would mess everything up. For example, on a modern AMD quad-core, if I start fewer than four instances of the program, would PAPI in each instance use a different hardware counter so that the instances do not interfere with each other?
(I hope my explanation of the problem is not too confusing.)