jagode00 wrote:It sounds like you only initialize PAPI for the main thread before forking a child thread. Did I understand you correctly? If so then it makes sense that you only see misses for the parent thread. When the main thread creates a child thread then the child does not inherit any PAPI information from the calling thread. Have a look at the ctests/fork.c example.
heike
Thanks.

That's almost what I was doing: I had attached PAPI to the forked child. However, the child was calling execl("/bin/bash", "/bin/bash", "-c", argv[cmd_arg_idx], (char *)NULL) so that I could get metrics for a bash command for the benchmarks. For commands that don't use pipes or redirection, this seemed to work okay. But for more complex commands, especially those that use input redirection with '< infile', it turned out I needed to track down bash's own spawned children with ps and attach to them instead. I wound up calling roughly popen("ps --ppid $CHILD_PID -o h") (with some sprintf(...) to write the pid returned by fork into the string) to get the list of grandchildren.
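In case it helps anyone else, here is a minimal sketch of that grandchild lookup. It is a reconstruction, not my exact code: the helper names read_pids and list_children are made up for illustration, and I'm assuming a procps-style ps (Linux) where "-o pid=" prints only the PID column with no header, which is a slightly different spelling from the "-o h" mentioned above.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen/pclose under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: read whitespace-separated PIDs from `in`.
 * Stores at most `max` of them in `out` and returns how many were stored. */
size_t read_pids(FILE *in, pid_t *out, size_t max)
{
    size_t n = 0;
    long pid;

    assert(in != NULL);
    assert(out != NULL || max == 0);
    while (n < max && fscanf(in, "%ld", &pid) == 1)
        out[n++] = (pid_t)pid;
    return n;
}

/* Hypothetical helper: list the direct children of `parent` by running ps.
 * Assumes procps ps is on PATH; returns the number of PIDs found,
 * or -1 if the command could not be built or started. */
int list_children(pid_t parent, pid_t *out, size_t max)
{
    char cmd[64];
    int len = snprintf(cmd, sizeof cmd, "ps --ppid %ld -o pid=", (long)parent);
    if (len < 0 || (size_t)len >= sizeof cmd)
        return -1;

    FILE *ps = popen(cmd, "r");
    if (ps == NULL)
        return -1;

    size_t n = read_pids(ps, out, max);
    pclose(ps);
    return (int)n;
}
```

The idea would be to call list_children(child_pid, ...) in the parent once bash has had time to spawn the real command, and then attach to each PID it returns (e.g. with PAPI_attach) instead of to bash itself. There is an obvious race here: a short-lived grandchild can exit before ps ever sees it, so treat this as a workaround rather than a robust solution.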