The L3 cache on Nehalem is part of what Intel calls the Uncore. The Uncore has its own dedicated counters and its own set of events. For now, these counters can only be accessed in global (system-wide) counting mode, whereas PAPI operates in local (per-thread or per-process) counting mode. This is an area of open research: how can chip-level resources be reserved at the process level to allow per-thread measurement of shared activity?
In the meantime, there is one native event, OFFCORE_RESPONSE_0, that can measure some L3-related activity. It is available in the current CVS version of PAPI and will be included in the next release.