The major contributions of MuMI will include the following:
- Extension of Prophesy’s performance modeling
interface and database component to cover multicore systems and to incorporate
power parameters and metrics into the performance models.
- Extension of PAPI’s widely used hardware
performance monitoring library to include the collection and interpretation of
relevant data for various components of multicore systems.
- Extension of the PowerPack power-performance
measurement, profiling, analysis, and optimization framework to multicore
architectures, enabling measurement of power consumption at component-level
(e.g., processor core) and function-level granularity.
- Development of modeling and analysis techniques
that can be used to explore the performance and power optimization space of
multicore systems, especially targeting resource contention issues.
On multicore systems, shared resources can limit how well applications
scale across all available cores. Types of resource contention that can occur
include shared cache contention, memory bus contention, and network interface
contention. In the first few months of the project, we designed a methodology
that uses hardware counters to detect and diagnose these types of contention.
We applied this methodology in initial experiments to detect cache and memory
bus contention when running the NAS Parallel Benchmarks (version 2.3 with
OpenMP) on a 16-core Intel Tigerton system. Each socket holds a quad-core chip,
and each chip has two L2 caches, each shared by a pair of cores. We detected a
significant amount of L2 cache contention with some of the benchmarks. The
detailed results will be published in a technical report.
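
The methodology itself is not described on this page, so the following is only an illustrative sketch of one common way counter data can point to shared-cache contention. It is not the project's actual method. It assumes the same benchmark is run twice, once with threads placed on cores that do not share an L2 cache and once with threads placed on a core pair that does, and that L2 access and miss totals are collected for each run (for example via the PAPI preset events PAPI_L2_TCA and PAPI_L2_TCM, where the hardware supports them). The record type, function name, and 1.5x threshold are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Counter totals for one benchmark run (hypothetical record layout)."""
    label: str
    l2_accesses: int  # e.g. total from PAPI_L2_TCA
    l2_misses: int    # e.g. total from PAPI_L2_TCM

    @property
    def l2_miss_rate(self) -> float:
        # Guard against a run that recorded no L2 accesses.
        if self.l2_accesses == 0:
            return 0.0
        return self.l2_misses / self.l2_accesses


def l2_contention_suspected(separate: CounterSample,
                            shared: CounterSample,
                            threshold: float = 1.5) -> bool:
    """Return True when the miss rate with a shared L2 is at least
    `threshold` times the miss rate with separate L2 caches.

    The threshold is an assumed tuning knob, not a published value.
    """
    base = separate.l2_miss_rate
    if base == 0.0:
        # Any misses at all in the shared run count as an increase.
        return shared.l2_miss_rate > 0.0
    return shared.l2_miss_rate / base >= threshold
```

With made-up numbers, a run recording 50,000 misses in 1,000,000 accesses on separate caches and 120,000 misses in 1,000,000 accesses on a shared cache has a miss-rate ratio of about 2.4 and would be flagged. A ratio near 1.0 would not be flagged. An analogous comparison using bus transaction counters could be used for memory bus contention, but those event names are platform-specific and are not shown here.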
For online discussion about the MuMI project, see the MuMI wiki at http://wiki.mumi-tool.org/