Malony, A., Biersdorff, S., Shende, S., Jagode, H., Tomov, S., Juckeland, G., Dietrich, R., Poole, D., Lamb, C. "Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs," International Conference on Parallel Processing (ICPP'11),
Taipei, Taiwan, September 13-16, 2011.
Moore, S., Ralph, J. "User-defined Events for Hardware Performance Monitoring," ICCS 2011 Workshop: Tools for Program Development and Analysis in Computational Science,
www.sciencedirect.com,
Singapore, June 1, 2011.
Malony, A., Biersdorff, S., Shende, S., Jagode, H., Tomov, S., Juckeland, G., Dietrich, R., Poole, D., Lamb, C. "Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs," ICPP 2011 (submitted),
Taipei, Taiwan, 2011.
Weaver, V., Dongarra, J. "Can Hardware Performance Counters Produce Expected, Deterministic Results?," 3rd Workshop on Functionality of Hardware Performance Monitoring,
Atlanta, GA, December 4, 2010.
Terpstra, D., Jagode, H., You, H., Dongarra, J. "Collecting Performance Data with PAPI-C," Tools for High Performance Computing 2009,
Springer Berlin / Heidelberg,
3rd Parallel Tools Workshop, Dresden, Germany, pp. 157-173,
2009.
Moore, S., Cronk, D., Wolf, F., Purkayastha, A., Teller, P., Araiza, R., Aguilera, M., Nava, J. "Performance Profiling and Analysis of DoD Applications using PAPI and TAU," Proceedings of DoD HPCMP UGC 2005,
IEEE,
Nashville, TN, June, 2005.
Andersson, U., Mucci, P. "Analysis and Optimization of Yee_Bench using Hardware Performance Counters," Proceedings of Parallel Computing 2005 (ParCo) (to appear),
Malaga, Spain, September, 2005.
Mucci, P., Ahlin, D., Danielsson, J., Ekman, P., Malinowski, L. "PerfMiner: Cluster-Wide Collection, Storage and Presentation of Application Level Hardware Performance Data," Proceedings of 2005 European Conference on Parallel Computers (Euro-Par) (to appear),
Monte de Caparica, Portugal, August/September, 2005.
Yi, Q., Kennedy, K., You, H., Seymour, K., Dongarra, J. "Automatic Blocking of QR and LU Factorizations for Locality," 2nd ACM SIGPLAN Workshop on Memory System Performance (MSP 2004),
Washington, DC, June 8, 2004.
Mucci, P., Dongarra, J., Kufrin, R., Moore, S., Song, F., Wolf, F. "Automating the Large-Scale Collection and Analysis of Performance," In Proceedings of the 5th LCI International Conference on Linux Clusters: The HPC Revolution,
Austin, Texas, May 18-20, 2004.