Publications



2013

Yamazaki, I., Dong, T., Tomov, S., Dongarra, J. "Tridiagonalization of a symmetric dense matrix on a GPU cluster," Proceedings of the Third International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), May 20, 2013.
Weaver, V., Terpstra, D., Moore, S. "Non-Determinism and Overcount on Modern Hardware Performance Counter Implementations," 2013 IEEE International Symposium on Performance Analysis of Systems and Software, Austin, TX, April 21-23, 2013.
Weaver, V., Terpstra, D., McCraw, H., Johnson, M., Kasichayanula, K., Ralph, J., Nelson, J., Mucci, P., Mohan, T., Moore, S. "PAPI 5: Measuring Power, Energy, and the Cloud," Poster Abstract, 2013 IEEE International Symposium on Performance Analysis of Systems and Software, Austin, TX, April 21-23, 2013.
Cao, C., Dongarra, J., Du, P., Gates, M., Luszczek, P., Tomov, S. "clMAGMA: High Performance Dense Linear Algebra with OpenCL," University of Tennessee Computer Science Technical Report (LAWN 275), UT-CS-13-706, March, 2013.
Bouteiller, A., Cappello, F., Dongarra, J., Guermouche, A., Herault, T., Robert, Y. "Multi-criteria checkpointing strategies: optimizing response-time versus resource utilization," University of Tennessee Computer Science Technical Report, ICL-UT-13-01, February 15, 2013.
Baboulin, M., Dongarra, J., Herrmann, J., Tomov, S. "Accelerating linear system solutions using randomization techniques," ACM Transactions on Mathematical Software (TOMS), Vol. 39, No. 2, February, 2013.
Dongarra, J., Herault, T., Robert, Y. "Revisiting the Double Checkpointing Algorithm," University of Tennessee Computer Science Technical Report (LAWN 274), UT-CS-13-705, January 3, 2013.
Ma, T., Bosilca, G., Bouteiller, A., Dongarra, J. "Kernel-assisted and topology-aware MPI collective communications on multi-core/many-core platforms," Journal of Parallel and Distributed Computing, accepted, January, 2013.
Yamazaki, I., Becker, D., Dongarra, J., Druinsky, A., Peled, I., Toledo, S., Ballard, G., Demmel, J., Schwartz, O. "Implementing a Blocked Aasen’s Algorithm with a Dynamic Scheduler on Multicore Architectures," IPDPS 2013 (submitted), Boston, MA, October, 2012.
Kurzak, J., Luszczek, P., YarKhan, A., Faverge, M., Langou, J., Bouwmeester, H., Dongarra, J. "Multithreading in the PLASMA Library," Multi and Many-Core Processing: Architecture, Programming, Algorithms, & Applications, Ahmed, M., Ammar, R., Rajasekaran, S. eds. Taylor & Francis, 2013.