Newburn, C. J., Bansal, G., Wood, M., Crivelli, L., Planas, J., Duran, A., Souza, P., Borges, L., Luszczek, P., Tomov, S., Dongarra, J., Anzt, H., Gates, M., Haidar, A., Jia, Y., Kabir, K., Yamazaki, I., Labarta, J. "Heterogeneous Streaming," The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2016, IEEE,
Chicago, IL, USA, May 23, 2016.
Lacoste, X., Faverge, M., Ramet, P., Thibault, S., Bosilca, G. "Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes," 23rd International Heterogeneity in Computing Workshop, IPDPS 2014,
IEEE,
Phoenix, AZ, May, 2014.
Mathieu Faverge, Julien Herrmann, Julien Langou, Bradley R. Lowery, Yves Robert, and Jack Dongarra "Designing LU-QR hybrid solvers for performance and stability," IPDPS'2014 (extended version available as arXiv:1401.5522),
IEEE Computer Society Press,
Phoenix, AZ, 2014.
Dongarra, J., Faverge, M., Ltaief, H., Luszczek, P. "Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization," Concurrency and Computation: Practice and Experience,
Wiley,
Vol. 26, No. 7,
pp. 1408-1431,
May, 2014.
Simplice Donfack, Jack Dongarra, Mathieu Faverge, Mark Gates, Jakub Kurzak, Piotr Luszczek, and Ichitaro Yamazaki "On Algorithmic Variants of Parallel Gaussian Elimination: Comparison of Implementations in Terms of Performance and Numerical Properties," University of Tennessee Computer Science Technical Report (also LAWN 280),
UT-CS-13-715,
July, 2013.
Donfack, S., Dongarra, J., Faverge, M., Gates, M., Kurzak, J., Luszczek, P., Yamazaki, I. "A Survey of Recent Developments in Parallel Implementations of Gaussian Elimination," Concurrency and Computation: Practice and Experience,
Wiley,
June 2, 2014.
Aupy, G., Faverge, M., Robert, Y., Kurzak, J., Luszczek, P., Dongarra, J. "Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC," University of Tennessee Computer Science Technical Report, UT-CS-13-709 (LAWN 277),
May, 2013.
Kurzak, J., Luszczek, P., Faverge, M., Dongarra, J. "Programming the LU Factorization for a Multicore System with Accelerators," Proceedings of VECPAR'12,
Kobe, Japan, April, 2012.
Dongarra, J., Faverge, M., Ltaief, H., Luszczek, P. "Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization," University of Tennessee Computer Science Technical Report (also as a LAWN),
ICL-UT-11-08,
September, 2011.
Agullo, E., Augonnet, C., Dongarra, J., Ltaief, H., Namyst, R., Thibault, S., Tomov, S. "A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs," in GPU Computing Gems, Jade Edition,
Hwu, W., ed.,
Elsevier,
Vol. 2,
pp. 473-484,
2011.
Agullo, E., Augonnet, C., Dongarra, J., Faverge, M., Ltaief, H., Thibault, S., Tomov, S. "QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators," Proceedings of IPDPS 2011,
Anchorage, AK, ICL-UT-10-04,
October 1, 2010.
Agullo, E., Augonnet, C., Dongarra, J., Ltaief, H., Namyst, R., Thibault, S., Tomov, S. "Faster, Cheaper, Better - a Hybridization Methodology to Develop Linear Algebra Software for GPUs," LAPACK Working Note 230,
2010.