Publications

Showing records 1 - 10 of 95

Masliah, I., Abdelfattah, A., Haidar, A., Tomov, S., Baboulin, M., Falcou, J., Dongarra, J. "High-performance matrix-matrix multiplications of very small matrices," 22nd International European Conference on Parallel and Distributed Computing (Euro-Par'16), Grenoble, France, August 22-26, 2016.

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "Performance, Design, and Autotuning of Batched GEMM for GPUs," The International Supercomputing Conference (ISC High Performance 2016), Frankfurt, Germany, June 19-23, 2016.

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "Performance Tuning and Optimization Techniques of Fixed and Variable Size Batched Cholesky Factorization on GPUs," International Conference on Computational Science (ICCS'16), San Diego, California, USA, June 6-8, 2016.

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "On the Development of Variable Size Batched Computation for Heterogeneous Parallel Architectures," The 17th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2016), IPDPS 2016, IEEE, Chicago, IL, USA, May 27, 2016.

Newburn, CJ., Bansal, G., Wood, M., Crivelli, L., Planas, J., Duran, A., Souza, P., Borges, L., Luszczek, P., Tomov, S., Dongarra, J., Anzt, H., Gates, M., Haidar, A., Jia, Y., Kabir, K., Yamazaki, I., Labarta, J. "Heterogeneous Streaming," The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2016, IEEE, Chicago, IL, USA, May 23, 2016.

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "Performance, Design, and Autotuning of Batched GEMM for GPUs," University of Tennessee Computer Science Technical Report, UT-EECS-16-739, February 1, 2016.

Abdelfattah, A., Baboulin, M., Dobrev, V., Dongarra, J., Earl, C., Falcou, J., Haidar, A., Karlin, I., Kolev, Tz., Masliah, I., Tomov, S. "High-Performance Tensor Contractions for GPUs," University of Tennessee Computer Science Technical Report, UT-EECS-16-738, January 21, 2016.

Anzt, H., Dongarra, J., Kreutzer, M., Wellein, G., Koehler, M. "Efficiency of general Krylov methods on GPUs – An experimental study," The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), Chicago, 2016.

Haidar, A., Jia, Y., Luszczek, P., Tomov, S., YarKhan, A., Dongarra, J. "Weighted Dynamic Scheduling with Many Parallelism Grains for Offloading of Numerical Workloads to Multiple Varied Accelerators," Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA'15), ACM, New York, NY, USA, No. 5, November 16, 2015.

Mary, T., Yamazaki, I., Kurzak, J., Luszczek, P., Tomov, S., Dongarra, J. "Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs," The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 15), Austin, TX, November 15, 2015.


