Hatem Ltaief

 
Alumni
I moved to the KAUST Supercomputing Laboratory in Saudi Arabia, where I currently hold a Computational Scientist position.

PLASMA, FT-LA, MAGMA

Dongarra, J., Faverge, M., Ltaief, H., Luszczek, P. "Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization," Concurrency and Computation: Practice and Experience, Wiley, Vol. 26, No. 7, pp. 1408-1431, May, 2014 [pdf] [bibtex]

Ltaief, H., Luszczek, P., Dongarra, J. "High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures," ACM Transactions on Mathematical Software (TOMS), Vol. 39, No. 3, April, 2013 [pdf] [bibtex]

Dongarra, J., Ltaief, H., Luszczek, P., Weaver, V. "Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture," The 2nd International Conference on Cloud and Green Computing (submitted), Xiangtan, Hunan, China, November, 2012 [pdf] [bibtex]

Agullo, E., Bosilca, G., Castagnède, C., Dongarra, J., Ltaief, H., Tomov, S. "Matrices Over Runtime Systems at Exascale," Supercomputing '12 (poster), Salt Lake City, Utah, November, 2012 [bibtex]

Ltaief, H., Luszczek, P., Dongarra, J. "Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures using Tree Reduction," Lecture Notes in Computer Science, Vol. 7203, pp. 661-670, September, 2012 [pdf] [bibtex]

Bosilca, G., Dongarra, J., Ltaief, H. "Power Profiling of Cholesky and QR Factorizations on Distributed Memory Systems," Third International Conference on Energy-Aware High Performance Computing, Hamburg, Germany, September, 2012 [pdf] [bibtex]

Abdelfattah, A., Dongarra, J., Keyes, D., Ltaief, H. "Optimizing Memory-Bound Numerical Kernels on GPU Hardware Accelerators," VECPAR 2012, Kobe, Japan, July, 2012 [pdf] [bibtex]

Haidar, A., Ltaief, H., Dongarra, J. "Toward High Performance Divide and Conquer Eigensolver for Dense Symmetric Matrices," SIAM Journal on Scientific Computing (Accepted), July, 2012 [bibtex]

Haidar, A., Ltaief, H., Luszczek, P., Dongarra, J. "A Comprehensive Study of Task Coalescing for Selecting Parallelism Granularity in a Two-Stage Bidiagonal Reduction," IPDPS 2012, Shanghai, China, May, 2012 [pdf] [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Faverge, M., Langou, J., Ltaief, H., Tomov, S. "LU Factorization for Accelerator-based Systems," IEEE/ACS AICCSA 2011, Sharm-El-Sheikh, Egypt, December, 2011 [pdf] [bibtex]

Haidar, A., Ltaief, H., Dongarra, J. "Parallel Reduction to Condensed Forms for Symmetric Eigenvalue Problems using Aggregated Fine-Grained and Memory-Aware Kernels," Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC11), Seattle, WA, November 14, 2011 [pdf] [bibtex]

Dongarra, J., Faverge, M., Ltaief, H., Luszczek, P. "High Performance Matrix Inversion Based on LU Factorization for Multicore Architectures," Proceedings of MTAGS11, Seattle, WA, November, 2011 [pdf] [bibtex]

Ltaief, H., Luszczek, P., Dongarra, J. "Profiling High Performance Dense Linear Algebra Algorithms on Multicore Architectures for Power and Energy Efficiency," International Conference on Energy-Aware High Performance Computing (EnA-HPC 2011), Hamburg, Germany, September 7-9, 2011 [pdf] [bibtex]

Dongarra, J., Faverge, M., Ltaief, H., Luszczek, P. "Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization," University of Tennessee Computer Science Technical Report (also as a LAWN), ICL-UT-11-08, September, 2011 [pdf] [bibtex]

Haidar, A., Ltaief, H., Dongarra, J. "Parallel Reduction to Condensed Forms for Symmetric Eigenvalue Problems using Aggregated Fine-Grained and Memory-Aware Kernels," University of Tennessee Computer Science Technical Report, UT-CS-11-677, (also LAWN 254), August 5, 2011 [pdf] [bibtex]

Ltaief, H., Luszczek, P., Dongarra, J. "High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures," University of Tennessee Computer Science Technical Report, UT-CS-11-673, (also LAWN 247), May 18, 2011 [pdf] [bibtex]

Luszczek, P., Ltaief, H., Dongarra, J. "Two-stage Tridiagonal Reduction for Dense Symmetric Matrices using Tile Algorithms on Multicore Architectures," IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 16-20, 2011 [bibtex]

Dongarra, J., Faverge, M., Ltaief, H., Luszczek, P. "Exploiting Fine-Grain Parallelism in Recursive LU Factorization," Proceedings of PARCO'11, Gent, Belgium, ICL-UT-11-04, April, 2011 [bibtex]

Haidar, A., Ltaief, H., YarKhan, A., Dongarra, J. "Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures," University of Tennessee Computer Science Technical Report, UT-CS-11-666, (also LAWN 243), March 10, 2011 [bibtex]

Haidar, A., Ltaief, H., Dongarra, J. "Toward High Performance Divide and Conquer Eigensolver for Dense Symmetric Matrices," Submitted to SIAM Journal on Scientific Computing (SISC), 2011 [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Ltaief, H., Namyst, R., Thibault, S., Tomov, S. "A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs," in GPU Computing Gems, Jade Edition, Hwu, W. eds. Elsevier, 2, 473-484, 2011 [bibtex]

Song, F., Ltaief, H., Hadri, B., Dongarra, J. "Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems," SC'10, ACM SIGARCH/IEEE Computer Society, New Orleans, LA, November 13-19, 2010 [pdf] [bibtex]

Haidar, A., Ltaief, H., YarKhan, A., Dongarra, J. "Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures," Submitted to Concurrency and Computation: Practice and Experience, November 3, 2010 [pdf] [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Faverge, M., Ltaief, H., Thibault, S., Tomov, S. "QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators," Proceedings of IPDPS 2011, Anchorage, AK, ICL-UT-10-04, October 1, 2010 [pdf] [bibtex]

Bosilca, G., Bouteiller, A., Danalis, A., Faverge, M., Haidar, A., Herault, T., Kurzak, J., Langou, J., Lemarinier, P., Ltaief, H., Luszczek, P., YarKhan, A., Dongarra, J. "Distributed Dense Numerical Linear Algebra Algorithms on Massively Parallel Architectures: DPLASMA," University of Tennessee Computer Science Technical Report, UT-CS-10-660, Sept. 15, 2010 [pdf] [bibtex]

Ltaief, H., Tomov, S., Nath, R., Du, P., Dongarra, J. "A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators," Proc. of VECPAR'10 (to appear), Berkeley, CA, June 22-25, 2010 [pdf] [bibtex]

Song, F., Ltaief, H., Hadri, B., Dongarra, J. "Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems," University of Tennessee Computer Science Technical Report, UT-CS-10-653, April, 2010 [pdf] [bibtex]

Ltaief, H., Kurzak, J., Dongarra, J. "Parallel Band Two-Sided Matrix Bidiagonalization for Multicore Architectures," IEEE Transactions on Parallel and Distributed Systems, pp. 417-423, April, 2010 [pdf] [bibtex]

Ltaief, H., Tomov, S., Nath, R., Dongarra, J. "Hybrid Multicore Cholesky Factorization with Multiple GPU Accelerators," IEEE Transactions on Parallel and Distributed Systems (submitted), March 26, 2010 [pdf] [bibtex]

Tomov, S., Nath, R., Ltaief, H., Dongarra, J. "Dense Linear Algebra Solvers for Multicore with GPU Accelerators," Proc. of IPDPS'10, Atlanta, GA, January 15, 2010 [pdf] [bibtex]

Kurzak, J., Ltaief, H., Dongarra, J., Badia, R. "Scheduling Dense Linear Algebra Operations on Multicore Processors," Concurrency and Computation: Practice and Experience, Vol. 22, no. 1, pp. 15-44, January, 2010 [pdf] [bibtex]

Ltaief, H., Kurzak, J., Dongarra, J., Badia, R. M. "Scheduling Two-sided Transformations using Tile Algorithms on Multicore Architectures," Journal of Scientific Computing, Vol. 18, No. 1, pp. 33-50, 2010 [pdf] [bibtex]

Bosilca, G., Bouteiller, A., Danalis, A., Faverge, M., Haidar, A., Herault, T., Kurzak, J., Langou, J., Lemarinier, P., Ltaief, H., Luszczek, P., YarKhan, A., Dongarra, J. "Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project," Innovative Computing Laboratory Technical Report, ICL-UT-10-02, 2010 [pdf] [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Ltaief, H., Namyst, R., Thibault, S., Tomov, S. "Faster, Cheaper, Better - a Hybridization Methodology to Develop Linear Algebra Software for GPUs," LAPACK Working Note 230, 2010 [pdf] [bibtex]

Hadri, B., Ltaief, H., Agullo, E., Dongarra, J. "Enhancing Parallelism of Tile QR Factorization for Multicore Architectures," Submitted to IEEE Transactions on Parallel and Distributed Systems, December, 2009 [pdf] [bibtex]

Hadri, B., Ltaief, H., Agullo, E., Dongarra, J. "Tile QR Factorization with Parallel Panel Processing for Multicore Architectures," accepted in 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2010), Atlanta, GA, December, 2009 [pdf] [bibtex]

Hadri, B., Ltaief, H., Agullo, E., Dongarra, J. "Tall and Skinny QR Matrix Factorization Using Tile Algorithms on Multicore Architectures," Innovative Computing Laboratory Technical Report (also LAPACK Working Note 222 and CS Tech Report UT-CS-09-645), ICL-UT-09-03, September 4, 2009 [pdf] [bibtex]

Kurzak, J., Ltaief, H., Dongarra, J., Badia, R. "Dependency-Driven Scheduling of Dense Matrix Factorizations on Shared-Memory Systems," PPAM 2009, Poland, September, 2009 [bibtex]

Ltaief, H., Kurzak, J., Dongarra, J. "Parallel Band Two-Sided Matrix Bidiagonalization for Multicore Architectures," IEEE Transactions on Parallel and Distributed Systems (to appear), May, 2009 [pdf] [bibtex]

Agullo, E., Demmel, J., Dongarra, J., Hadri, B., Kurzak, J., Langou, J., Ltaief, H., Luszczek, P., Tomov, S. "Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects," Journal of Physics: Conference Series, Vol. 180, 2009 [pdf] [bibtex]

Agullo, E., Hadri, B., Ltaief, H., Dongarra, J. "Comparative Study of One-Sided Factorizations with Multiple Software Packages on Multi-Core Hardware," 2009 International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '09) (to appear), 2009 [pdf] [bibtex]

Kurzak, J., Ltaief, H., Dongarra, J., Badia, R. "Scheduling Linear Algebra Operations on Multicore Processors," University of Tennessee Computer Science Department Technical Report, UT-CS-09-636 (Also LAPACK Working Note 213), 2009 [pdf] [bibtex]

Kurzak, J., Ltaief, H., Dongarra, J., Badia, R. "Scheduling Linear Algebra Operations on Multicore Processors," Concurrency and Computation: Practice and Experience (to appear), 2009 [bibtex]

Ltaief, H., Kurzak, J., Dongarra, J. "Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited," University of Tennessee Computer Science Technical Report, UT-CS-08-624 (also LAPACK Working Note 208), August 7, 2008 [pdf] [bibtex]

GPU Technology Conference (GTC 2010)
2010-09-20 San Jose, CA


VECPAR
2010-06-22 Berkeley, CA


CUDA Center of Excellence 2010
2010-06-12 Beijing, China


Hybrid Multicore Consortium, First Annual Workshop
2010-01-20 San Francisco, CA


Supercomputing 2009
2009-11-14 Portland, OR


Workshop on Resiliency for Petascale HPC
2009-10-13 Santa Fe, NM


Scientific Discovery through Advanced Computing Meeting
2009-06-14 San Diego, CA


Parallel and Computational Fluid Dynamics
2008-05-19 Lyon, France


Email
Phone 865-974-9985
Office Claxton

University of Tennessee
Computer Science Department
Innovative Computing Laboratory
1122 Volunteer Blvd, Claxton Building
Knoxville, Tennessee 37996-3450
Fax 865-974-8296