Hatem Ltaief

 
Alumni
I moved to the KAUST Supercomputing Laboratory, Saudi Arabia, where I currently hold a Computational Scientist position.

PLASMA, FT-LA, MAGMA

Dongarra, J., Ltaief, H., Luszczek, P., Weaver, V. "Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture," The 2nd International Conference on Cloud and Green Computing (submitted), Xiangtan, Hunan, China, November, 2012 [pdf] [bibtex]

Agullo, E., Bosilca, G., Castagnède, C., Dongarra, J., Ltaief, H., Tomov, S. "Matrices Over Runtime Systems at Exascale," Supercomputing '12 (poster), Salt Lake City, Utah, November, 2012 [bibtex]

Ltaief, H., Luszczek, P., Dongarra, J. "Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures using Tree Reduction," Lecture Notes in Computer Science, Vol. 7203, pp. 661-670, September, 2012 [pdf] [bibtex]

Bosilca, G., Dongarra, J., Ltaief, H. "Power Profiling of Cholesky and QR Factorizations on Distributed Memory Systems," Third International Conference on Energy-Aware High Performance Computing, Hamburg, Germany, September, 2012 [pdf] [bibtex]

Abdelfattah, A., Dongarra, J., Keyes, D., Ltaief, H. "Optimizing Memory-Bound Numerical Kernels on GPU Hardware Accelerators," VECPAR 2012, Kobe, Japan, July, 2012 [pdf] [bibtex]

Haidar, A., Ltaief, H., Dongarra, J. "Toward High Performance Divide and Conquer Eigensolver for Dense Symmetric Matrices," SIAM Journal on Scientific Computing (Accepted), July, 2012 [bibtex]

Haidar, A., Ltaief, H., Luszczek, P., Dongarra, J. "A Comprehensive Study of Task Coalescing for Selecting Parallelism Granularity in a Two-Stage Bidiagonal Reduction," IPDPS 2012, Shanghai, China, May, 2012 [pdf] [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Faverge, M., Langou, J., Ltaief, H., Tomov, S. "LU Factorization for Accelerator-based Systems," IEEE/ACS AICCSA 2011, Sharm-El-Sheikh, Egypt, December, 2011 [pdf] [bibtex]

Haidar, A., Ltaief, H., Dongarra, J. "Parallel Reduction to Condensed Forms for Symmetric Eigenvalue Problems using Aggregated Fine-Grained and Memory-Aware Kernels," Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC11), Seattle, WA, November 14, 2011 [pdf] [bibtex]

Dongarra, J., Faverge, M., Ltaief, H., Luszczek, P. "High Performance Matrix Inversion Based on LU Factorization for Multicore Architectures," Proceedings of MTAGS11, Seattle, WA, November, 2011 [pdf] [bibtex]

Ltaief, H., Luszczek, P., Dongarra, J. "Profiling High Performance Dense Linear Algebra Algorithms on Multicore Architectures for Power and Energy Efficiency," International Conference on Energy-Aware High Performance Computing (EnA-HPC 2011), Hamburg, Germany, September 7-9, 2011 [pdf] [bibtex]

Dongarra, J., Faverge, M., Ltaief, H., Luszczek, P. "Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization," University of Tennessee Computer Science Technical Report (also as a LAWN), ICL-UT-11-08, September, 2011 [pdf] [bibtex]

Haidar, A., Ltaief, H., Dongarra, J. "Parallel Reduction to Condensed Forms for Symmetric Eigenvalue Problems using Aggregated Fine-Grained and Memory-Aware Kernels," University of Tennessee Computer Science Technical Report, UT-CS-11-677, (also Lawn 254), August 5, 2011 [pdf] [bibtex]

Ltaief, H., Luszczek, P., Dongarra, J. "High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures," University of Tennessee Computer Science Technical Report, UT-CS-11-673, (also Lawn 247), May 18, 2011 [pdf] [bibtex]

Luszczek, P., Ltaief, H., Dongarra, J. "Two-stage Tridiagonal Reduction for Dense Symmetric Matrices using Tile Algorithms on Multicore Architectures," IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 16-20, 2011 [bibtex]

Dongarra, J., Faverge, M., Ltaief, H., Luszczek, P. "Exploiting Fine-Grain Parallelism in Recursive LU Factorization," Proceedings of PARCO'11, Gent, Belgium, ICL-UT-11-04, April, 2011 [bibtex]

Haidar, A., Ltaief, H., YarKhan, A., Dongarra, J. "Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures," University of Tennessee Computer Science Technical Report, UT-CS-11-666, (also Lawn 243), March 10, 2011 [bibtex]

Haidar, A., Ltaief, H., Dongarra, J. "Toward High Performance Divide and Conquer Eigensolver for Dense Symmetric Matrices," Submitted to SIAM Journal on Scientific Computing (SISC), 2011 [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Ltaief, H., Namyst, R., Thibault, S., Tomov, S. "A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs," in GPU Computing Gems, Jade Edition, Hwu, W., ed., Elsevier, 2, 473-484, 2011 [bibtex]

Song, F., Ltaief, H., Hadri, B., Dongarra, J. "Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems," SC'10, ACM SIGARCH/IEEE Computer Society, New Orleans, LA, November 13-19, 2010 [pdf] [bibtex]

Haidar, A., Ltaief, H., YarKhan, A., Dongarra, J. "Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures," Submitted to Concurrency and Computation: Practice and Experience, November 3, 2010 [pdf] [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Faverge, M., Ltaief, H., Thibault, S., Tomov, S. "QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators," Proceedings of IPDPS 2011, Anchorage, AK, ICL-UT-10-04, October 1, 2010 [pdf] [bibtex]

Bosilca, G., Bouteiller, A., Danalis, A., Faverge, M., Haidar, A., Herault, T., Kurzak, J., Langou, J., Lemarinier, P., Ltaief, H., Luszczek, P., YarKhan, A., Dongarra, J. "Distributed Dense Numerical Linear Algebra Algorithms on Massively Parallel Architectures: DPLASMA," University of Tennessee Computer Science Technical Report, UT-CS-10-660, September 15, 2010 [pdf] [bibtex]

Ltaief, H., Tomov, S., Nath, R., Du, P., Dongarra, J. "A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators," Proc. of VECPAR'10 (to appear), Berkeley, CA, June 22-25, 2010 [pdf] [bibtex]

Song, F., Ltaief, H., Hadri, B., Dongarra, J. "Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems," University of Tennessee Computer Science Technical Report, UT-CS-10-653, April, 2010 [pdf] [bibtex]

Ltaief, H., Kurzak, J., Dongarra, J. "Parallel Band Two-Sided Matrix Bidiagonalization for Multicore Architectures," IEEE Transactions on Parallel and Distributed Systems, pp. 417-423, April, 2010 [pdf] [bibtex]

Ltaief, H., Tomov, S., Nath, R., Dongarra, J. "Hybrid Multicore Cholesky Factorization with Multiple GPU Accelerators," IEEE Transactions on Parallel and Distributed Systems (submitted), March 26, 2010 [pdf] [bibtex]

Tomov, S., Nath, R., Ltaief, H., Dongarra, J. "Dense Linear Algebra Solvers for Multicore with GPU Accelerators," Proc. of IPDPS'10, Atlanta, GA, January 15, 2010 [pdf] [bibtex]

Kurzak, J., Ltaief, H., Dongarra, J., Badia, R. "Scheduling Dense Linear Algebra Operations on Multicore Processors," Concurrency and Computation: Practice and Experience, Vol. 22, no. 1, pp. 15-44, January, 2010 [pdf] [bibtex]

Ltaief, H., Kurzak, J., Dongarra, J., Badia, R. "Scheduling Two-sided Transformations using Tile Algorithms on Multicore Architectures," Journal of Scientific Computing, Vol. 18, No. 1, pp. 33-50, 2010 [pdf] [bibtex]

Bosilca, G., Bouteiller, A., Danalis, A., Faverge, M., Haidar, A., Herault, T., Kurzak, J., Langou, J., Lemarinier, P., Ltaief, H., Luszczek, P., YarKhan, A., Dongarra, J. "Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project," Innovative Computing Laboratory Technical Report, ICL-UT-10-02, 2010 [pdf] [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Ltaief, H., Namyst, R., Thibault, S., Tomov, S. "Faster, Cheaper, Better - a Hybridization Methodology to Develop Linear Algebra Software for GPUs," LAPACK Working Note 230, 2010 [pdf] [bibtex]

Hadri, B., Ltaief, H., Agullo, E., Dongarra, J. "Enhancing Parallelism of Tile QR Factorization for Multicore Architectures," Submitted to Transactions on Parallel and Distributed Systems, December, 2009 [pdf] [bibtex]

Hadri, B., Ltaief, H., Agullo, E., Dongarra, J. "Tile QR Factorization with Parallel Panel Processing for Multicore Architectures," accepted in 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2010), Atlanta, GA, December, 2009 [pdf] [bibtex]

Hadri, B., Ltaief, H., Agullo, E., Dongarra, J. "Tall and Skinny QR Matrix Factorization Using Tile Algorithms on Multicore Architectures," Innovative Computing Laboratory Technical Report (also LAPACK Working Note 222 and CS Tech Report UT-CS-09-645), ICL-UT-09-03, September 4, 2009 [pdf] [bibtex]

Kurzak, J., Ltaief, H., Dongarra, J., Badia, R. "Dependency-Driven Scheduling of Dense Matrix Factorizations on Shared-Memory Systems," PPAM 2009, Poland, September, 2009 [bibtex]

Ltaief, H., Kurzak, J., Dongarra, J. "Parallel Band Two-Sided Matrix Bidiagonalization for Multicore Architectures," IEEE Transactions on Parallel and Distributed Systems (to appear), May, 2009 [pdf] [bibtex]

Agullo, E., Demmel, J., Dongarra, J., Hadri, B., Kurzak, J., Langou, J., Ltaief, H., Luszczek, P., Tomov, S. "Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects," Journal of Physics: Conference Series, Vol. 180, 2009 [pdf] [bibtex]

Agullo, E., Hadri, B., Ltaief, H., Dongarra, J. "Comparative Study of One-Sided Factorizations with Multiple Software Packages on Multi-Core Hardware," 2009 International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '09) (to appear), 2009 [pdf] [bibtex]

Kurzak, J., Ltaief, H., Dongarra, J., Badia, R. "Scheduling Linear Algebra Operations on Multicore Processors," University of Tennessee Computer Science Department Technical Report, UT-CS-09-636 (Also LAPACK Working Note 213), 2009 [pdf] [bibtex]

Kurzak, J., Ltaief, H., Dongarra, J., Badia, R. "Scheduling Linear Algebra Operations on Multicore Processors," Concurrency and Computation: Practice and Experience (to appear), 2009 [bibtex]

Ltaief, H., Kurzak, J., Dongarra, J. "Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited," University of Tennessee Computer Science Technical Report, UT-CS-08-624 (also LAPACK Working Note 208), August 7, 2008 [pdf] [bibtex]

GPU Technology Conference (GTC 2010)
2010-09-20 San Jose, CA


VECPAR
2010-06-22 Berkeley, CA


CUDA Center of Excellence 2010
2010-06-12 Beijing, China


Hybrid Multicore Consortium, First Annual Workshop
2010-01-20 San Francisco, CA


Supercomputing 2009
2009-11-14 Portland, OR


Workshop on Resiliency for Petascale HPC
2009-10-13 Santa Fe, NM


Scientific Discovery through Advanced Computing Meeting
2009-06-14 San Diego, CA


Parallel and Computational Fluid Dynamics
2008-05-19 Lyon, France


Email
Phone 865-974-9985
Office Claxton

University of Tennessee
Computer Science Department
Innovative Computing Laboratory
1122 Volunteer Blvd, Claxton Building
Knoxville, Tennessee 37996-3450
Fax 865-974-8296