The spotrf_batched routine fails for batch sizes above 524280, and the dpotrf_batched routine fails for batch sizes above 262140. I confirmed this with the test suite using the following commands; the output for each failing batch size and for the largest passing one is shown after them:
./testing_spotrf_batched --batch 524281 -n 2 --check --matrix rand_dominant
./testing_dpotrf_batched --batch 262141 -n 2 --check --matrix rand_dominant
% MAGMA 2.5.0 svn compiled for CUDA capability >= 5.0, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 10000, driver 10010. OpenMP threads 4.
% device 0: GeForce 940M, 1176.0 MHz clock, 2004.5 MiB memory, capability 5.0
% Sun Jul 14 09:15:41 2019
% Usage: ./testing_spotrf_batched [options] [-h|--help]
% BatchCount N CPU Gflop/s (ms) GPU Gflop/s (ms) ||R_magma - R_lapack||_F / ||R_lapack||_F
%===================================================================================================
524281 2 0.01 ( 207.07) 119.51 ( 0.02) 7.76e-01 failed
% MAGMA 2.5.0 svn compiled for CUDA capability >= 5.0, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 10000, driver 10010. OpenMP threads 4.
% device 0: GeForce 940M, 1176.0 MHz clock, 2004.5 MiB memory, capability 5.0
% Sun Jul 14 09:15:27 2019
% Usage: ./testing_spotrf_batched [options] [-h|--help]
% BatchCount N CPU Gflop/s (ms) GPU Gflop/s (ms) ||R_magma - R_lapack||_F / ||R_lapack||_F
%===================================================================================================
524280 2 0.01 ( 207.92) 0.42 ( 6.24) 1.49e-07 ok
% MAGMA 2.5.0 svn compiled for CUDA capability >= 5.0, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 10000, driver 10010. OpenMP threads 4.
% device 0: GeForce 940M, 1176.0 MHz clock, 2004.5 MiB memory, capability 5.0
% Sun Jul 14 09:25:34 2019
% Usage: ./testing_dpotrf_batched [options] [-h|--help]
% BatchCount N CPU Gflop/s (ms) GPU Gflop/s (ms) ||R_magma - R_lapack||_F / ||R_lapack||_F
%===================================================================================================
262141 2 0.01 ( 98.74) 65.45 ( 0.02) 7.76e-01 failed
% MAGMA 2.5.0 svn compiled for CUDA capability >= 5.0, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 10000, driver 10010. OpenMP threads 4.
% device 0: GeForce 940M, 1176.0 MHz clock, 2004.5 MiB memory, capability 5.0
% Sun Jul 14 09:25:28 2019
% Usage: ./testing_dpotrf_batched [options] [-h|--help]
% BatchCount N CPU Gflop/s (ms) GPU Gflop/s (ms) ||R_magma - R_lapack||_F / ||R_lapack||_F
%===================================================================================================
262140 2 0.01 ( 103.36) 0.13 ( 9.93) 2.71e-16 ok
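
A note on the numbers: both thresholds are exact multiples of 65535, which is the CUDA limit on the y and z grid dimensions (524280 = 8 × 65535 and 262140 = 4 × 65535). The failing runs also report only 0.02 ms of GPU time, which would be consistent with a kernel launch that never executed. This is only a hypothesis about the cause and is not confirmed against the MAGMA source. As a possible workaround until the issue is resolved, a large batch could be split into chunks no larger than the last passing size, with one call to the routine per chunk. The Python sketch below only checks the arithmetic and computes chunk boundaries; `matrices_per_limit` and `split_batch` are hypothetical helper names, not MAGMA functions.

```python
# Largest batch sizes that passed in the runs above.
# These are observed values, not documented MAGMA limits.
MAX_OK_BATCH = {"spotrf_batched": 524280, "dpotrf_batched": 262140}

# CUDA caps the y and z grid dimensions at 65535 blocks.
CUDA_MAX_GRID_YZ = 65535


def matrices_per_limit(max_ok_batch):
    """Return (quotient, remainder) of an observed limit divided by 65535."""
    return divmod(max_ok_batch, CUDA_MAX_GRID_YZ)


def split_batch(batch_count, max_chunk):
    """Split the range [0, batch_count) into (offset, size) chunks.

    Each chunk holds at most max_chunk matrices, so every chunk stays at or
    below the largest batch size that was observed to pass.
    """
    if batch_count < 0 or max_chunk <= 0:
        raise ValueError("batch_count must be >= 0 and max_chunk must be > 0")
    return [
        (offset, min(max_chunk, batch_count - offset))
        for offset in range(0, batch_count, max_chunk)
    ]


if __name__ == "__main__":
    for routine, limit in MAX_OK_BATCH.items():
        quotient, remainder = matrices_per_limit(limit)
        print(routine, limit, "=", quotient, "x 65535 +", remainder)
        # The failing batch size from the report is limit + 1.
        print("  chunks for", limit + 1, "->", split_batch(limit + 1, limit))
```

In an actual caller, each `(offset, size)` pair would correspond to one call to the batched routine, with the pointer array and info array advanced by `offset` and the batch count set to `size`.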