I have run the code that you can find in paragraph 4.7.13 of the following manual:
https://developer.nvidia.com/sites/defa ... /mygpu.pdf
I only removed the LAPACK parts.
I launched it 5 times just to understand the order of magnitude of the elapsed time, and I get a range between 56.99 and 93.80 seconds.
I am using MAGMA 2.3.0 on a machine with two 8-core Intel Xeon E5-2630 v3 CPUs @ 2.40GHz and 2 NVIDIA K80 GPUs, with CUDA 9.0.
If I generate a symmetric random matrix of the same size in MATLAB and compute its eigenvalues and eigenvectors on my dual-core MacBook Air with a 1.6 GHz Intel Core i5, it takes a couple of minutes.
I was expecting magma to be way faster. What function can I use instead of magma_ssyevd_gpu?
Thank you in advance
What size is your problem?
Are you timing just the magma_ssyevd_gpu call, or are you timing the runtime of the entire test program?
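To make the distinction concrete, here is a minimal sketch of timing only the solver call, with setup (allocation, random matrix generation, host-to-device copies, library initialization) kept outside the timed region. The names wall_seconds, time_call and dummy_work are hypothetical helpers for illustration and are not part of MAGMA; dummy_work is only a stand-in for the place where the magma_ssyevd_gpu call would go.

```c
#include <time.h>

/* Wall-clock time in seconds (C11 timespec_get). */
static double wall_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Hypothetical callback type: the one operation we want to time. */
typedef void (*work_fn)(void *arg);

/* Time fn(arg) and nothing else: no allocation, no matrix
 * generation, no data transfer, no library init or finalize. */
static double time_call(work_fn fn, void *arg)
{
    double t0 = wall_seconds();
    fn(arg);
    return wall_seconds() - t0;
}

/* Stand-in workload (assumption): in the real program this would be
 * a thin wrapper whose body is just the magma_ssyevd_gpu call. */
static void dummy_work(void *arg)
{
    volatile long *counter = (volatile long *)arg;
    for (long i = 0; i < 1000000; ++i)
        (*counter)++;
}
```

Calling time_call(dummy_work, &counter) returns the elapsed seconds for that one call. In a real GPU program, if any device work could still be in flight when the call returns, synchronize before reading the clock; otherwise the measured time can be too short. Running the solver once before the timed run also helps, since the first call often pays one-time initialization costs.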
You can run MAGMA's testers in magma/testing to get both MAGMA and LAPACK timings. There are several versions to try (note these are single precision; use dsyevd for double precision):
Code:
./testing_ssyevd -n 2000:20000:2000 -JV --lapack --niter 5
./testing_ssyevd_gpu -n 2000:20000:2000 -JV --lapack --niter 5
./testing_ssyevdx_2stage -n 2000:20000:2000 -JV --lapack --niter 5
Here are some results for comparison, using Intel MKL for BLAS & LAPACK.
Code:
[mgates@b00 testing]$ ./testing_ssyevd -n 2000:20000:2000 -JV -l
% MAGMA 2.3.0 svn compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 9020, driver 10000. OpenMP threads 20. MKL 2018.0.1, MKL threads 20.
% device 0: Tesla K40c, 745.0 MHz clock, 11441.2 MiB memory, capability 3.5
% jobz = Vectors needed, uplo = Lower, ngpu = 1
%   N   CPU Time (sec)   GPU Time (sec)   |S-S_magma|   |A-USU^H|   |I-U^H U|
%============================================================================
 2000       0.2239           0.2327        4.88e-10       ---          ---      ok
 4000       0.8543           1.3268        9.15e-11       ---          ---      ok
 6000       4.2844           3.0909        2.71e-11       ---          ---      ok
 8000      14.1701           5.8308        6.10e-11       ---          ---      ok
10000      30.9939          11.7388        4.88e-11       ---          ---      ok
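To read these numbers as speedups, divide the CPU time by the GPU time for each N. The sketch below does that arithmetic on the five rows listed above; the function and array names are hypothetical, and the values are copied directly from the tester output.

```c
#include <stddef.h>

/* Timings copied from the testing_ssyevd output above (seconds). */
static const int    sizes[]    = {2000, 4000, 6000, 8000, 10000};
static const double cpu_time[] = {0.2239, 0.8543, 4.2844, 14.1701, 30.9939};
static const double gpu_time[] = {0.2327, 1.3268, 3.0909, 5.8308, 11.7388};
static const size_t n_rows = sizeof(sizes) / sizeof(sizes[0]);

/* Speedup of the MAGMA (GPU) run over the LAPACK (CPU) run.
 * A value above 1 means MAGMA was faster. */
static double speedup(double cpu_sec, double gpu_sec)
{
    return cpu_sec / gpu_sec;
}

/* Smallest listed N at which MAGMA beat LAPACK, or -1 if none did. */
static int crossover_size(void)
{
    for (size_t i = 0; i < n_rows; ++i)
        if (speedup(cpu_time[i], gpu_time[i]) > 1.0)
            return sizes[i];
    return -1;
}
```

On this particular machine the GPU version is slower than MKL for N = 2000 and N = 4000, pulls ahead at N = 6000, and is roughly 2.6 times faster at N = 10000. So how much MAGMA helps depends strongly on the problem size, which is why the size of your matrix matters for the comparison.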