Open discussion for MAGMA
I'm new to MAGMA and am curious why there are no packed storage options (for example, a dpptrf packed Cholesky factorization).
Is this just something that has yet to be implemented, or are algorithms for these structures not well suited to GPUs?
The non-packed routines work great so far; I'm just easily hitting the limit of my 1 GB of GPU memory.
Thanks for any insight!
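Packed storage yields poor performance, even on CPUs, because the complicated indexing prevents efficient use of cache and memory bandwidth. Just compare the performance on the CPU, where the packed routine (spptrf) runs roughly 19 times slower than the full-storage one (spotrf) for N = 4000: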
./testing_spotrf -N 4000 -l -L
    N   CPU GFlop/s (sec)   GPU GFlop/s (sec)   ||R_magma - R_lapack||_F / ||R_lapack||_F
 4000     26.55 (  0.80)       ---  (  ---)       ---    # spotrf
 4000      1.38 ( 15.47)       ---  (  ---)       ---    # spptrf (packed)
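To make the indexing issue concrete, here is a minimal Python sketch of the lower-triangular packed layout used by LAPACK's packed routines (column-major, written 0-based here). The helper names `packed_index_lower`, `pack_lower`, and `column_starts` are hypothetical and only for illustration; they are not part of MAGMA or LAPACK.

```python
def packed_index_lower(i, j, n):
    """0-based offset of A[i][j] (i >= j) in lower, column-major packed storage."""
    if not (0 <= j <= i < n):
        raise IndexError("need 0 <= j <= i < n for the lower triangle")
    # Column j begins after columns 0..j-1, whose lengths are n, n-1, ..., n-j+1.
    return i + j * (2 * n - j - 1) // 2


def pack_lower(a):
    """Copy the lower triangle of a square list-of-lists matrix into a packed list."""
    n = len(a)
    ap = [0.0] * (n * (n + 1) // 2)
    for j in range(n):
        for i in range(j, n):
            ap[packed_index_lower(i, j, n)] = a[i][j]
    return ap


def column_starts(n):
    """Offset of each diagonal element, i.e. where each packed column begins."""
    return [packed_index_lower(j, j, n) for j in range(n)]


if __name__ == "__main__":
    a = [[1, 0, 0],
         [2, 3, 0],
         [4, 5, 6]]
    print(pack_lower(a))      # [1, 2, 4, 3, 5, 6]
    print(column_starts(4))   # [0, 4, 7, 9]
```

For n = 4 the columns start at offsets 0, 4, 7, and 9, so the distance between consecutive columns shrinks by one each time. There is no single leading dimension, which makes it awkward to hand sub-blocks of a packed matrix to blocked matrix-matrix kernels.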
An alternative is RFP (rectangular full packed) format. This works better than packed storage on CPUs. We haven't explored it on GPUs. See: http://www.netlib.org/lapack/lawnspdf/lawn199.pdf
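Both packed and RFP formats store n(n+1)/2 elements instead of n^2. As a rough sketch of what that means for the 1 GB card mentioned in the question, the Python below estimates the largest matrix order that fits. It assumes the budget is exactly 2**30 bytes, double precision, and that nothing else (workspace, other allocations) lives on the device; the function names are hypothetical.

```python
import math


def full_bytes(n, itemsize=8):
    """Bytes for an n x n matrix in full storage."""
    return n * n * itemsize


def triangular_bytes(n, itemsize=8):
    """Bytes for n(n+1)/2 elements, the element count of packed and RFP storage."""
    return n * (n + 1) // 2 * itemsize


def max_order(budget_bytes, itemsize=8, triangular=False):
    """Largest n whose matrix fits in budget_bytes (an idealized estimate)."""
    elems = budget_bytes // itemsize
    if triangular:
        # Largest n with n(n+1)/2 <= elems, from the quadratic formula.
        return (math.isqrt(8 * elems + 1) - 1) // 2
    return math.isqrt(elems)


if __name__ == "__main__":
    budget = 2 ** 30  # assumed: the whole 1 GiB is available
    print(max_order(budget))                   # 11585
    print(max_order(budget, triangular=True))  # 16383
```

Under these assumptions the triangular formats raise the largest double-precision order from 11585 to 16383, about a factor of sqrt(2). In practice the usable limit is lower, since the device also holds workspace and other data.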