packed storage

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)

Post by ryanjparker » Tue Apr 23, 2013 1:14 am

I'm new to MAGMA and am curious why there are no packed storage options (for example, a dpptrf packed Cholesky factorization).

Is this just something yet to be implemented, or are algorithms for these structures not well suited to GPUs?

The non-packed routines work great so far... I'm just quickly running into the 1 GB memory limit of my GPU.

Thanks for any insight!

Re: packed storage

Post by mgates3 » Tue Apr 23, 2013 8:03 pm

Packed storage yields poor performance, even on CPUs, because the complicated indexing prevents efficient use of cache and memory bandwidth. Just compare the performance on the CPU:

Code:
./testing_spotrf -N 4000 -l -L
  N     CPU GFlop/s (sec)   GPU GFlop/s (sec)   ||R_magma - R_lapack||_F / ||R_lapack||_F
 4000     26.55 (   0.80)       --- (  ---)  ---     # spotrf
 4000      1.38 (  15.47)       --- (  ---)  ---     # spptrf (packed)
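
To make the "complicated indexing" concrete, here is a minimal sketch of how an element is located in packed versus full storage. It assumes LAPACK's lower ('L') packed column-major layout with 0-based indices; the helper names (packed_size, packed_lower_index, packed_column_stride, full_index) are hypothetical and are not part of MAGMA or LAPACK.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only; not a MAGMA or LAPACK API. */

/* Elements stored for an n-by-n symmetric matrix in packed form. */
static size_t packed_size(size_t n)
{
    return n * (n + 1) / 2;
}

/* Offset of A(i,j), with i >= j, in lower packed column-major storage.
 * Column j holds only rows j..n-1, so its start depends on j quadratically. */
static size_t packed_lower_index(size_t n, size_t i, size_t j)
{
    assert(j <= i && i < n);
    return i + j * (2 * n - j - 1) / 2;
}

/* Distance from the start of column j to the start of column j+1.
 * It shrinks with j, so there is no single leading dimension. */
static size_t packed_column_stride(size_t n, size_t j)
{
    assert(j < n);
    return n - j;
}

/* Offset of A(i,j) in full column-major storage with leading dimension lda.
 * The column stride is the constant lda, which blocked kernels rely on. */
static size_t full_index(size_t lda, size_t i, size_t j)
{
    return i + j * lda;
}
```

In full storage every column is lda elements apart, so a submatrix can be handed to a blocked matrix-matrix kernel with just a pointer and lda. In packed storage the column stride changes from column to column, which, as far as I can tell, is why the packed routines end up doing vector-level work instead. The memory saving is real but at most a factor of two: for N = 4000 in single precision, packed_size(4000) = 8,002,000 elements (about 32 MB) versus 16,000,000 elements (64 MB) for the full matrix.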

An alternative is RFP (rectangular full packed) format. This works better than packed storage on CPUs. We haven't explored it on GPUs. See:
