Can't get syevd_gpu to work


Post by ChrisM » Sun Dec 08, 2013 1:06 pm

As the title says, I'm having difficulty getting this function to work, specifically the double-precision variant magma_dsyevd_gpu(...). Below is a minimal program that demonstrates my approach, where generateSymmetricMatrixD(DIM) creates a column-major DIM-by-DIM symmetric matrix. Looking at the source code, I discovered that for small matrices (DIM <= 128) MAGMA delegates this function to CPU LAPACK, and up to that size it works fine. For larger dimensions, however, the program terminates during the call ("dsyevd_gpu call" gets printed, "dsyevd_gpu return" does not).

The complete code can be found here:
I'd be grateful for any hints regarding this problem.


Code:
#define HAVE_CUBLAS true

... includes ...

... helper functions ...

int main(int argc, char **argv) {
   const unsigned DIM = 272;

   double *matrixH = generateSymmetricMatrixD(DIM);
   double *matrixD = NULL;
   cudaError_t err = cudaMalloc((void**) &matrixD, DIM * DIM * sizeof(double));
   checkCudaError(err, "POS1");
   err = cudaMemcpy(matrixD, matrixH, DIM * DIM * sizeof(double), cudaMemcpyHostToDevice);
   checkCudaError(err, "POS3");

   double *w = new double[DIM];
   int ldwa = DIM;
   double *wa = new double[ldwa];
   int lwork = max(2 * DIM + DIM * magma_get_dsytrd_nb(DIM), 1 + 6 * DIM + 2 * int(pow(DIM, 2)));
   double *work = new double[lwork];
   int liwork = 3 + 5 * DIM;
   int *iwork = new int[liwork];
   int *info = new int;

   cout << "dsyevd_gpu call" << endl;
   magma_dsyevd_gpu(MagmaVec, MagmaUpper, DIM, matrixD, DIM, w, wa, ldwa, work,
         lwork, iwork, liwork, info);
   cout << "dsyevd_gpu return" << endl;



Re: Can't get syevd_gpu to work

Post by mgates3 » Wed Dec 11, 2013 1:13 pm

wa is an LDWA * N matrix. You only allocated LDWA space.

Code:
    WA      (workspace) DOUBLE PRECISION array, dimension (LDWA, N)

    LDWA    (input) INTEGER
            The leading dimension of the array WA.  LDWA >= max(1,N).

That is, change
double *wa = new double[ ldwa ];
to
double *wa = new double[ ldwa * DIM ];
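To keep the three workspace sizes in one place, something like the following sketch may help. The struct and function names are hypothetical (they are not part of MAGMA), the formulas are the ones used in this thread for jobz = MagmaVec, and nb is assumed to be the value the original program obtains from magma_get_dsytrd_nb(DIM).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a MAGMA API): element counts for the host
// workspaces of a dsyevd_gpu call, using the formulas from this thread.
struct SyevdWorkspace {
    std::size_t wa;      // elements in WA, an LDWA-by-N array
    std::size_t lwork;   // elements in WORK
    std::size_t liwork;  // elements in IWORK
};

// n:    order of the matrix
// ldwa: leading dimension of WA, must be >= max(1, n)
// nb:   block size; assumed to come from magma_get_dsytrd_nb(n)
inline SyevdWorkspace syevd_workspace(std::size_t n, std::size_t ldwa,
                                      std::size_t nb) {
    assert(ldwa >= std::max<std::size_t>(1, n));
    SyevdWorkspace ws;
    ws.wa     = ldwa * n;  // the full 2D array, not just one column
    ws.lwork  = std::max(2 * n + n * nb, 1 + 6 * n + 2 * n * n);
    ws.liwork = 3 + 5 * n;
    return ws;
}
```

The host arrays would then be allocated as new double[ws.wa], new double[ws.lwork] and new int[ws.liwork].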

Minor comments:
For improved GPU efficiency, I recommend using an LDDA (on the GPU) that is a multiple of 32. This is our common practice in all the magma testers. That is, set LDDA = roundup( DIM, 32 ) = (( DIM + 31 )/32)*32 and allocate an LDDA x DIM array. Set the top DIM rows of that array to your DIM x DIM matrix. The magma_dsetmatrix and magma_dgetmatrix routines (or cublas{Set, Get}Matrix, or cudaMemcpy2D) handle data transfers with an lda (also known as pitch). This allows all the columns to be aligned for better memory access on the GPU.
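As a rough illustration of the padded layout described above, here is a host-only sketch. The function names roundup and pad_columns are hypothetical, and the copy is done on the CPU purely to show where each element lands; on the GPU the transfer itself would go through magma_dsetmatrix or cudaMemcpy2D as mentioned.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round n up to the next multiple of `multiple`,
// i.e. ((n + multiple - 1) / multiple) * multiple.
inline std::size_t roundup(std::size_t n, std::size_t multiple) {
    assert(multiple > 0);
    return ((n + multiple - 1) / multiple) * multiple;
}

// Hypothetical host-side illustration: copy a column-major n-by-n matrix
// with leading dimension lda into a zero-filled ldda-by-n buffer.
// Element (i, j) moves from a[i + j*lda] to padded[i + j*ldda].
inline std::vector<double> pad_columns(const double* a, std::size_t n,
                                       std::size_t lda, std::size_t ldda) {
    assert(lda >= n && ldda >= n);
    std::vector<double> padded(ldda * n, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < n; ++i) {
            padded[i + j * ldda] = a[i + j * lda];
        }
    }
    return padded;
}
```

For DIM = 272 this gives LDDA = roundup(272, 32) = 288, so the device allocation would hold 288 * 272 doubles and LDDA would be passed as the leading dimension of the device matrix.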

On the CPU, using LDA=N is usually fine. In fact, on the CPU one should avoid an LDA that is a multiple of the page size (often 4 KiB) to avoid TLB issues.

Not that it matters much, but using DIM*DIM is simpler and faster than int( pow( DIM, 2 )).

There's nothing wrong with allocating an int using new for info, but we always just pass a pointer to a local int, as that's simpler, e.g.,
Code:
    int info;
    magma_dsyevd_gpu( ..., &info );
    if ( info != 0 ) {
        fprintf( stderr, "error %d\n", info );
    }
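If a more descriptive message is wanted, the check above could be extended along these lines. This is only a sketch: describe_info is a hypothetical helper, and it assumes the routine follows the usual LAPACK convention for info (0 on success, -i when the i-th argument is illegal, positive when the computation itself failed).

```cpp
#include <string>

// Hypothetical helper (not a MAGMA API): turn an info value into a
// readable message, assuming the usual LAPACK meaning of its sign.
inline std::string describe_info(int info) {
    if (info == 0) {
        return "success";
    }
    if (info < 0) {
        // Negative values identify the offending argument by position.
        return "argument " + std::to_string(-info) + " had an illegal value";
    }
    // Positive values report a failure inside the computation.
    return "computation failed (info = " + std::to_string(info) + ")";
}
```

It would be used as fprintf( stderr, "%s\n", describe_info( info ).c_str() ); in place of the bare error number.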

Re: Can't get syevd_gpu to work

Post by ChrisM » Fri Dec 13, 2013 6:51 am

That solved my problem, thanks a lot. I was confused by dimension (LDWA, N) because there only exist 1D-Arrays on the GPU, and I decided to 'go with the bigger one, just in case'. Maybe the documentation could be updated to reflect that.
Also thanks for the performance hints.