Why does my MEXed magma_zgetri code crash?

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)
Post Reply
avgn
Posts: 7
Joined: Sun Jul 29, 2018 10:03 pm

Why does my MEXed magma_zgetri code crash?

Post by avgn » Mon Aug 13, 2018 9:49 pm

I have followed the documentation closely and am stumped on why my MEXed C code calling magma doesn't work...

It is a simple code that:

i. gets a matrix from matlab
ii. converts from the matlab ordering to magmaDoubleComplex
iii. transfers the data to the GPU
iv. calls Magma to find the inverse
v. transfers the data back to the Host
vi converts from magmaDoubleComplex to matlab ordering.

I have posted it below. The code runs fine if the calls to magma_zgetrf and magma_zgetri are commented out. This shows that the matrix is successfully being transfer to and from the GPU. But when I uncomment those lines the whole thing crashes.

To test the code I compiled with: mex CC=gcc CFLAGS="\$CFLAGS -DADD_" LDFLAGS="-lmagma -lcudart -lcublas" magmaZinv.c

Then from matlab, I ran:

a=rand(3)+rand(3)*1i;
magmaZinv(a)

Code: Select all

    #include <mex.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <math.h>
    #include <stddef.h>
    #include <magma_v2.h>
    #include "magma_lapack.h"
    #include <cuda_runtime.h>
    
    void mat2magma(magmaDoubleComplex* p, double* pr, double* pi,int numElements)
    {
        int j=0;
        for(j=0;j<numElements;j++){
            p[j].x=pr[j];
            p[j].y=pi[j];
        }
    }
    
    void magma2mat(magmaDoubleComplex* p, double* pr, double* pi,int numElements)
    {
        int j=0;
        for(j=0;j<numElements;j++){
            pr[j]= p[j].x;
            pi[j]= p[j].y;
        }
    }
    
    /*gateway function*/
    void mexFunction( int nlhs, mxArray *plhs[],
            int nrhs, const mxArray *prhs[]) {
        
        /*initialize magma*/
        magma_init();
        magma_queue_t queue = NULL;
        magma_device_t dev;
        magma_getdevice(&dev);
        magma_queue_create(dev,&queue );
        
        magma_int_t m,ldwork,info;
        magma_int_t *piv;
        magmaDoubleComplex *a,*da,*dwork;
        
        /* Matlab -> Host */
        m=mxGetM(prhs[0]);
        piv=(magma_int_t*) malloc(m*sizeof(magma_int_t));
        magma_zmalloc_cpu(&a,m*m);
        mat2magma(a,mxGetPr(prhs[0]),mxGetPi(prhs[0]),m*m);
        ldwork = m*magma_get_zgetri_nb(m);
        
        /* Host -> GPU */
        magma_zmalloc(&dwork,ldwork);
        magma_zmalloc(&da,m*m);
        magma_zsetmatrix(m,m,a,m,da,m,queue);
        
        /*LU and Inverse */
        magma_zgetrf_gpu(m,m,da,m,piv,&info);
        magma_zgetri_gpu(m,da,m,piv,dwork,ldwork,&info);
         
        /*GPU -> Host */
        magma_zgetmatrix(m,m,da,m,a,m,queue);
        
        /*Host -> Matlab*/
        plhs[0] = mxCreateDoubleMatrix(m,m,mxCOMPLEX);
        magma2mat(a,mxGetPr(plhs[0]),mxGetPi(plhs[0]),m*m);
        free(a);
        free(piv);
        magma_free(dwork);
        magma_free(da);
        magma_queue_destroy(queue);
        magma_finalize();
    }
Last edited by mgates3 on Mon Aug 13, 2018 10:29 pm, edited 1 time in total.
Reason: Use [code] ... [/code] to preserve whitespace

mgates3
Posts: 893
Joined: Fri Jan 06, 2012 2:13 pm

Re: Why does my MEXed magma_zgetri code crash?

Post by mgates3 » Mon Aug 13, 2018 10:54 pm

Not sure what the problem is. Matlab uses MAGMA via the parallel computing toolbox, so if you have that there's no need to write mex files.

What size is magma_int_t set to when you compiled MAGMA? Matlab links with the ILP64 variant of Intel MKL, meaning magma_int_t needs to be int64. See the make.inc.mkl-icc-ilp64, which adds various ilp64 flags.

A couple tips (probably unrelated to the crashing):
- You should check info after the magma_zgetrf_gpu and magma_zgetri_gpu calls.
- I recommend keeping lda separate from m. It often improves performance to round it up to something like a multiple of 32 or a multiple of 32 + 8.
- You shouldn't need the magma_lapack.h header, nor the -DADD_ flag that it requires, since you aren't calling any blasf77_* or lapackf77_* functions.

-mark

avgn
Posts: 7
Joined: Sun Jul 29, 2018 10:03 pm

Re: Why does my MEXed magma_zgetri code crash?

Post by avgn » Tue Aug 14, 2018 7:35 pm

Thanks Mark, I've confirmed that other magma functions like magma_zgemm do work as expected. Regarding your comment about magma_int_t. I've checked that both magma_int_t and int (inside the mex file) are 4 bytes. When I add the -DMKL_ILP64, magma_int_t becomes 8 bytes. However, both cases crash.

avgn
Posts: 7
Joined: Sun Jul 29, 2018 10:03 pm

Re: Why does my MEXed magma_zgetri code crash?

Post by avgn » Tue Aug 14, 2018 9:53 pm

Also, I've confirmed that my standalone C code works

gcc -lmagma -lcudart Ccode.c -o Ccode.o

Code: Select all

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stddef.h>
#include <magma_v2.h>
#include <cuda_runtime.h>
#include <sys/time.h>
#include <time.h>

int main() {
    
    /*initialize magma*/
    magma_init();
    magma_queue_t queue = NULL;
    magma_device_t dev;
    magma_getdevice(&dev);
    magma_queue_create(dev,&queue );
    
    int m,ldwork,info;
    int *piv;
    magmaDoubleComplex *a,*da,*dwork;
    
    /* allocate and initialize a = magic(3)+magic(3)*1i; */
    m=3;
    piv=(int*) malloc(m*sizeof(int));
    ldwork = m*magma_get_zgetri_nb(m);
    magma_zmalloc_cpu(&a,m*m);
    a[0].x=8;a[0].y=8;
    a[1].x=3;a[1].y=3;
    a[2].x=4;a[2].y=4;
    a[3].x=1;a[3].y=1;
    a[4].x=5;a[4].y=5;
    a[5].x=9;a[5].y=9;
    a[6].x=6;a[6].y=6;
    a[7].x=7;a[7].y=7;
    a[8].x=2;a[8].y=2;
    
    /* Host -> GPU */
    magma_zmalloc(&dwork,ldwork);
    magma_zmalloc(&da,m*m);
    magma_zsetmatrix(m,m,a,m,da,m,queue);
    
    /*LU and Inverse */
    magma_zgetrf_gpu(m,m,da,m,piv,&info);
    magma_zgetri_gpu(m,da,m,piv,dwork,ldwork,&info);
    
    /*GPU -> Host */
    magma_zgetmatrix(m,m,da,m,a,m,queue);
    
    /* display inv(a) */
    for (int i=0;i<(m*m);i++){
        printf("%f +%fi\n",a[i].x,a[i].y);
    }
    
    /* free memory */
    free(a);
    free(piv);
    magma_free(dwork);
    magma_free(da);
    magma_queue_destroy(queue);
    magma_finalize();
    
    return 0;
}
Meanwhile, the MEXed code crashes, regardless of using int or magma_int_t or whether -DMKL_ILP64 is specified.

mgates3
Posts: 893
Joined: Fri Jan 06, 2012 2:13 pm

Re: Why does my MEXed magma_zgetri code crash?

Post by mgates3 » Wed Aug 15, 2018 12:02 am

Yes, with MKL_ilp64, magma_int_t should be 8 bytes. You recompiled both magma and your mex code with that setting? It’s what Matlab uses.

BTW, is there a particular reason you’re computing an inverse? To solve AX=B, or equivalently, X=A^{-1} B, It’s both faster and more accurate to use gesv, or equivalently, getrf & getrs.

-mark

mgates3
Posts: 893
Joined: Fri Jan 06, 2012 2:13 pm

Re: Why does my MEXed magma_zgetri code crash?

Post by mgates3 » Wed Aug 15, 2018 12:08 am

Also, if you use magma_*malloc_cpu, you should use magma_free_cpu, not just free. Particularly on Windows it matters. There’s also magma_imalloc_cpu for magma_int_t. But plain malloc & free are fine, or new & delete, are fine too.

avgn
Posts: 7
Joined: Sun Jul 29, 2018 10:03 pm

Re: Why does my MEXed magma_zgetri code crash?

Post by avgn » Sat Sep 08, 2018 10:08 pm

My sys admin has figured out why this code crashes. I'll post this here for anyone using Magma from Matlab.

I was using a version of gcc version that was incompatible with my version of Matlab. Even though Matlab will compile the code giving only a warning, using the incompatible version of gcc will cause the code to segfault when it runs. Downgrading to a compatible version of gcc as well as using the MKL_ilp64 flag fixed the problem. Thanks!

mgates3
Posts: 893
Joined: Fri Jan 06, 2012 2:13 pm

Re: Why does my MEXed magma_zgetri code crash?

Post by mgates3 » Mon Sep 10, 2018 12:03 pm

Thanks for the update.
-mark

Post Reply