I'm using the latest MAGMA, 1.4.1, and I noticed that in all the examples in testing_dgetrf_gpu the leading dimension is a multiple of 32, i.e. the matrix is allocated as LDDA x N, where LDDA is ((M + 31) / 32) * 32. When I call the function from my application with values of M that are not multiples of 32, it crashes (no error message, just a segmentation fault). In my code I simply set ldda to M. Is this a known limitation of dgetrf? I reviewed the function documentation and could not find any mention of such a restriction.
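
For reference, the padding used in the tester amounts to something like the sketch below. The name `roundup32` is only an illustrative helper I made up here, not a MAGMA function, and I am assuming plain integer division as in the tester's expression.

```c
#include <stddef.h>

/* Hypothetical helper (not part of MAGMA): round m up to the next
 * multiple of 32, mirroring the ((M + 31) / 32) * 32 expression
 * seen in testing_dgetrf_gpu. Integer division truncates, so adding
 * 31 first pushes any non-multiple up to the next multiple. */
static size_t roundup32(size_t m) {
    return ((m + 31) / 32) * 32;
}
```

If ldda were padded this way, the device buffer d_A would presumably have to hold ldda * n doubles rather than m * n.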

My wrapper function looks like:

/**
 * Computes and returns the LU decomposition of d_A
 */
inline void deviceDgetrf(double* d_A, size_t m, size_t n) {
    magma_int_t info;
    magma_int_t *ipiv;
    size_t min_mn = (m < n) ? m : n;
    magma_malloc_cpu((void**) &ipiv, min_mn);
    magma_int_t ldda = m;
    MAGMA_ERROR_CHECK("magma_dgetrf_gpu", magma_dgetrf_gpu(m, n, d_A, ldda, ipiv, &info));
    magma_free_cpu(ipiv);
}

TIA,

Best regards,

Giovanni