I'm using magma_cgetri_gpu() to invert a Kirchhoff matrix (Laplacian matrix) obtained from a complex graph. The off-diagonal entries of this matrix are either -1 or 0, and the diagonal entries are non-negative. Its determinant is nonzero. Specifically, it is an 11,174 x 11,174 square matrix with only 23,409 negative entries, i.e. it is a sparse matrix.

The problem is that when I run magma_cgetri_gpu() to invert my Kirchhoff matrix, the returned error code is zero, but the result is a zero matrix, i.e. a matrix where all entries are zero.

My first thought was that the matrix was too big to fit in device memory. However, TESTING_MALLOC, TESTING_HOSTALLOC, and TESTING_DEVALLOC produce no errors. My second thought was that the matrix had no inverse. However, its determinant is nonzero, and after magma_cgetri_gpu() runs, the info variable that stores the error code is zero. My third thought was that magma_cgetrf2_gpu() fails to factor the matrix or produces a zero matrix. However, it returns a nonzero matrix.
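As a rough cross-check of the memory question, the device footprint of the two device buffers can be estimated by hand. The sketch below is only an illustration: the helper names are hypothetical, the 8 bytes per entry is the size of a single-precision complex value, and the block size `nb` is an assumed placeholder, not the value actually returned by magma_get_cgetri_nb().

```c
#include <stddef.h>

/* Round n up to the next multiple of 32, the same padding used for ldda. */
static size_t padded_ld(size_t n)
{
    return ((n + 31) / 32) * 32;
}

/* Estimated bytes of device memory for d_A (ldda x n) plus dwork (n x nb).
 * Each single-precision complex entry takes 8 bytes.
 * nb is an ASSUMED block size, not the real magma_get_cgetri_nb() result. */
static size_t device_bytes(size_t n, size_t nb)
{
    const size_t elem = 8;
    return padded_ld(n) * n * elem + n * nb * elem;
}
```

For N = 11,174 and an assumed nb = 64, this gives 1,006,911,488 bytes, roughly 1 GB, which is well below the 6 GB available on the card.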

I was able to successfully invert smaller Kirchhoff matrices, with several hundred rows and columns, using magma_cgetri_gpu(). The problem appears when I try to invert bigger matrices, with thousands of rows and columns. Could the problem be memory-related?
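One inexpensive way to tell a valid inverse from a silently wrong one on the host is to test a single column: for an exact inverse X, the product A * X(:,j) equals the j-th unit vector. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the matrices are stored column-major as real floats (the Kirchhoff entries have no imaginary part), and it is not part of MAGMA.

```c
#include <stddef.h>

/* Largest absolute entry of A * X(:,j) - e_j for column-major real matrices.
 * Returns 0 for an exact inverse and exactly 1 when X is all zeros. */
static float column_residual(size_t n, const float *A, size_t lda,
                             const float *X, size_t ldx, size_t j)
{
    float worst = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float acc = 0.0f;
        for (size_t k = 0; k < n; k++)
            acc += A[i + k * lda] * X[k + j * ldx];
        /* Compare row i of the product against the unit vector e_j. */
        float err = acc - (i == j ? 1.0f : 0.0f);
        if (err < 0.0f)
            err = -err;
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

Each call costs O(N^2) operations, so checking a handful of columns stays cheap even for matrices with thousands of rows, and it can be run on both the small cases that work and the large case that fails.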

I'm using an old version of MAGMA, 1.1.0, mostly because the server where I'm running my code has CUDA 3.2. The server runs CentOS x86_64 with GCC 4.1.2. The GPU device is a Tesla C2070 with compute capability 2.0 and 6 GB of GDDR5 device memory.

Below is my code. Basically, I adapted the example "testing_cgetri_gpu.cpp" from the "testing" folder of the MAGMA 1.1.0 sources.

Thank you in advance for any help you can provide.

Albert.

```
TESTING_CUDA_INIT();

cuFloatComplex *h_A, *h_R;
cuFloatComplex *d_A, *dwork;
magma_int_t N = 0, n2, lda, ldda;
magma_int_t i, info;
cuFloatComplex *work;
cuFloatComplex tmp;
magma_int_t *ipiv;
magma_int_t lwork, ldwork;

/* Query for LAPACK workspace size */
N = (magma_int_t) igraph_vcount(g); // N is the number of rows of the Kirchhoff matrix or, equivalently, the number of nodes of the graph
lda = N;
work = &tmp;
lwork = -1;
lapackf77_cgetri(&N, h_A, &lda, ipiv, work, &lwork, &info);
if (info != 0)
    printf("An error occurred in lapackf77_cgetri, info=%d\n", info);
lwork = (int)MAGMA_C_REAL(*work);

/* Query for MAGMA workspace size */
ldwork = N * magma_get_cgetri_nb(N);

/* Allocate memory */
n2 = N * N;
ldda = ((N + 31) / 32) * 32;
TESTING_MALLOC(ipiv, magma_int_t, N);
TESTING_MALLOC(work, cuFloatComplex, lwork);
TESTING_MALLOC(h_A, cuFloatComplex, n2);
TESTING_HOSTALLOC(h_R, cuFloatComplex, n2);
TESTING_DEVALLOC(d_A, cuFloatComplex, ldda * N);
TESTING_DEVALLOC(dwork, cuFloatComplex, ldwork);

N = (magma_int_t) igraph_vcount(g);
lda = N;
n2 = lda * N;
ldda = ((N + 31) / 32) * 32;

/* Initialize the NxN Kirchhoff matrix "matriz" from the complex graph */
/* The NxN Kirchhoff matrix is in row-major order */
/* (code to initialize the matrix) */

// Convert the NxN Kirchhoff matrix into a vector in column-major order
c = 0;
for (j = 0; j < N; j++)
    for (i = 0; i < N; i++) {
        h_A[c] = make_cuFloatComplex((float) matriz[i][j], 0);
        c += 1;
    }

/* Factor the matrix. */
cublasSetMatrix(N, N, sizeof (cuFloatComplex), h_A, lda, d_A, ldda);
magma_cgetrf2_gpu(N, N, d_A, ldda, ipiv, &info);
cublasGetMatrix(N, N, sizeof (cuFloatComplex), d_A, ldda, h_A, lda);

/* ====================================================================
   Performs operation using MAGMA
   =================================================================== */

/* Invert the matrix */
magma_cgetri_gpu(N, d_A, ldda, ipiv, dwork, ldwork, &info);
if (info != 0)
    printf("An error occurred in magma_cgetri, info=%d\n", info);
cublasGetMatrix(N, N, sizeof (cuFloatComplex), d_A, ldda, h_R, lda);

/* Use the inverted Kirchhoff matrix */
/* (code to use the matrix) */

/* Memory clean up */
TESTING_FREE(ipiv);
TESTING_FREE(work);
TESTING_FREE(h_A);
TESTING_HOSTFREE(h_R);
TESTING_DEVFREE(d_A);
TESTING_DEVFREE(dwork);
igraph_matrix_destroy(&matriz);
igraph_vector_destroy(&grados);

/* Shutdown */
TESTING_CUDA_FINALIZE();
```
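For readers less familiar with the storage convention, the row-major to column-major conversion loop in the code can be written with an explicit index formula instead of a running counter. This is a minimal sketch with a hypothetical helper name, using plain floats and a flat source array for brevity; with a leading dimension equal to the number of rows it produces the same layout as the loop over `c`.

```c
#include <stddef.h>

/* Copy a rows x cols row-major array into column-major storage with
 * leading dimension ld (ld >= rows). Element (i, j) lands at i + j * ld. */
static void row_to_col_major(size_t rows, size_t cols,
                             const float *src, float *dst, size_t ld)
{
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            dst[i + j * ld] = src[i * cols + j];
}
```

The explicit `i + j * ld` form also matches how cublasSetMatrix interprets `lda` and `ldda`, which makes it easier to see that the host array uses a leading dimension of N while the device array uses the padded value.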