I'm using magma_cgetri_gpu() to invert a Kirchhoff matrix (Laplacian matrix) obtained from a complex graph. The off-diagonal entries of this matrix are either -1 or 0, and the diagonal entries are non-negative. Its determinant is non-zero. Specifically, it is an 11,174x11,174 square matrix with only 23,409 negative entries, i.e. it is a sparse matrix.
The problem is: when I run magma_cgetri_gpu() to invert my Kirchhoff matrix, the error code is zero but magma_cgetri_gpu() returns a zero matrix, i.e. a matrix where all entries are zero.
My first thought was that the matrix was too big to fit in device memory. However, TESTING_MALLOC, TESTING_HOSTALLOC, and TESTING_DEVALLOC produce no errors. My second thought was that the matrix had no inverse. However, its determinant is non-zero, and after magma_cgetri_gpu() runs, the info variable that stores the error code is equal to zero. My third thought was that magma_cgetrf2_gpu() fails to factor the matrix or produces a zero matrix. However, it returns a non-zero matrix.
I was able to successfully invert smaller Kirchhoff matrices, with several hundred rows and columns, using magma_cgetri_gpu(). The problem appears when I try to invert bigger matrices, with thousands of rows and columns. Could the problem be memory-related?
I'm using an old version of Magma, 1.1.0, mostly because the server where I'm running my code has CUDA 3.2. The server runs CentOS x86_64 with GCC 4.1.2. The GPU device is a Tesla C2070 with compute capability 2.0 and 6 GB of GDDR5 device memory.
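For reference, here is a rough estimate of how much device memory the allocations in the code below request. It is only a sketch: it assumes sizeof(cuFloatComplex) is 8 bytes (two 4-byte floats), and the helper names and the block size nb = 64 are hypothetical, since the value magma_get_cgetri_nb() actually returns for this N is not known here.

```cpp
#include <cassert>
#include <cstddef>

// Round n up to the next multiple of 32, mirroring how ldda is computed
// in the code below: ldda = ((N + 31) / 32) * 32.
static std::size_t round_up_32(std::size_t n) {
    return ((n + 31) / 32) * 32;
}

// Bytes requested for the padded device matrix d_A (ldda * N elements).
// Assumption: each single-precision complex element takes 8 bytes.
static std::size_t device_matrix_bytes(std::size_t n) {
    const std::size_t elem_bytes = 8;
    return round_up_32(n) * n * elem_bytes;
}

// Bytes requested for the device workspace dwork (N * nb elements).
// nb is a hypothetical block size, not a value queried from MAGMA.
static std::size_t workspace_bytes(std::size_t n, std::size_t nb) {
    assert(nb > 0);
    const std::size_t elem_bytes = 8;
    return n * nb * elem_bytes;
}
```

Under those assumptions, device_matrix_bytes(11174) gives 1,001,190,400 bytes and workspace_bytes(11174, 64) gives 5,721,088 bytes, so about 1 GB in total, which is well below the 6 GB on the card.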
Below is my code. Basically, I adapted the example "testing_cgetri_gpu.cpp" from the "testing" folder of the Magma 1.1.0 sources.
Thank you in advance for any help you can provide.
Albert.
TESTING_CUDA_INIT();
cuFloatComplex *h_A, *h_R;
cuFloatComplex *d_A, *dwork;
magma_int_t N = 0, n2, lda, ldda;
magma_int_t i, info;
cuFloatComplex *work;
cuFloatComplex tmp;
magma_int_t *ipiv;
magma_int_t lwork, ldwork;
/* query for Lapack workspace size */
N = (magma_int_t) igraph_vcount(g); // N is the number of rows of the Kirchhoff matrix or, equivalently, the number of nodes of the graph
lda = N;
work = &tmp;
lwork = -1;
lapackf77_cgetri(&N, h_A, &lda, ipiv, work, &lwork, &info);
if (info != 0)
    printf("An error occurred in the lapackf77_cgetri workspace query, info=%d\n", info);
lwork = (int)MAGMA_C_REAL(*work);
/* query for Magma workspace size */
ldwork = N * magma_get_cgetri_nb(N);
/* Allocate memory */
n2 = N * N;
ldda = ((N + 31) / 32) * 32;
TESTING_MALLOC(ipiv, magma_int_t, N);
TESTING_MALLOC(work, cuFloatComplex, lwork);
TESTING_MALLOC(h_A, cuFloatComplex, n2);
TESTING_HOSTALLOC(h_R, cuFloatComplex, n2);
TESTING_DEVALLOC(d_A, cuFloatComplex, ldda * N);
TESTING_DEVALLOC(dwork, cuFloatComplex, ldwork);
N = (magma_int_t) igraph_vcount(g);
lda = N;
n2 = lda*N;
ldda = ((N + 31) / 32)*32;
/* Initialize the NxN Kirchhoff matrix "matriz" from the complex graph */
/* The NxN Kirchhoff matrix is in row-major order */
/* (code to initialize the matrix) */
// Convert the NxN Kirchhoff matrix into a vector in column-major order
c = 0;
for (j = 0; j < N; j++) {
    for (i = 0; i < N; i++) {
        /* column j, row i: consecutive entries of a column are contiguous */
        h_A[c] = make_cuFloatComplex((float) matriz[i][j], 0);
        c += 1;
    }
}
/* Factor the matrix. */
cublasSetMatrix(N, N, sizeof (cuFloatComplex), h_A, lda, d_A, ldda);
magma_cgetrf2_gpu(N, N, d_A, ldda, ipiv, &info);
if (info != 0)
    printf("An error occurred in magma_cgetrf2_gpu, info=%d\n", info);
cublasGetMatrix(N, N, sizeof (cuFloatComplex), d_A, ldda, h_A, lda);
/* ====================================================================
Performs operation using MAGMA
=================================================================== */
/* Invert the matrix */
magma_cgetri_gpu(N, d_A, ldda, ipiv, dwork, ldwork, &info);
if (info != 0)
    printf("An error occurred in magma_cgetri_gpu, info=%d\n", info);
cublasGetMatrix(N, N, sizeof (cuFloatComplex), d_A, ldda, h_R, lda);
/* Use the inverted Kirchhoff matrix */
/* (code to use the matrix) */
/* Memory clean up */
TESTING_FREE(ipiv);
TESTING_FREE(work);
TESTING_FREE(h_A);
TESTING_HOSTFREE(h_R);
TESTING_DEVFREE(d_A);
TESTING_DEVFREE(dwork);
igraph_matrix_destroy(&matriz);
igraph_vector_destroy(&grados);
/* Shutdown */
TESTING_CUDA_FINALIZE();
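The "all entries are zero" observation, and the result on the smaller matrices that do invert, can be checked on the CPU roughly as follows. This is a minimal sketch and not part of the code above: the names cfloat, is_all_zero and inverse_residual are hypothetical, it uses std::complex<float> as a stand-in for cuFloatComplex, it assumes lda == N (no padding), and the O(N^3) loop is only practical for small test cases.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<float> cfloat;

// True if every entry of the matrix is exactly zero.
bool is_all_zero(const std::vector<cfloat>& x) {
    for (std::size_t k = 0; k < x.size(); ++k)
        if (x[k] != cfloat(0.0f, 0.0f)) return false;
    return true;
}

// Largest |(A*X - I)(i,j)| for n x n column-major matrices A and X.
// A value near zero suggests X is a good inverse of A; a value of 1
// is what a zero matrix X produces.
float inverse_residual(const std::vector<cfloat>& a,
                       const std::vector<cfloat>& x,
                       std::size_t n) {
    assert(a.size() == n * n && x.size() == n * n);
    float worst = 0.0f;
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < n; ++i) {
            cfloat s(0.0f, 0.0f);
            for (std::size_t k = 0; k < n; ++k)
                s += a[i + k * n] * x[k + j * n];   // (A*X)(i,j)
            if (i == j) s -= cfloat(1.0f, 0.0f);    // subtract identity
            worst = std::max(worst, std::abs(s));
        }
    }
    return worst;
}
```

To use it, keep a copy of the original matrix before the factorization overwrites h_A, then call is_all_zero on the contents of h_R and inverse_residual on the saved copy and h_R.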
