spotrf(...) calls LAPACK, on the CPU.
magmaf_spotrf(...) calls MAGMA, which will use both the CPU and GPU.
A moderately large matrix, say 2000x2000, is required before you see any benefit from using the GPU. For small matrices, on the order of 100x100, the overhead of copying to and from the GPU outweighs any benefit.
If you successfully linked with MAGMA, then you linked with some version of LAPACK, because MAGMA depends on LAPACK. This may be the LAPACK provided in Intel MKL, ACML, or Mac OS vecLib, if you use one of those.
For spotri, the first question to ask is: do you really want to invert a matrix? An explicit inverse is rarely needed. Using spotrs or sposv to do a solve is faster, more accurate, and uses less memory than explicitly inverting and then multiplying. However, if you really do need the inverse: there is currently a C interface, magma_spotri, but no Fortran interface, magmaf_spotri. This was just an oversight; we will add it in the next release. Adding the interface yourself is fairly simple. Just add these functions:
// add to magma/control/magma_sf77.cpp
#define MAGMAF_SPOTRI MAGMA_FORTRAN_NAME(spotri, SPOTRI )
void MAGMAF_SPOTRI( char *uplo, magma_int_t *n, float *A,
                    magma_int_t *lda, magma_int_t *info)
{
    magma_spotri( uplo[0], *n, A, *lda, info);
}

// add to magma/control/magma_df77.cpp
#define MAGMAF_DPOTRI MAGMA_FORTRAN_NAME(dpotri, DPOTRI )
void MAGMAF_DPOTRI( char *uplo, magma_int_t *n, double *A,
                    magma_int_t *lda, magma_int_t *info)
{
    magma_dpotri( uplo[0], *n, A, *lda, info);
}

// add to magma/control/magma_zf77.cpp
#define MAGMAF_ZPOTRI MAGMA_FORTRAN_NAME(zpotri, ZPOTRI )
void MAGMAF_ZPOTRI( char *uplo, magma_int_t *n, cuDoubleComplex *A,
                    magma_int_t *lda, magma_int_t *info)
{
    magma_zpotri( uplo[0], *n, A, *lda, info);
}

// add to magma/control/magma_cf77.cpp
#define MAGMAF_CPOTRI MAGMA_FORTRAN_NAME(cpotri, CPOTRI )
void MAGMAF_CPOTRI( char *uplo, magma_int_t *n, cuFloatComplex *A,
                    magma_int_t *lda, magma_int_t *info)
{
    magma_cpotri( uplo[0], *n, A, *lda, info);
}
Let us know if you have any difficulties with those.
-mark