### Pinned memory for diagonalization dsygvd (Divide and conquer)

Posted: Tue Aug 20, 2019 1:20 pm
To solve a linear system of the form AX = B, MAGMA offers two options:

magma_int_t magma_dgesv (magma_int_t n, magma_int_t nrhs, double *A, magma_int_t lda, magma_int_t *ipiv, double *B, magma_int_t ldb, magma_int_t *info)

magma_int_t magma_dgesv_gpu (magma_int_t n, magma_int_t nrhs, magmaDouble_ptr dA, magma_int_t ldda, magma_int_t *ipiv, magmaDouble_ptr dB, magma_int_t lddb, magma_int_t *info)

With the latter (magma_dgesv_gpu), the matrices are passed as device pointers, so I can stage them in pinned CPU memory and copy them to the GPU asynchronously.
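For readers less familiar with the argument list shared by both routines, here is a minimal plain C sketch of the LAPACK-style dgesv convention they follow: column-major storage with leading dimensions, 1-based pivot indices in ipiv, A overwritten by its LU factors, and B overwritten by X. The routine `ref_dgesv` and the helper `abs_val` are hypothetical names used only for illustration. This is an unblocked CPU reference, not MAGMA's implementation, and it returns the info code directly instead of also writing it through a pointer.

```c
#include <stddef.h>

/* Hypothetical helper: absolute value without pulling in <math.h>. */
static double abs_val(double x) { return x < 0.0 ? -x : x; }

/* Hypothetical reference routine, NOT part of MAGMA.
   Solves A*X = B by LU factorization with partial pivoting.
   A is n-by-n and B is n-by-nrhs, both column-major, so entry (i,j)
   of A lives at A[i + j*lda]. On success A holds L and U, B holds X,
   and ipiv holds 1-based pivot rows. Returns 0 on success, or k > 0
   if the k-th pivot is exactly zero (singular matrix). */
int ref_dgesv(int n, int nrhs, double *A, int lda, int *ipiv,
              double *B, int ldb)
{
    for (int k = 0; k < n; ++k) {
        /* Choose the largest entry in column k, rows k..n-1, as pivot. */
        int p = k;
        for (int i = k + 1; i < n; ++i)
            if (abs_val(A[i + (size_t)k * lda]) > abs_val(A[p + (size_t)k * lda]))
                p = i;
        ipiv[k] = p + 1;
        if (A[p + (size_t)k * lda] == 0.0)
            return k + 1;

        /* Swap rows k and p in A and apply the same swap to B. */
        if (p != k) {
            for (int j = 0; j < n; ++j) {
                double t = A[k + (size_t)j * lda];
                A[k + (size_t)j * lda] = A[p + (size_t)j * lda];
                A[p + (size_t)j * lda] = t;
            }
            for (int j = 0; j < nrhs; ++j) {
                double t = B[k + (size_t)j * ldb];
                B[k + (size_t)j * ldb] = B[p + (size_t)j * ldb];
                B[p + (size_t)j * ldb] = t;
            }
        }

        /* Store the multipliers (column k of L) and update the
           trailing submatrix. */
        for (int i = k + 1; i < n; ++i) {
            A[i + (size_t)k * lda] /= A[k + (size_t)k * lda];
            for (int j = k + 1; j < n; ++j)
                A[i + (size_t)j * lda] -=
                    A[i + (size_t)k * lda] * A[k + (size_t)j * lda];
        }
    }

    /* For each right-hand side: forward solve with unit-lower L,
       then back solve with U. */
    for (int j = 0; j < nrhs; ++j) {
        double *b = B + (size_t)j * ldb;
        for (int i = 0; i < n; ++i)
            for (int k = 0; k < i; ++k)
                b[i] -= A[i + (size_t)k * lda] * b[k];
        for (int i = n - 1; i >= 0; --i) {
            for (int k = i + 1; k < n; ++k)
                b[i] -= A[i + (size_t)k * lda] * b[k];
            b[i] /= A[i + (size_t)i * lda];
        }
    }
    return 0;
}
```

Calling `ref_dgesv(2, 1, A, 2, ipiv, B, 2)` on a 2-by-2 system mirrors the leading arguments of magma_dgesv. In the _gpu variant the same roles are played by dA/ldda and dB/lddb, which refer to device memory rather than host arrays.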

I would like to know whether there is an analogous GPU-interface version of the subroutine magma_dsygvdx_m (something like magma_dsygvdx_m_gpu).

For reference, magma_dsygvdx_m is called this way:

magma_dsygvdx_m (magma_int_t ngpu, magma_int_t itype, magma_vec_t jobz, magma_range_t range, magma_uplo_t uplo, magma_int_t n, double *A, magma_int_t lda, double *B, magma_int_t ldb, double vl, double vu, magma_int_t il, magma_int_t iu, magma_int_t *m, double *w, double *work, magma_int_t lwork, magma_int_t *iwork, magma_int_t liwork, magma_int_t *info)

My ultimate goal is to reduce data transfer and improve efficiency.

### Re: Pinned memory for diagonalization dsygvd (Divide and conquer)

Posted: Tue Aug 20, 2019 2:31 pm
Because of the complexity of managing an array distributed across multiple GPUs, we don't currently have a version of sygvdx_m where the matrix is given on the GPUs (i.e., sygvdx_mgpu).

I do recommend trying the 2-stage version, dsygvdx_2stage_m, which is often faster.

-mark