MAGMA  2.3.0
Matrix Algebra for GPU and Multicore Architectures
getmatrix_bcyclic: multi-GPU => CPU

Functions

void magma_cgetmatrix_1D_col_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, magmaFloatComplex_const_ptr const *dA, magma_int_t ldda, magmaFloatComplex *hA, magma_int_t lda, magma_queue_t queues[])
 Copy matrix dA, which is distributed 1D column block cyclic over multiple GPUs, to hA on CPU host.
 
void magma_cgetmatrix_1D_row_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, magmaFloatComplex_const_ptr const *dA, magma_int_t ldda, magmaFloatComplex *hA, magma_int_t lda, magma_queue_t queues[])
 Copy matrix dA, which is distributed 1D row block cyclic over multiple GPUs, to hA on CPU host.
 
void magma_dgetmatrix_1D_col_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, magmaDouble_const_ptr const *dA, magma_int_t ldda, double *hA, magma_int_t lda, magma_queue_t queues[])
 Copy matrix dA, which is distributed 1D column block cyclic over multiple GPUs, to hA on CPU host.
 
void magma_dgetmatrix_1D_row_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, magmaDouble_const_ptr const *dA, magma_int_t ldda, double *hA, magma_int_t lda, magma_queue_t queues[])
 Copy matrix dA, which is distributed 1D row block cyclic over multiple GPUs, to hA on CPU host.
 
void magma_sgetmatrix_1D_col_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, magmaFloat_const_ptr const *dA, magma_int_t ldda, float *hA, magma_int_t lda, magma_queue_t queues[])
 Copy matrix dA, which is distributed 1D column block cyclic over multiple GPUs, to hA on CPU host.
 
void magma_sgetmatrix_1D_row_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, magmaFloat_const_ptr const *dA, magma_int_t ldda, float *hA, magma_int_t lda, magma_queue_t queues[])
 Copy matrix dA, which is distributed 1D row block cyclic over multiple GPUs, to hA on CPU host.
 
void magma_zgetmatrix_1D_col_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, magmaDoubleComplex_const_ptr const *dA, magma_int_t ldda, magmaDoubleComplex *hA, magma_int_t lda, magma_queue_t queues[])
 Copy matrix dA, which is distributed 1D column block cyclic over multiple GPUs, to hA on CPU host.
 
void magma_zgetmatrix_1D_row_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, magmaDoubleComplex_const_ptr const *dA, magma_int_t ldda, magmaDoubleComplex *hA, magma_int_t lda, magma_queue_t queues[])
 Copy matrix dA, which is distributed 1D row block cyclic over multiple GPUs, to hA on CPU host.
 

Function Documentation

void magma_cgetmatrix_1D_col_bcyclic ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
magmaFloatComplex_const_ptr const *  dA,
magma_int_t  ldda,
magmaFloatComplex *  hA,
magma_int_t  lda,
magma_queue_t  queues[] 
)

Copy matrix dA, which is distributed 1D column block cyclic over multiple GPUs, to hA on CPU host.

Parameters
[in]   ngpu    Number of GPUs over which dA is distributed.
[in]   m       Number of rows of matrix hA. m >= 0.
[in]   n       Number of columns of matrix hA. n >= 0.
[in]   nb      Block size. nb > 0.
[in]   dA      Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs. Each is of dimension (ldda,nlocal), where nlocal is the number of columns assigned to that GPU.
[in]   ldda    Leading dimension of each matrix dA on each GPU. ldda >= m.
[out]  hA      The m-by-n matrix A on the CPU, of dimension (lda,n).
[in]   lda     Leading dimension of matrix hA. lda >= m.
[in]   queues  Array of dimension (ngpu), with one queue per GPU.
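
The dA description refers to nlocal, the number of columns held by each GPU, without spelling out how columns map to GPUs. The sketch below shows the usual index arithmetic for a 1D column block-cyclic layout, assuming block column b is stored on GPU b % ngpu. The helper names (bcyclic_owner, bcyclic_local_col, bcyclic_nlocal) are hypothetical and are not part of MAGMA; plain int stands in for magma_int_t.

```c
#include <assert.h>

/* Hypothetical helpers, not part of MAGMA. They assume a 1D column
   block-cyclic layout in which block column b lives on GPU b % ngpu. */

/* GPU that owns global column j. */
int bcyclic_owner(int j, int nb, int ngpu)
{
    assert(j >= 0 && nb > 0 && ngpu > 0);
    return (j / nb) % ngpu;
}

/* Column index of global column j inside its owner's local matrix. */
int bcyclic_local_col(int j, int nb, int ngpu)
{
    assert(j >= 0 && nb > 0 && ngpu > 0);
    return (j / (nb * ngpu)) * nb + j % nb;
}

/* Number of columns (nlocal) that GPU dev holds out of n global columns. */
int bcyclic_nlocal(int n, int nb, int ngpu, int dev)
{
    assert(n >= 0 && nb > 0 && ngpu > 0 && dev >= 0 && dev < ngpu);
    int nblocks = n / nb;                 /* number of full blocks */
    int rem     = n % nb;                 /* columns in the trailing partial block */
    int nlocal  = (nblocks / ngpu) * nb;  /* full rounds over all GPUs */
    if (dev < nblocks % ngpu)
        nlocal += nb;                     /* one extra full block */
    else if (dev == nblocks % ngpu)
        nlocal += rem;                    /* the partial block, if any */
    return nlocal;
}
```

Under these assumptions, with n = 11, nb = 2 and ngpu = 3, the three GPUs hold 4, 4 and 3 columns, and global column 7 is local column 3 on GPU 0.
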
void magma_cgetmatrix_1D_row_bcyclic ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
magmaFloatComplex_const_ptr const *  dA,
magma_int_t  ldda,
magmaFloatComplex *  hA,
magma_int_t  lda,
magma_queue_t  queues[] 
)

Copy matrix dA, which is distributed 1D row block cyclic over multiple GPUs, to hA on CPU host.

Parameters
[in]   ngpu    Number of GPUs over which dA is distributed. ngpu > 0.
[in]   m       Number of rows of matrix hA. m >= 0.
[in]   n       Number of columns of matrix hA. n >= 0.
[in]   nb      Block size. nb > 0.
[in]   dA      Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,n).
[in]   ldda    Leading dimension of each matrix dA on each GPU. ldda >= (1 + m/(nb*ngpu))*nb.
[out]  hA      The m-by-n matrix A on the CPU, of dimension (lda,n).
[in]   lda     Leading dimension of matrix hA. lda >= m.
[in]   queues  Array of dimension (ngpu), with one queue per GPU.
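
The ldda bound for the row-distributed case is easier to read with numbers plugged in. The sketch below evaluates the bound quoted above and compares it with the number of rows a GPU actually holds. It assumes that "/" in the bound is integer division and that block row b is stored on GPU b % ngpu; the helper names are hypothetical and are not part of MAGMA.

```c
#include <assert.h>

/* Hypothetical helpers, not part of MAGMA. They assume block row b is
   stored on GPU b % ngpu and that "/" in the ldda bound truncates. */

/* Lower bound on ldda from the parameter list: (1 + m/(nb*ngpu))*nb. */
int row_bcyclic_min_ldda(int m, int nb, int ngpu)
{
    assert(m >= 0 && nb > 0 && ngpu > 0);
    return (1 + m / (nb * ngpu)) * nb;
}

/* Number of rows of the m-row matrix that land on GPU dev. */
int row_bcyclic_mlocal(int m, int nb, int ngpu, int dev)
{
    assert(m >= 0 && nb > 0 && ngpu > 0 && dev >= 0 && dev < ngpu);
    int nblocks = m / nb;                 /* number of full block rows */
    int rem     = m % nb;                 /* rows in the trailing partial block */
    int mlocal  = (nblocks / ngpu) * nb;  /* full rounds over all GPUs */
    if (dev < nblocks % ngpu)
        mlocal += nb;                     /* one extra full block */
    else if (dev == nblocks % ngpu)
        mlocal += rem;                    /* the partial block, if any */
    return mlocal;
}
```

Under these assumptions, m = 1000, nb = 64 and ngpu = 3 give a bound of 384, while GPU 0 holds 360 rows and the other two hold 320 each. The bound is a whole number of blocks, so it is never smaller than the local row count.
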
void magma_dgetmatrix_1D_col_bcyclic ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
magmaDouble_const_ptr const *  dA,
magma_int_t  ldda,
double *  hA,
magma_int_t  lda,
magma_queue_t  queues[] 
)

Copy matrix dA, which is distributed 1D column block cyclic over multiple GPUs, to hA on CPU host.

Parameters
[in]   ngpu    Number of GPUs over which dA is distributed.
[in]   m       Number of rows of matrix hA. m >= 0.
[in]   n       Number of columns of matrix hA. n >= 0.
[in]   nb      Block size. nb > 0.
[in]   dA      Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs. Each is of dimension (ldda,nlocal), where nlocal is the number of columns assigned to that GPU.
[in]   ldda    Leading dimension of each matrix dA on each GPU. ldda >= m.
[out]  hA      The m-by-n matrix A on the CPU, of dimension (lda,n).
[in]   lda     Leading dimension of matrix hA. lda >= m.
[in]   queues  Array of dimension (ngpu), with one queue per GPU.
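
To make the data movement concrete, the following is a CPU-only stand-in for the column block-cyclic gather. Ordinary host arrays take the place of the per-GPU buffers and memcpy takes the place of the device-to-host transfer, so no queues are involved. It assumes block column b is stored on GPU b % ngpu and column-major storage. The function name gather_1D_col_bcyclic_host is hypothetical, and this is a sketch of the access pattern, not MAGMA's implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical CPU-only illustration, not part of MAGMA.
   dA[dev] points to a host array standing in for GPU dev's local
   (ldda x nlocal) column-major matrix; hA is (lda x n), column-major. */
void gather_1D_col_bcyclic_host(int ngpu, int m, int n, int nb,
                                const double *const *dA, int ldda,
                                double *hA, int lda)
{
    assert(ngpu > 0 && m >= 0 && n >= 0 && nb > 0);
    assert(ldda >= m && lda >= m);

    for (int j = 0; j < n; j += nb) {
        int dev = (j / nb) % ngpu;             /* assumed owner of this block column */
        int jb  = (n - j < nb) ? (n - j) : nb; /* width of this block, last may be short */

        /* First column of the block in the owner's local storage. */
        const double *src = dA[dev] + (size_t)(j / (nb * ngpu)) * nb * ldda;
        /* First column of the block in the host matrix. */
        double *dst = hA + (size_t)j * lda;

        /* Copy the m-by-jb block one column at a time. */
        for (int c = 0; c < jb; ++c)
            memcpy(dst + (size_t)c * lda, src + (size_t)c * ldda,
                   (size_t)m * sizeof(double));
    }
}
```

Only the first m rows of each column are copied, so padding rows in hA (when lda > m) are left untouched.
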
void magma_dgetmatrix_1D_row_bcyclic ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
magmaDouble_const_ptr const *  dA,
magma_int_t  ldda,
double *  hA,
magma_int_t  lda,
magma_queue_t  queues[] 
)

Copy matrix dA, which is distributed 1D row block cyclic over multiple GPUs, to hA on CPU host.

Parameters
[in]   ngpu    Number of GPUs over which dA is distributed. ngpu > 0.
[in]   m       Number of rows of matrix hA. m >= 0.
[in]   n       Number of columns of matrix hA. n >= 0.
[in]   nb      Block size. nb > 0.
[in]   dA      Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,n).
[in]   ldda    Leading dimension of each matrix dA on each GPU. ldda >= (1 + m/(nb*ngpu))*nb.
[out]  hA      The m-by-n matrix A on the CPU, of dimension (lda,n).
[in]   lda     Leading dimension of matrix hA. lda >= m.
[in]   queues  Array of dimension (ngpu), with one queue per GPU.

void magma_sgetmatrix_1D_col_bcyclic ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
magmaFloat_const_ptr const *  dA,
magma_int_t  ldda,
float *  hA,
magma_int_t  lda,
magma_queue_t  queues[] 
)

Copy matrix dA, which is distributed 1D column block cyclic over multiple GPUs, to hA on CPU host.

Parameters
[in]   ngpu    Number of GPUs over which dA is distributed.
[in]   m       Number of rows of matrix hA. m >= 0.
[in]   n       Number of columns of matrix hA. n >= 0.
[in]   nb      Block size. nb > 0.
[in]   dA      Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs. Each is of dimension (ldda,nlocal), where nlocal is the number of columns assigned to that GPU.
[in]   ldda    Leading dimension of each matrix dA on each GPU. ldda >= m.
[out]  hA      The m-by-n matrix A on the CPU, of dimension (lda,n).
[in]   lda     Leading dimension of matrix hA. lda >= m.
[in]   queues  Array of dimension (ngpu), with one queue per GPU.

void magma_sgetmatrix_1D_row_bcyclic ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
magmaFloat_const_ptr const *  dA,
magma_int_t  ldda,
float *  hA,
magma_int_t  lda,
magma_queue_t  queues[] 
)

Copy matrix dA, which is distributed 1D row block cyclic over multiple GPUs, to hA on CPU host.

Parameters
[in]   ngpu    Number of GPUs over which dA is distributed. ngpu > 0.
[in]   m       Number of rows of matrix hA. m >= 0.
[in]   n       Number of columns of matrix hA. n >= 0.
[in]   nb      Block size. nb > 0.
[in]   dA      Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,n).
[in]   ldda    Leading dimension of each matrix dA on each GPU. ldda >= (1 + m/(nb*ngpu))*nb.
[out]  hA      The m-by-n matrix A on the CPU, of dimension (lda,n).
[in]   lda     Leading dimension of matrix hA. lda >= m.
[in]   queues  Array of dimension (ngpu), with one queue per GPU.

void magma_zgetmatrix_1D_col_bcyclic ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
magmaDoubleComplex_const_ptr const *  dA,
magma_int_t  ldda,
magmaDoubleComplex *  hA,
magma_int_t  lda,
magma_queue_t  queues[] 
)

Copy matrix dA, which is distributed 1D column block cyclic over multiple GPUs, to hA on CPU host.

Parameters
[in]   ngpu    Number of GPUs over which dA is distributed.
[in]   m       Number of rows of matrix hA. m >= 0.
[in]   n       Number of columns of matrix hA. n >= 0.
[in]   nb      Block size. nb > 0.
[in]   dA      Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs. Each is of dimension (ldda,nlocal), where nlocal is the number of columns assigned to that GPU.
[in]   ldda    Leading dimension of each matrix dA on each GPU. ldda >= m.
[out]  hA      The m-by-n matrix A on the CPU, of dimension (lda,n).
[in]   lda     Leading dimension of matrix hA. lda >= m.
[in]   queues  Array of dimension (ngpu), with one queue per GPU.

void magma_zgetmatrix_1D_row_bcyclic ( magma_int_t  ngpu,
magma_int_t  m,
magma_int_t  n,
magma_int_t  nb,
magmaDoubleComplex_const_ptr const *  dA,
magma_int_t  ldda,
magmaDoubleComplex *  hA,
magma_int_t  lda,
magma_queue_t  queues[] 
)

Copy matrix dA, which is distributed 1D row block cyclic over multiple GPUs, to hA on CPU host.

Parameters
[in]   ngpu    Number of GPUs over which dA is distributed. ngpu > 0.
[in]   m       Number of rows of matrix hA. m >= 0.
[in]   n       Number of columns of matrix hA. n >= 0.
[in]   nb      Block size. nb > 0.
[in]   dA      Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,n).
[in]   ldda    Leading dimension of each matrix dA on each GPU. ldda >= (1 + m/(nb*ngpu))*nb.
[out]  hA      The m-by-n matrix A on the CPU, of dimension (lda,n).
[in]   lda     Leading dimension of matrix hA. lda >= m.
[in]   queues  Array of dimension (ngpu), with one queue per GPU.