MAGMA 2.3.0 Matrix Algebra for GPU and Multicore Architectures
setmatrix_bcyclic: CPU => multi-GPU

## Functions

void magma_csetmatrix_1D_col_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const magmaFloatComplex *hA, magma_int_t lda, magmaFloatComplex_ptr *dA, magma_int_t ldda, magma_queue_t queues[])
Copy matrix hA on CPU host to dA, which is distributed 1D column block cyclic over multiple GPUs.

void magma_csetmatrix_1D_row_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const magmaFloatComplex *hA, magma_int_t lda, magmaFloatComplex_ptr *dA, magma_int_t ldda, magma_queue_t queues[])
Copy matrix hA on CPU host to dA, which is distributed 1D row block cyclic over multiple GPUs.

void magma_dsetmatrix_1D_col_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const double *hA, magma_int_t lda, magmaDouble_ptr *dA, magma_int_t ldda, magma_queue_t queues[])
Copy matrix hA on CPU host to dA, which is distributed 1D column block cyclic over multiple GPUs.

void magma_dsetmatrix_1D_row_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const double *hA, magma_int_t lda, magmaDouble_ptr *dA, magma_int_t ldda, magma_queue_t queues[])
Copy matrix hA on CPU host to dA, which is distributed 1D row block cyclic over multiple GPUs.

void magma_ssetmatrix_1D_col_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const float *hA, magma_int_t lda, magmaFloat_ptr *dA, magma_int_t ldda, magma_queue_t queues[])
Copy matrix hA on CPU host to dA, which is distributed 1D column block cyclic over multiple GPUs.

void magma_ssetmatrix_1D_row_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const float *hA, magma_int_t lda, magmaFloat_ptr *dA, magma_int_t ldda, magma_queue_t queues[])
Copy matrix hA on CPU host to dA, which is distributed 1D row block cyclic over multiple GPUs.

void magma_zsetmatrix_1D_col_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const magmaDoubleComplex *hA, magma_int_t lda, magmaDoubleComplex_ptr *dA, magma_int_t ldda, magma_queue_t queues[])
Copy matrix hA on CPU host to dA, which is distributed 1D column block cyclic over multiple GPUs.

void magma_zsetmatrix_1D_row_bcyclic (magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const magmaDoubleComplex *hA, magma_int_t lda, magmaDoubleComplex_ptr *dA, magma_int_t ldda, magma_queue_t queues[])
Copy matrix hA on CPU host to dA, which is distributed 1D row block cyclic over multiple GPUs.

## Function Documentation

 void magma_csetmatrix_1D_col_bcyclic ( magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const magmaFloatComplex * hA, magma_int_t lda, magmaFloatComplex_ptr * dA, magma_int_t ldda, magma_queue_t queues[] )

Copy matrix hA on CPU host to dA, which is distributed 1D column block cyclic over multiple GPUs.

Parameters

| Direction | Name | Description |
|---|---|---|
| [in] | ngpu | Number of GPUs over which dA is distributed. |
| [in] | m | Number of rows of matrix hA. m >= 0. |
| [in] | n | Number of columns of matrix hA. n >= 0. |
| [in] | nb | Block size. nb > 0. |
| [in] | hA | The m-by-n matrix A on the CPU, of dimension (lda,n). |
| [in] | lda | Leading dimension of matrix hA. lda >= m. |
| [out] | dA | Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,nlocal), where nlocal is the number of columns assigned to that GPU. |
| [in] | ldda | Leading dimension of each matrix dA on each GPU. ldda >= m. |
| [in] | queues | Array of dimension (ngpu), with one queue per GPU. |

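
The (ldda,nlocal) shape in the parameter list above depends on how columns are dealt out to the GPUs. The following is a minimal sketch, assuming the usual block-cyclic convention in which column block k = j/nb goes to GPU k mod ngpu (this page does not spell the convention out). The names col_owner, col_local_index, and col_nlocal are hypothetical helpers for illustration, not MAGMA routines, and plain int stands in for magma_int_t.

```c
/* Hypothetical helper: GPU that owns global column j, assuming
   column block j/nb is assigned to GPU (j/nb) % ngpu. */
int col_owner(int j, int nb, int ngpu) {
    return (j / nb) % ngpu;
}

/* Hypothetical helper: position of global column j inside the
   local array of the GPU that owns it. */
int col_local_index(int j, int nb, int ngpu) {
    return (j / (nb * ngpu)) * nb + j % nb;
}

/* Hypothetical helper: nlocal, the number of columns stored on GPU dev. */
int col_nlocal(int n, int nb, int ngpu, int dev) {
    int nblocks  = (n + nb - 1) / nb;   /* column blocks; the last may be partial */
    int full     = nblocks / ngpu;      /* rounds in which every GPU gets a block */
    int extra    = nblocks % ngpu;      /* GPUs 0..extra-1 get one more block */
    int myblocks = full + (dev < extra ? 1 : 0);
    int cols     = myblocks * nb;

    /* The last block overall may be short; subtract the shortfall from its owner. */
    if (nblocks > 0 && (nblocks - 1) % ngpu == dev)
        cols -= nblocks * nb - n;
    return cols;
}
```

Under this assumed convention, n = 10, nb = 3, ngpu = 2 places columns 0-2 and 6-8 on GPU 0 and columns 3-5 and 9 on GPU 1, so col_nlocal gives 6 and 4, and global column 7 is local column 4 on GPU 0.
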
 void magma_csetmatrix_1D_row_bcyclic ( magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const magmaFloatComplex * hA, magma_int_t lda, magmaFloatComplex_ptr * dA, magma_int_t ldda, magma_queue_t queues[] )

Copy matrix hA on CPU host to dA, which is distributed 1D row block cyclic over multiple GPUs.

Parameters

| Direction | Name | Description |
|---|---|---|
| [in] | ngpu | Number of GPUs over which dA is distributed. |
| [in] | m | Number of rows of matrix hA. m >= 0. |
| [in] | n | Number of columns of matrix hA. n >= 0. |
| [in] | nb | Block size. nb > 0. |
| [in] | hA | The m-by-n matrix A on the CPU, of dimension (lda,n). |
| [in] | lda | Leading dimension of matrix hA. lda >= m. |
| [out] | dA | Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,n). |
| [in] | ldda | Leading dimension of each matrix dA on each GPU. ldda >= (1 + m/(nb*ngpu))*nb. |
| [in] | queues | Array of dimension (ngpu), with one queue per GPU. |

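
The ldda bound for the row distribution, (1 + m/(nb*ngpu))*nb, is easiest to read with concrete numbers. The sketch below evaluates it with integer division (an assumption about how the formula is meant) and compares it with the number of rows each GPU would hold if row block k = i/nb goes to GPU k mod ngpu (also an assumption; the page does not state the assignment). row_ldda_min and row_mlocal are hypothetical names, not MAGMA routines.

```c
/* Lower bound on ldda from the parameter list, using integer division. */
int row_ldda_min(int m, int nb, int ngpu) {
    return (1 + m / (nb * ngpu)) * nb;
}

/* Hypothetical helper: number of rows stored on GPU dev, assuming
   row block i/nb is assigned to GPU (i/nb) % ngpu. */
int row_mlocal(int m, int nb, int ngpu, int dev) {
    int rows = 0;
    for (int i = 0; i < m; i += nb) {
        if ((i / nb) % ngpu == dev) {
            int ib = (m - i < nb) ? (m - i) : nb;  /* last block may be short */
            rows += ib;
        }
    }
    return rows;
}
```

For m = 10, nb = 3, ngpu = 2 the bound is 6, and under the assumed assignment the two GPUs hold 6 and 4 rows, so the bound covers the largest local row count in this case.
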
 void magma_dsetmatrix_1D_col_bcyclic ( magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const double * hA, magma_int_t lda, magmaDouble_ptr * dA, magma_int_t ldda, magma_queue_t queues[] )

Copy matrix hA on CPU host to dA, which is distributed 1D column block cyclic over multiple GPUs.

Parameters

| Direction | Name | Description |
|---|---|---|
| [in] | ngpu | Number of GPUs over which dA is distributed. |
| [in] | m | Number of rows of matrix hA. m >= 0. |
| [in] | n | Number of columns of matrix hA. n >= 0. |
| [in] | nb | Block size. nb > 0. |
| [in] | hA | The m-by-n matrix A on the CPU, of dimension (lda,n). |
| [in] | lda | Leading dimension of matrix hA. lda >= m. |
| [out] | dA | Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,nlocal), where nlocal is the number of columns assigned to that GPU. |
| [in] | ldda | Leading dimension of each matrix dA on each GPU. ldda >= m. |
| [in] | queues | Array of dimension (ngpu), with one queue per GPU. |

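
To make the column block-cyclic layout concrete without a GPU, the following CPU-only sketch mimics the data movement: the entries of dA are ordinary host buffers, and memcpy stands in for the host-to-device transfer that the real routine would issue on queues. It assumes column block j/nb goes to buffer (j/nb) % ngpu and column-major storage. scatter_cols is a hypothetical name; it is not how MAGMA is implemented, only an illustration of where each column lands.

```c
#include <stddef.h>
#include <string.h>

/* CPU-only stand-in for a 1D column block-cyclic copy.
   dA[dev] must hold at least ldda * nlocal doubles, with ldda >= m. */
void scatter_cols(int ngpu, int m, int n, int nb,
                  const double *hA, int lda,
                  double **dA, int ldda) {
    for (int j = 0; j < n; j += nb) {
        int dev = (j / nb) % ngpu;              /* assumed owner of this block */
        int jb  = (n - j < nb) ? (n - j) : nb;  /* width of this block */
        int jl  = (j / (nb * ngpu)) * nb;       /* first local column on dev */

        /* Copy the block one column at a time (m contiguous entries each). */
        for (int c = 0; c < jb; ++c) {
            memcpy(dA[dev] + (size_t)(jl + c) * ldda,
                   hA + (size_t)(j + c) * lda,
                   (size_t)m * sizeof(double));
        }
    }
}
```

With m = 2, n = 5, nb = 2, ngpu = 2, buffer 0 receives global columns 0, 1, and 4 as its local columns 0, 1, and 2, and buffer 1 receives global columns 2 and 3.
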
 void magma_dsetmatrix_1D_row_bcyclic ( magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const double * hA, magma_int_t lda, magmaDouble_ptr * dA, magma_int_t ldda, magma_queue_t queues[] )

Copy matrix hA on CPU host to dA, which is distributed 1D row block cyclic over multiple GPUs.

Parameters

| Direction | Name | Description |
|---|---|---|
| [in] | ngpu | Number of GPUs over which dA is distributed. |
| [in] | m | Number of rows of matrix hA. m >= 0. |
| [in] | n | Number of columns of matrix hA. n >= 0. |
| [in] | nb | Block size. nb > 0. |
| [in] | hA | The m-by-n matrix A on the CPU, of dimension (lda,n). |
| [in] | lda | Leading dimension of matrix hA. lda >= m. |
| [out] | dA | Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,n). |
| [in] | ldda | Leading dimension of each matrix dA on each GPU. ldda >= (1 + m/(nb*ngpu))*nb. |
| [in] | queues | Array of dimension (ngpu), with one queue per GPU. |

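
The row distribution differs from the column one in that each row block is a strided panel rather than a contiguous run of columns. The CPU-only sketch below illustrates this, again with host buffers in place of device memory and memcpy in place of the real transfer. It assumes row block i/nb goes to buffer (i/nb) % ngpu and column-major storage; scatter_rows is a hypothetical name and not a MAGMA routine.

```c
#include <stddef.h>
#include <string.h>

/* CPU-only stand-in for a 1D row block-cyclic copy.
   dA[dev] must hold at least ldda * n doubles, with
   ldda >= (1 + m/(nb*ngpu))*nb as in the parameter list. */
void scatter_rows(int ngpu, int m, int n, int nb,
                  const double *hA, int lda,
                  double **dA, int ldda) {
    for (int i = 0; i < m; i += nb) {
        int dev = (i / nb) % ngpu;              /* assumed owner of this block */
        int ib  = (m - i < nb) ? (m - i) : nb;  /* height of this block */
        int il  = (i / (nb * ngpu)) * nb;       /* first local row on dev */

        /* Within each column, the ib rows of the block are contiguous. */
        for (int j = 0; j < n; ++j) {
            memcpy(dA[dev] + il + (size_t)j * ldda,
                   hA + i + (size_t)j * lda,
                   (size_t)ib * sizeof(double));
        }
    }
}
```

With m = 5, n = 2, nb = 2, ngpu = 2, the bound gives ldda >= 4; buffer 0 receives global rows 0, 1, and 4 as its local rows 0, 1, and 2, and buffer 1 receives global rows 2 and 3.
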
 void magma_ssetmatrix_1D_col_bcyclic ( magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const float * hA, magma_int_t lda, magmaFloat_ptr * dA, magma_int_t ldda, magma_queue_t queues[] )

Copy matrix hA on CPU host to dA, which is distributed 1D column block cyclic over multiple GPUs.

Parameters

| Direction | Name | Description |
|---|---|---|
| [in] | ngpu | Number of GPUs over which dA is distributed. |
| [in] | m | Number of rows of matrix hA. m >= 0. |
| [in] | n | Number of columns of matrix hA. n >= 0. |
| [in] | nb | Block size. nb > 0. |
| [in] | hA | The m-by-n matrix A on the CPU, of dimension (lda,n). |
| [in] | lda | Leading dimension of matrix hA. lda >= m. |
| [out] | dA | Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,nlocal), where nlocal is the number of columns assigned to that GPU. |
| [in] | ldda | Leading dimension of each matrix dA on each GPU. ldda >= m. |
| [in] | queues | Array of dimension (ngpu), with one queue per GPU. |

 void magma_ssetmatrix_1D_row_bcyclic ( magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const float * hA, magma_int_t lda, magmaFloat_ptr * dA, magma_int_t ldda, magma_queue_t queues[] )

Copy matrix hA on CPU host to dA, which is distributed 1D row block cyclic over multiple GPUs.

Parameters

| Direction | Name | Description |
|---|---|---|
| [in] | ngpu | Number of GPUs over which dA is distributed. |
| [in] | m | Number of rows of matrix hA. m >= 0. |
| [in] | n | Number of columns of matrix hA. n >= 0. |
| [in] | nb | Block size. nb > 0. |
| [in] | hA | The m-by-n matrix A on the CPU, of dimension (lda,n). |
| [in] | lda | Leading dimension of matrix hA. lda >= m. |
| [out] | dA | Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,n). |
| [in] | ldda | Leading dimension of each matrix dA on each GPU. ldda >= (1 + m/(nb*ngpu))*nb. |
| [in] | queues | Array of dimension (ngpu), with one queue per GPU. |

 void magma_zsetmatrix_1D_col_bcyclic ( magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const magmaDoubleComplex * hA, magma_int_t lda, magmaDoubleComplex_ptr * dA, magma_int_t ldda, magma_queue_t queues[] )

Copy matrix hA on CPU host to dA, which is distributed 1D column block cyclic over multiple GPUs.

Parameters

| Direction | Name | Description |
|---|---|---|
| [in] | ngpu | Number of GPUs over which dA is distributed. |
| [in] | m | Number of rows of matrix hA. m >= 0. |
| [in] | n | Number of columns of matrix hA. n >= 0. |
| [in] | nb | Block size. nb > 0. |
| [in] | hA | The m-by-n matrix A on the CPU, of dimension (lda,n). |
| [in] | lda | Leading dimension of matrix hA. lda >= m. |
| [out] | dA | Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,nlocal), where nlocal is the number of columns assigned to that GPU. |
| [in] | ldda | Leading dimension of each matrix dA on each GPU. ldda >= m. |
| [in] | queues | Array of dimension (ngpu), with one queue per GPU. |

 void magma_zsetmatrix_1D_row_bcyclic ( magma_int_t ngpu, magma_int_t m, magma_int_t n, magma_int_t nb, const magmaDoubleComplex * hA, magma_int_t lda, magmaDoubleComplex_ptr * dA, magma_int_t ldda, magma_queue_t queues[] )

Copy matrix hA on CPU host to dA, which is distributed 1D row block cyclic over multiple GPUs.

Parameters

| Direction | Name | Description |
|---|---|---|
| [in] | ngpu | Number of GPUs over which dA is distributed. |
| [in] | m | Number of rows of matrix hA. m >= 0. |
| [in] | n | Number of columns of matrix hA. n >= 0. |
| [in] | nb | Block size. nb > 0. |
| [in] | hA | The m-by-n matrix A on the CPU, of dimension (lda,n). |
| [in] | lda | Leading dimension of matrix hA. lda >= m. |
| [out] | dA | Array of ngpu pointers, one per GPU, that store the distributed m-by-n matrix A on the GPUs, each of dimension (ldda,n). |
| [in] | ldda | Leading dimension of each matrix dA on each GPU. ldda >= (1 + m/(nb*ngpu))*nb. |
| [in] | queues | Array of dimension (ngpu), with one queue per GPU. |