MAGMA  2.3.0 Matrix Algebra for GPU and Multicore Architectures
or/unmqr: Multiply by Q from QR factorization

Functions

magma_int_t magma_cunmqr (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex *C, magma_int_t ldc, magmaFloatComplex *work, magma_int_t lwork, magma_int_t *info)
CUNMQR overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_cunmqr2_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex *tau, magmaFloatComplex_ptr dC, magma_int_t lddc, const magmaFloatComplex *wA, magma_int_t ldwa, magma_int_t *info)
CUNMQR overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_cunmqr_2stage_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dC, magma_int_t lddc, magmaFloatComplex_ptr dT, magma_int_t nb, magma_int_t *info)
CUNMQR_GPU overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_cunmqr_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex_const_ptr dA, magma_int_t ldda, magmaFloatComplex const *tau, magmaFloatComplex_ptr dC, magma_int_t lddc, magmaFloatComplex *hwork, magma_int_t lwork, magmaFloatComplex_ptr dT, magma_int_t nb, magma_int_t *info)
CUNMQR_GPU overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_cunmqr_m (magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex *A, magma_int_t lda, magmaFloatComplex *tau, magmaFloatComplex *C, magma_int_t ldc, magmaFloatComplex *work, magma_int_t lwork, magma_int_t *info)
CUNMQR overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_dormqr (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, double *A, magma_int_t lda, double *tau, double *C, magma_int_t ldc, double *work, magma_int_t lwork, magma_int_t *info)
DORMQR overwrites the general real M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_dormqr2_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDouble_ptr dA, magma_int_t ldda, double *tau, magmaDouble_ptr dC, magma_int_t lddc, const double *wA, magma_int_t ldwa, magma_int_t *info)
DORMQR overwrites the general real M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_dormqr_2stage_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dC, magma_int_t lddc, magmaDouble_ptr dT, magma_int_t nb, magma_int_t *info)
DORMQR_GPU overwrites the general real M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_dormqr_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDouble_const_ptr dA, magma_int_t ldda, double const *tau, magmaDouble_ptr dC, magma_int_t lddc, double *hwork, magma_int_t lwork, magmaDouble_ptr dT, magma_int_t nb, magma_int_t *info)
DORMQR_GPU overwrites the general real M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_dormqr_m (magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, double *A, magma_int_t lda, double *tau, double *C, magma_int_t ldc, double *work, magma_int_t lwork, magma_int_t *info)
DORMQR overwrites the general real M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_sormqr (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, float *A, magma_int_t lda, float *tau, float *C, magma_int_t ldc, float *work, magma_int_t lwork, magma_int_t *info)
SORMQR overwrites the general real M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_sormqr2_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloat_ptr dA, magma_int_t ldda, float *tau, magmaFloat_ptr dC, magma_int_t lddc, const float *wA, magma_int_t ldwa, magma_int_t *info)
SORMQR overwrites the general real M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_sormqr_2stage_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dC, magma_int_t lddc, magmaFloat_ptr dT, magma_int_t nb, magma_int_t *info)
SORMQR_GPU overwrites the general real M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_sormqr_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloat_const_ptr dA, magma_int_t ldda, float const *tau, magmaFloat_ptr dC, magma_int_t lddc, float *hwork, magma_int_t lwork, magmaFloat_ptr dT, magma_int_t nb, magma_int_t *info)
SORMQR_GPU overwrites the general real M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_sormqr_m (magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, float *A, magma_int_t lda, float *tau, float *C, magma_int_t ldc, float *work, magma_int_t lwork, magma_int_t *info)
SORMQR overwrites the general real M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_zunmqr (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *C, magma_int_t ldc, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info)
ZUNMQR overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_zunmqr2_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex *tau, magmaDoubleComplex_ptr dC, magma_int_t lddc, const magmaDoubleComplex *wA, magma_int_t ldwa, magma_int_t *info)
ZUNMQR overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_zunmqr_2stage_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dC, magma_int_t lddc, magmaDoubleComplex_ptr dT, magma_int_t nb, magma_int_t *info)
ZUNMQR_GPU overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_zunmqr_gpu (magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex const *tau, magmaDoubleComplex_ptr dC, magma_int_t lddc, magmaDoubleComplex *hwork, magma_int_t lwork, magmaDoubleComplex_ptr dT, magma_int_t nb, magma_int_t *info)
ZUNMQR_GPU overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

magma_int_t magma_zunmqr_m (magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex *A, magma_int_t lda, magmaDoubleComplex *tau, magmaDoubleComplex *C, magma_int_t ldc, magmaDoubleComplex *work, magma_int_t lwork, magma_int_t *info)
ZUNMQR overwrites the general complex M-by-N matrix C with Q*C, Q**H*C, C*Q, or C*Q**H.

Function Documentation

 magma_int_t magma_cunmqr ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex * A, magma_int_t lda, magmaFloatComplex * tau, magmaFloatComplex * C, magma_int_t ldc, magmaFloatComplex * work, magma_int_t lwork, magma_int_t * info )

CUNMQR overwrites the general complex M-by-N matrix C with:

                          SIDE = MagmaLeft   SIDE = MagmaRight
TRANS = MagmaNoTrans:     Q * C              C * Q
TRANS = Magma_ConjTrans:  Q**H * C           C * Q**H


where Q is a complex unitary matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)


as returned by CGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side: magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans: magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = Magma_ConjTrans: Conjugate transpose, apply Q**H.
 [in] m: INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n: INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k: INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] A: COMPLEX array, dimension (LDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF in the first k columns of its array argument A. A is modified by the routine but restored on exit.
 [in] lda: INTEGER. The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
 [in] tau: COMPLEX array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF.
 [in,out] C: COMPLEX array, dimension (LDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q.
 [in] ldc: INTEGER. The leading dimension of the array C. LDC >= max(1,M).
 [out] work: (workspace) COMPLEX array, dimension (MAX(1,LWORK)). On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
 [in] lwork: INTEGER. The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance if SIDE = MagmaLeft, LWORK >= N*NB; if SIDE = MagmaRight, LWORK >= M*NB, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
 [out] info: INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

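The constraints in the parameter list above determine which negative INFO value signals an illegal argument. The following is a minimal sketch in plain C (no MAGMA calls) that mirrors those documented constraints for the host interface. The names `side_t`, `SIDE_LEFT`, `SIDE_RIGHT`, and `check_unmqr_args` are hypothetical stand-ins for illustration, and the order in which the real routine tests its arguments is an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for magma_side_t; not the MAGMA type. */
typedef enum { SIDE_LEFT, SIDE_RIGHT } side_t;

static int max_int(int a, int b) { return a > b ? a : b; }

/* Returns 0 if the documented constraints hold, otherwise -i, where i is the
   1-based position of the offending argument in the magma_cunmqr signature:
   side=1, trans=2, m=3, n=4, k=5, A=6, lda=7, tau=8, C=9, ldc=10,
   work=11, lwork=12. */
int check_unmqr_args(side_t side, int m, int n, int k,
                     int lda, int ldc, int lwork)
{
    const bool left = (side == SIDE_LEFT);
    const int nq = left ? m : n;  /* order of Q */
    const int nw = left ? n : m;  /* minimum workspace length */

    if (m < 0)                                  return -3;
    if (n < 0)                                  return -4;
    if (k < 0 || k > nq)                        return -5;
    if (lda < max_int(1, nq))                   return -7;
    if (ldc < max_int(1, m))                    return -10;
    /* lwork == -1 is a workspace query and is never an error. */
    if (lwork != -1 && lwork < max_int(1, nw))  return -12;
    return 0;
}
```

For example, `check_unmqr_args(SIDE_LEFT, 3, 4, 4, 3, 3, 4)` returns -5, because K exceeds M when Q is applied from the left.
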
 magma_int_t magma_cunmqr2_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex * tau, magmaFloatComplex_ptr dC, magma_int_t lddc, const magmaFloatComplex * wA, magma_int_t ldwa, magma_int_t * info )

CUNMQR overwrites the general complex M-by-N matrix C with:

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = Magma_ConjTrans:   Q**H * C            C * Q**H


where Q is a complex unitary matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by CGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side: magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans: magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = Magma_ConjTrans: Conjugate transpose, apply Q**H.
 [in] m: INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n: INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k: INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in,out] dA: COMPLEX array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF in the first k columns of its array argument dA. The diagonal and the upper part are destroyed, the reflectors are not modified.
 [in] ldda: INTEGER. The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in] tau: COMPLEX array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF.
 [in,out] dC: COMPLEX array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by (Q*C) or (Q**H * C) or (C * Q**H) or (C*Q).
 [in] lddc: INTEGER. The leading dimension of the array dC. LDDC >= max(1,M).
 [in] wA: COMPLEX array, dimension (LDWA,M) if SIDE = MagmaLeft, (LDWA,N) if SIDE = MagmaRight. The vectors which define the elementary reflectors, as returned by CHETRD_GPU. (A copy of the upper or lower part of dA, on the host.)
 [in] ldwa: INTEGER. The leading dimension of the array wA. If SIDE = MagmaLeft, LDWA >= max(1,M); if SIDE = MagmaRight, LDWA >= max(1,N).
 [out] info: INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_cunmqr_2stage_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dC, magma_int_t lddc, magmaFloatComplex_ptr dT, magma_int_t nb, magma_int_t * info )

CUNMQR_GPU overwrites the general complex M-by-N matrix C with:

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = Magma_ConjTrans:   Q**H * C            C * Q**H


where Q is a complex unitary matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)


as returned by CGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side: magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans: magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = Magma_ConjTrans: Conjugate transpose, apply Q**H.
 [in] m: INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n: INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k: INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] dA: COMPLEX array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF in the first k columns of its array argument DA. DA is modified by the routine but restored on exit.
 [in] ldda: INTEGER. The leading dimension of the array DA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in,out] dC: COMPLEX array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q.
 [in] lddc: INTEGER. The leading dimension of the array DC. LDDC >= max(1,M).
 [in] dT: COMPLEX array on the GPU that is the output (the 9th argument) of magma_cgeqrf_gpu.
 [in] nb: INTEGER. This is the blocking size that was used in pre-computing DT, e.g., the blocking size used in magma_cgeqrf_gpu.
 [out] info: INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_cunmqr_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex_const_ptr dA, magma_int_t ldda, magmaFloatComplex const * tau, magmaFloatComplex_ptr dC, magma_int_t lddc, magmaFloatComplex * hwork, magma_int_t lwork, magmaFloatComplex_ptr dT, magma_int_t nb, magma_int_t * info )

CUNMQR_GPU overwrites the general complex M-by-N matrix C with:

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = Magma_ConjTrans:   Q**H * C            C * Q**H


where Q is a complex unitary matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by CGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side: magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans: magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = Magma_ConjTrans: Conjugate transpose, apply Q**H.
 [in] m: INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n: INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k: INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] dA: COMPLEX array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF in the first k columns of its array argument dA. dA is modified by the routine but restored on exit.
 [in] ldda: INTEGER. The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in] tau: COMPLEX array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF.
 [in,out] dC: COMPLEX array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by (Q*C) or (Q**H * C) or (C * Q**H) or (C*Q).
 [in] lddc: INTEGER. The leading dimension of the array DC. LDDC >= max(1,M).
 [out] hwork: (workspace) COMPLEX array, dimension (MAX(1,LWORK)). Currently, cgetrs_gpu assumes that on exit, hwork contains the last block of A and C. This will change and should not be relied on!
 [in] lwork: INTEGER. The dimension of the array HWORK. LWORK >= (M-K+NB)*(N+NB) + N*NB if SIDE = MagmaLeft, and LWORK >= (N-K+NB)*(M+NB) + M*NB if SIDE = MagmaRight, where NB is the given blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the HWORK array, returns this value as the first entry of the HWORK array, and no error message related to LWORK is issued by XERBLA.
 [in,out] dT: COMPLEX array on the GPU that is the output (the 9th argument) of magma_cgeqrf_gpu. Part used as workspace.
 [in] nb: INTEGER. This is the blocking size that was used in pre-computing DT, e.g., the blocking size used in magma_cgeqrf_gpu.
 [out] info: INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

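The lower bound on LWORK stated above can be evaluated before allocating hwork. The following is a minimal sketch in plain C; `unmqr_gpu_min_lwork` is a hypothetical helper, not a MAGMA function, and it only evaluates the documented formula (in array elements, not bytes). The workspace query with LWORK = -1 remains the way to obtain the size the routine itself reports.

```c
#include <stdint.h>

/* Documented lower bound on LWORK for the *_gpu interface.
   side_left != 0 corresponds to SIDE = MagmaLeft, 0 to SIDE = MagmaRight.
   nb is the block size that was used to compute dT. */
int64_t unmqr_gpu_min_lwork(int side_left, int64_t m, int64_t n,
                            int64_t k, int64_t nb)
{
    if (side_left)
        return (m - k + nb) * (n + nb) + n * nb;
    return (n - k + nb) * (m + nb) + m * nb;
}
```

With M = 1000, N = 50, K = 50, and NB = 32 applied from the left, the bound is (1000-50+32)*(50+32) + 50*32 = 82124 elements.
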
 magma_int_t magma_cunmqr_m ( magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloatComplex * A, magma_int_t lda, magmaFloatComplex * tau, magmaFloatComplex * C, magma_int_t ldc, magmaFloatComplex * work, magma_int_t lwork, magma_int_t * info )

CUNMQR overwrites the general complex M-by-N matrix C with:

                            SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:       Q * C               C * Q
TRANS = Magma_ConjTrans:    Q**H * C            C * Q**H


where Q is a complex unitary matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by CGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] ngpu: INTEGER. Number of GPUs to use. ngpu > 0.
 [in] side: magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans: magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = Magma_ConjTrans: Conjugate transpose, apply Q**H.
 [in] m: INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n: INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k: INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] A: COMPLEX array, dimension (LDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by CGEQRF in the first k columns of its array argument A.
 [in] lda: INTEGER. The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
 [in] tau: COMPLEX array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by CGEQRF.
 [in,out] C: COMPLEX array, dimension (LDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] ldc: INTEGER. The leading dimension of the array C. LDC >= max(1,M).
 [out] work: (workspace) COMPLEX array, dimension (MAX(1,LWORK)). On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
 [in] lwork: INTEGER. The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
 [out] info: INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_dormqr ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, double * A, magma_int_t lda, double * tau, double * C, magma_int_t ldc, double * work, magma_int_t lwork, magma_int_t * info )

DORMQR overwrites the general real M-by-N matrix C with:

                          SIDE = MagmaLeft   SIDE = MagmaRight
TRANS = MagmaNoTrans:     Q * C              C * Q
TRANS = MagmaTrans:       Q**H * C           C * Q**H


where Q is a real orthogonal matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)


as returned by DGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side: magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans: magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = MagmaTrans: Conjugate transpose, apply Q**H.
 [in] m: INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n: INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k: INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] A: DOUBLE PRECISION array, dimension (LDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF in the first k columns of its array argument A. A is modified by the routine but restored on exit.
 [in] lda: INTEGER. The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
 [in] tau: DOUBLE PRECISION array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF.
 [in,out] C: DOUBLE PRECISION array, dimension (LDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q.
 [in] ldc: INTEGER. The leading dimension of the array C. LDC >= max(1,M).
 [out] work: (workspace) DOUBLE PRECISION array, dimension (MAX(1,LWORK)). On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
 [in] lwork: INTEGER. The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance if SIDE = MagmaLeft, LWORK >= N*NB; if SIDE = MagmaRight, LWORK >= M*NB, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
 [out] info: INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

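To make the result for SIDE = MagmaLeft concrete, here is an unblocked CPU reference sketch in plain C. It assumes the LAPACK DGEQRF storage convention: H(i) = I - tau(i) * v * v**T, with v(1:i-1) = 0, v(i) = 1 (not stored), and v(i+1:m) held below the diagonal in column i of A. The function names are hypothetical, and indices are 0-based, so Q = H(0) H(1) ... H(k-1) here. MAGMA's own implementation is blocked and GPU-accelerated; this sketch only illustrates what ends up in C.

```c
#include <stddef.h>

/* C := H(i) * C for an m-by-n column-major matrix C.
   The reflector vector has an implicit 1 at row i and its remaining
   entries in A[i+1..m-1, i]. */
static void apply_reflector_left(int m, int n, int i,
                                 const double *A, int lda, double tau,
                                 double *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        double *c = C + (size_t)j * ldc;
        double w = c[i];                      /* w = v^T * c */
        for (int r = i + 1; r < m; ++r)
            w += A[r + (size_t)i * lda] * c[r];
        w *= tau;
        c[i] -= w;                            /* c -= tau * (v^T c) * v */
        for (int r = i + 1; r < m; ++r)
            c[r] -= A[r + (size_t)i * lda] * w;
    }
}

/* transpose == 0: C := Q * C;  transpose != 0: C := Q^T * C. */
void ref_ormqr_left(int transpose, int m, int n, int k,
                    const double *A, int lda, const double *tau,
                    double *C, int ldc)
{
    if (transpose) {
        /* Q^T C = H(k-1) ... H(0) C, so H(0) is applied first. */
        for (int i = 0; i < k; ++i)
            apply_reflector_left(m, n, i, A, lda, tau[i], C, ldc);
    } else {
        /* Q C = H(0) ... H(k-1) C, so H(k-1) is applied first. */
        for (int i = k - 1; i >= 0; --i)
            apply_reflector_left(m, n, i, A, lda, tau[i], C, ldc);
    }
}
```

Because Q is orthogonal when each tau(i) equals 2 / (v**T v), applying Q and then Q**T to the same C returns the original matrix up to rounding, which is a convenient sanity check.
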
 magma_int_t magma_dormqr2_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDouble_ptr dA, magma_int_t ldda, double * tau, magmaDouble_ptr dC, magma_int_t lddc, const double * wA, magma_int_t ldwa, magma_int_t * info )

DORMQR overwrites the general real M-by-N matrix C with:

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = MagmaTrans:        Q**H * C            C * Q**H


where Q is a real orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by DGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side: magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans: magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = MagmaTrans: Conjugate transpose, apply Q**H.
 [in] m: INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n: INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k: INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in,out] dA: DOUBLE PRECISION array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF in the first k columns of its array argument dA. The diagonal and the upper part are destroyed, the reflectors are not modified.
 [in] ldda: INTEGER. The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in] tau: DOUBLE PRECISION array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF.
 [in,out] dC: DOUBLE PRECISION array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by (Q*C) or (Q**H * C) or (C * Q**H) or (C*Q).
 [in] lddc: INTEGER. The leading dimension of the array dC. LDDC >= max(1,M).
 [in] wA: DOUBLE PRECISION array, dimension (LDWA,M) if SIDE = MagmaLeft, (LDWA,N) if SIDE = MagmaRight. The vectors which define the elementary reflectors, as returned by DSYTRD_GPU. (A copy of the upper or lower part of dA, on the host.)
 [in] ldwa: INTEGER. The leading dimension of the array wA. If SIDE = MagmaLeft, LDWA >= max(1,M); if SIDE = MagmaRight, LDWA >= max(1,N).
 [out] info: INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_dormqr_2stage_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dC, magma_int_t lddc, magmaDouble_ptr dT, magma_int_t nb, magma_int_t * info )

DORMQR_GPU overwrites the general real M-by-N matrix C with:

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = MagmaTrans:        Q**H * C            C * Q**H


where Q is a real orthogonal matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)


as returned by DGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side: magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans: magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = MagmaTrans: Conjugate transpose, apply Q**H.
 [in] m: INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n: INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k: INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] dA: DOUBLE PRECISION array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF in the first k columns of its array argument DA. DA is modified by the routine but restored on exit.
 [in] ldda: INTEGER. The leading dimension of the array DA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in,out] dC: DOUBLE PRECISION array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H * C or C * Q**H or C*Q.
 [in] lddc: INTEGER. The leading dimension of the array DC. LDDC >= max(1,M).
 [in] dT: DOUBLE PRECISION array on the GPU that is the output (the 9th argument) of magma_dgeqrf_gpu.
 [in] nb: INTEGER. This is the blocking size that was used in pre-computing DT, e.g., the blocking size used in magma_dgeqrf_gpu.
 [out] info: INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_dormqr_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDouble_const_ptr dA, magma_int_t ldda, double const * tau, magmaDouble_ptr dC, magma_int_t lddc, double * hwork, magma_int_t lwork, magmaDouble_ptr dT, magma_int_t nb, magma_int_t * info )

DORMQR_GPU overwrites the general real M-by-N matrix C with:

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = MagmaTrans:        Q**H * C            C * Q**H


where Q is a real orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by DGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side: magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans: magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = MagmaTrans: Conjugate transpose, apply Q**H.
 [in] m: INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n: INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k: INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] dA: DOUBLE PRECISION array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF in the first k columns of its array argument dA. dA is modified by the routine but restored on exit.
 [in] ldda: INTEGER. The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in] tau: DOUBLE PRECISION array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF.
 [in,out] dC: DOUBLE PRECISION array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by (Q*C) or (Q**H * C) or (C * Q**H) or (C*Q).
 [in] lddc: INTEGER. The leading dimension of the array DC. LDDC >= max(1,M).
 [out] hwork: (workspace) DOUBLE PRECISION array, dimension (MAX(1,LWORK)). Currently, dgetrs_gpu assumes that on exit, hwork contains the last block of A and C. This will change and should not be relied on!
 [in] lwork: INTEGER. The dimension of the array HWORK. LWORK >= (M-K+NB)*(N+NB) + N*NB if SIDE = MagmaLeft, and LWORK >= (N-K+NB)*(M+NB) + M*NB if SIDE = MagmaRight, where NB is the given blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the HWORK array, returns this value as the first entry of the HWORK array, and no error message related to LWORK is issued by XERBLA.
 [in,out] dT: DOUBLE PRECISION array on the GPU that is the output (the 9th argument) of magma_dgeqrf_gpu. Part used as workspace.
 [in] nb: INTEGER. This is the blocking size that was used in pre-computing DT, e.g., the blocking size used in magma_dgeqrf_gpu.
 [out] info: INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_dormqr_m ( magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, double * A, magma_int_t lda, double * tau, double * C, magma_int_t ldc, double * work, magma_int_t lwork, magma_int_t * info )

DORMQR overwrites the general real M-by-N matrix C with:

                            SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:       Q * C               C * Q
TRANS = MagmaTrans:         Q**H * C            C * Q**H


where Q is a real orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by DGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] ngpu: INTEGER. Number of GPUs to use. ngpu > 0.
 [in] side: magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans: magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = MagmaTrans: Conjugate transpose, apply Q**H.
 [in] m: INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n: INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k: INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] A: DOUBLE PRECISION array, dimension (LDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by DGEQRF in the first k columns of its array argument A.
 [in] lda: INTEGER. The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
 [in] tau: DOUBLE PRECISION array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by DGEQRF.
 [in,out] C: DOUBLE PRECISION array, dimension (LDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] ldc: INTEGER. The leading dimension of the array C. LDC >= max(1,M).
 [out] work: (workspace) DOUBLE PRECISION array, dimension (MAX(1,LWORK)). On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
 [in] lwork: INTEGER. The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
 [out] info: INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_sormqr ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, float * A, magma_int_t lda, float * tau, float * C, magma_int_t ldc, float * work, magma_int_t lwork, magma_int_t * info )

SORMQR overwrites the general real M-by-N matrix C with one of the following products:

                          SIDE = MagmaLeft   SIDE = MagmaRight
TRANS = MagmaNoTrans:     Q * C              C * Q
TRANS = MagmaTrans:       Q**H * C           C * Q**H


where Q is a real orthogonal matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)


as returned by SGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
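
To make the product Q = H(1) H(2) . . . H(k) concrete for the SIDE = MagmaLeft case, the following plain-C sketch applies the reflectors one at a time on the host. It assumes the usual LAPACK SGEQRF storage: reflector i has an implicit unit entry on the diagonal, its remaining entries sit below the diagonal in column i of A, and H(i) = I - tau(i) * v * v**T. The names ref_ormqr_left and apply_reflector_left are hypothetical and are not part of MAGMA, which uses a blocked GPU algorithm rather than this unblocked loop.

```c
#include <stddef.h>

/* Apply H = I - tau * v * v^T to the m-by-n column-major matrix C.
   v[0..i-1] = 0, v[i] = 1 (implicit), v[i+1..m-1] = a_col[i+1..m-1]. */
static void apply_reflector_left(int m, int n, int i, const float *a_col,
                                 float tau, float *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        float *c = C + (size_t)j * ldc;       /* column j of C */
        float dot = c[i];                     /* contribution of v[i] = 1 */
        for (int r = i + 1; r < m; ++r)
            dot += a_col[r] * c[r];           /* dot = v^T * c */
        float s = tau * dot;
        c[i] -= s;                            /* c = c - tau * (v^T c) * v */
        for (int r = i + 1; r < m; ++r)
            c[r] -= s * a_col[r];
    }
}

/* Overwrite C with Q*C (transpose == 0) or Q^T*C (transpose != 0).
   Q*C   = H(1)...H(k) C, so H(k) is applied first;
   Q^T*C = H(k)...H(1) C, so H(1) is applied first. */
void ref_ormqr_left(int transpose, int m, int n, int k,
                    const float *A, int lda, const float *tau,
                    float *C, int ldc)
{
    if (transpose) {
        for (int i = 0; i < k; ++i)
            apply_reflector_left(m, n, i, A + (size_t)i * lda, tau[i], C, ldc);
    } else {
        for (int i = k - 1; i >= 0; --i)
            apply_reflector_left(m, n, i, A + (size_t)i * lda, tau[i], C, ldc);
    }
}
```

Entries of A on and above the diagonal are never read by this sketch, which matches the fact that SGEQRF stores R there.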

Parameters
 [in] side magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = MagmaTrans: Conjugate transpose, apply Q**H.
 [in] m INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] A REAL array, dimension (LDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF in the first k columns of its array argument A. A is modified by the routine but restored on exit.
 [in] lda INTEGER. The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
 [in] tau REAL array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF.
 [in,out] C REAL array, dimension (LDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] ldc INTEGER. The leading dimension of the array C. LDC >= max(1,M).
 [out] work (workspace) REAL array, dimension (MAX(1,LWORK)). On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
 [in] lwork INTEGER. The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
 [out] info INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

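
The dimension constraints and the INFO = -i convention listed for magma_sormqr can be summarized as a small host-side check. This is only a sketch derived from the parameter descriptions; it is not MAGMA's own validation code, demo_side_t and demo_check_ormqr_args are hypothetical names, and the trans argument (position 2) is not checked here. The returned index i counts positions in the magma_sormqr argument list (side = 1, m = 3, n = 4, k = 5, lda = 7, ldc = 10, lwork = 12).

```c
/* Hypothetical stand-in for magma_side_t; not the MAGMA definition. */
typedef enum { SIDE_LEFT, SIDE_RIGHT } demo_side_t;

static int demo_imax(int a, int b) { return a > b ? a : b; }

/* Returns 0 if the documented constraints hold, otherwise -i where i is
   the 1-based position of the first offending argument. */
int demo_check_ormqr_args(int side, int m, int n, int k,
                          int lda, int ldc, int lwork)
{
    if (side != SIDE_LEFT && side != SIDE_RIGHT) return -1;
    int nq = (side == SIDE_LEFT) ? m : n;   /* order of Q */
    int nw = (side == SIDE_LEFT) ? n : m;   /* minimum workspace size */
    if (m < 0)                      return -3;
    if (n < 0)                      return -4;
    if (k < 0 || k > nq)            return -5;
    if (lda < demo_imax(1, nq))     return -7;
    if (ldc < demo_imax(1, m))      return -10;
    /* lwork == -1 is a workspace query and is always accepted. */
    if (lwork != -1 && lwork < demo_imax(1, nw)) return -12;
    return 0;
}
```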
 magma_int_t magma_sormqr2_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloat_ptr dA, magma_int_t ldda, float * tau, magmaFloat_ptr dC, magma_int_t lddc, const float * wA, magma_int_t ldwa, magma_int_t * info )

SORMQR overwrites the general real M-by-N matrix C with one of the following products:

                          SIDE = MagmaLeft   SIDE = MagmaRight
TRANS = MagmaNoTrans:     Q * C              C * Q
TRANS = MagmaTrans:       Q**H * C           C * Q**H


where Q is a real orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by SGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = MagmaTrans: Conjugate transpose, apply Q**H.
 [in] m INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in,out] dA REAL array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF in the first k columns of its array argument dA. The diagonal and the upper part are destroyed; the reflectors are not modified.
 [in] ldda INTEGER. The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in] tau REAL array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF.
 [in,out] dC REAL array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] lddc INTEGER. The leading dimension of the array dC. LDDC >= max(1,M).
 [in] wA REAL array, dimension (LDWA,M) if SIDE = MagmaLeft; (LDWA,N) if SIDE = MagmaRight. The vectors which define the elementary reflectors, as returned by SSYTRD_GPU. (A copy of the upper or lower part of dA, on the host.)
 [in] ldwa INTEGER. The leading dimension of the array wA. If SIDE = MagmaLeft, LDWA >= max(1,M); if SIDE = MagmaRight, LDWA >= max(1,N).
 [out] info INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_sormqr_2stage_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dC, magma_int_t lddc, magmaFloat_ptr dT, magma_int_t nb, magma_int_t * info )

SORMQR_GPU overwrites the general real M-by-N matrix C with one of the following products:

                          SIDE = MagmaLeft   SIDE = MagmaRight
TRANS = MagmaNoTrans:     Q * C              C * Q
TRANS = MagmaTrans:       Q**H * C           C * Q**H


where Q is a real orthogonal matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)


as returned by SGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = MagmaTrans: Conjugate transpose, apply Q**H.
 [in] m INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] dA REAL array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF in the first k columns of its array argument dA. dA is modified by the routine but restored on exit.
 [in] ldda INTEGER. The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in,out] dC REAL array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] lddc INTEGER. The leading dimension of the array dC. LDDC >= max(1,M).
 [in] dT REAL array on the GPU that is the output (the 9th argument) of magma_sgeqrf_gpu.
 [in] nb INTEGER. The blocking size that was used in pre-computing dT, e.g., the blocking size used in magma_sgeqrf_gpu.
 [out] info INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

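
The leading-dimension parameters (ldda, lddc) follow the column-major convention: element (i, j) of a matrix with leading dimension ld is stored at offset i + j*ld, which is why LDDC >= max(1,M) is required. A minimal host-side sketch of that indexing follows. Rounding the leading dimension up to a multiple of 32 is an assumption that reflects common GPU practice, not a requirement stated in this documentation, and demo_round_up and demo_colmajor_index are hypothetical helper names.

```c
#include <stddef.h>

/* Round x up to the next multiple of `multiple` (multiple > 0). */
int demo_round_up(int x, int multiple)
{
    return ((x + multiple - 1) / multiple) * multiple;
}

/* Offset of element (i, j), 0-based, in a column-major array whose
   leading dimension is ld (ld >= number of rows). */
size_t demo_colmajor_index(int i, int j, int ld)
{
    return (size_t)i + (size_t)j * (size_t)ld;
}
```

For example, a 100-row matrix could be allocated with a leading dimension of demo_round_up(100, 32) = 128, which still satisfies LDDC >= max(1,M).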
 magma_int_t magma_sormqr_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaFloat_const_ptr dA, magma_int_t ldda, float const * tau, magmaFloat_ptr dC, magma_int_t lddc, float * hwork, magma_int_t lwork, magmaFloat_ptr dT, magma_int_t nb, magma_int_t * info )

SORMQR_GPU overwrites the general real M-by-N matrix C with one of the following products:

                          SIDE = MagmaLeft   SIDE = MagmaRight
TRANS = MagmaNoTrans:     Q * C              C * Q
TRANS = MagmaTrans:       Q**H * C           C * Q**H


where Q is a real orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by SGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = MagmaTrans: Conjugate transpose, apply Q**H.
 [in] m INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] dA REAL array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF in the first k columns of its array argument dA. dA is modified by the routine but restored on exit.
 [in] ldda INTEGER. The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in] tau REAL array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF.
 [in,out] dC REAL array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] lddc INTEGER. The leading dimension of the array dC. LDDC >= max(1,M).
 [out] hwork (workspace) REAL array, dimension (MAX(1,LWORK)). Currently, sgetrs_gpu assumes that on exit, hwork contains the last block of A and C. This will change and should not be relied on!
 [in] lwork INTEGER. The dimension of the array HWORK. LWORK >= (M-K+NB)*(N+NB) + N*NB if SIDE = MagmaLeft, and LWORK >= (N-K+NB)*(M+NB) + M*NB if SIDE = MagmaRight, where NB is the given blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the HWORK array, returns this value as the first entry of the HWORK array, and no error message related to LWORK is issued by XERBLA.
 [in,out] dT REAL array on the GPU that is the output (the 9th argument) of magma_sgeqrf_gpu. Part used as workspace.
 [in] nb INTEGER. The blocking size that was used in pre-computing dT, e.g., the blocking size used in magma_sgeqrf_gpu.
 [out] info INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

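
The LWORK lower bound documented for magma_sormqr_gpu is easy to get wrong by hand, so a small helper that evaluates it may be useful. This sketch simply transcribes the two formulas from the lwork description; demo_ormqr_gpu_min_lwork is a hypothetical name, the left flag stands in for SIDE = MagmaLeft, and a workspace query with LWORK = -1 remains the way to obtain the size the library itself reports.

```c
/* Minimum HWORK length from the documented formulas:
     left:  (M-K+NB)*(N+NB) + N*NB
     right: (N-K+NB)*(M+NB) + M*NB
   long long is used so that large dimensions do not overflow int. */
long long demo_ormqr_gpu_min_lwork(int left, long long m, long long n,
                                   long long k, long long nb)
{
    return left ? (m - k + nb) * (n + nb) + n * nb
                : (n - k + nb) * (m + nb) + m * nb;
}
```

With M = 100, N = 50, K = 20, NB = 32 this gives 10784 elements for SIDE = MagmaLeft and 11384 for SIDE = MagmaRight.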
 magma_int_t magma_sormqr_m ( magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, float * A, magma_int_t lda, float * tau, float * C, magma_int_t ldc, float * work, magma_int_t lwork, magma_int_t * info )

SORMQR overwrites the general real M-by-N matrix C with one of the following products:

                          SIDE = MagmaLeft   SIDE = MagmaRight
TRANS = MagmaNoTrans:     Q * C              C * Q
TRANS = MagmaTrans:       Q**H * C           C * Q**H


where Q is a real orthogonal matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by SGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] ngpu INTEGER. Number of GPUs to use. ngpu > 0.
 [in] side magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = MagmaTrans: Conjugate transpose, apply Q**H.
 [in] m INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] A REAL array, dimension (LDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by SGEQRF in the first k columns of its array argument A.
 [in] lda INTEGER. The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
 [in] tau REAL array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by SGEQRF.
 [in,out] C REAL array, dimension (LDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] ldc INTEGER. The leading dimension of the array C. LDC >= max(1,M).
 [out] work (workspace) REAL array, dimension (MAX(1,LWORK)). On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
 [in] lwork INTEGER. The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
 [out] info INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_zunmqr ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex * A, magma_int_t lda, magmaDoubleComplex * tau, magmaDoubleComplex * C, magma_int_t ldc, magmaDoubleComplex * work, magma_int_t lwork, magma_int_t * info )

ZUNMQR overwrites the general complex M-by-N matrix C with one of the following products:

                          SIDE = MagmaLeft   SIDE = MagmaRight
TRANS = MagmaNoTrans:     Q * C              C * Q
TRANS = Magma_ConjTrans:  Q**H * C           C * Q**H


where Q is a complex unitary matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)


as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.
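
For the complex case the reflectors are H(i) = I - tau(i) * v * v**H, and applying Q**H means applying each H(i)**H = I - conj(tau(i)) * v * v**H in the opposite order. The host-side sketch below illustrates this for SIDE = MagmaLeft using C99 complex arithmetic. It assumes the standard ZGEQRF storage of v (implicit unit diagonal, remaining entries below the diagonal of column i); ref_unmqr_left and apply_zreflector_left are hypothetical names and do not reflect how MAGMA implements the routine.

```c
#include <complex.h>
#include <stddef.h>

/* Apply H = I - tau * v * v^H to the m-by-n column-major matrix C.
   v[0..i-1] = 0, v[i] = 1 (implicit), v[i+1..m-1] = a_col[i+1..m-1]. */
static void apply_zreflector_left(int m, int n, int i,
                                  const double complex *a_col,
                                  double complex tau,
                                  double complex *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        double complex *c = C + (size_t)j * ldc;
        double complex dot = c[i];               /* v[i] = 1 */
        for (int r = i + 1; r < m; ++r)
            dot += conj(a_col[r]) * c[r];        /* dot = v^H * c */
        double complex s = tau * dot;
        c[i] -= s;                               /* c = c - tau * (v^H c) * v */
        for (int r = i + 1; r < m; ++r)
            c[r] -= s * a_col[r];
    }
}

/* Overwrite C with Q*C (conj_trans == 0) or Q^H*C (conj_trans != 0).
   Q^H = H(k)^H ... H(1)^H, so H(1)^H is applied first, with conj(tau). */
void ref_unmqr_left(int conj_trans, int m, int n, int k,
                    const double complex *A, int lda,
                    const double complex *tau,
                    double complex *C, int ldc)
{
    if (conj_trans) {
        for (int i = 0; i < k; ++i)
            apply_zreflector_left(m, n, i, A + (size_t)i * lda,
                                  conj(tau[i]), C, ldc);
    } else {
        for (int i = k - 1; i >= 0; --i)
            apply_zreflector_left(m, n, i, A + (size_t)i * lda,
                                  tau[i], C, ldc);
    }
}
```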

Parameters
 [in] side magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = Magma_ConjTrans: Conjugate transpose, apply Q**H.
 [in] m INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] A COMPLEX_16 array, dimension (LDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument A. A is modified by the routine but restored on exit.
 [in] lda INTEGER. The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
 [in] tau COMPLEX_16 array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF.
 [in,out] C COMPLEX_16 array, dimension (LDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] ldc INTEGER. The leading dimension of the array C. LDC >= max(1,M).
 [out] work (workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)). On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
 [in] lwork INTEGER. The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
 [out] info INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_zunmqr2_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex * tau, magmaDoubleComplex_ptr dC, magma_int_t lddc, const magmaDoubleComplex * wA, magma_int_t ldwa, magma_int_t * info )

ZUNMQR overwrites the general complex M-by-N matrix C with one of the following products:

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = Magma_ConjTrans:   Q**H * C            C * Q**H


where Q is a complex unitary matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = Magma_ConjTrans: Conjugate transpose, apply Q**H.
 [in] m INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in,out] dA COMPLEX_16 array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument dA. The diagonal and the upper part are destroyed; the reflectors are not modified.
 [in] ldda INTEGER. The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in] tau COMPLEX_16 array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF.
 [in,out] dC COMPLEX_16 array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] lddc INTEGER. The leading dimension of the array dC. LDDC >= max(1,M).
 [in] wA COMPLEX_16 array, dimension (LDWA,M) if SIDE = MagmaLeft; (LDWA,N) if SIDE = MagmaRight. The vectors which define the elementary reflectors, as returned by ZHETRD_GPU. (A copy of the upper or lower part of dA, on the host.)
 [in] ldwa INTEGER. The leading dimension of the array wA. If SIDE = MagmaLeft, LDWA >= max(1,M); if SIDE = MagmaRight, LDWA >= max(1,N).
 [out] info INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_zunmqr_2stage_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dC, magma_int_t lddc, magmaDoubleComplex_ptr dT, magma_int_t nb, magma_int_t * info )

ZUNMQR_GPU overwrites the general complex M-by-N matrix C with one of the following products:

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = Magma_ConjTrans:   Q**H * C            C * Q**H


where Q is a complex unitary matrix defined as the product of k elementary reflectors

Q = H(1) H(2) . . . H(k)


as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = Magma_ConjTrans: Conjugate transpose, apply Q**H.
 [in] m INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] dA COMPLEX_16 array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument dA. dA is modified by the routine but restored on exit.
 [in] ldda INTEGER. The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in,out] dC COMPLEX_16 array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] lddc INTEGER. The leading dimension of the array dC. LDDC >= max(1,M).
 [in] dT COMPLEX_16 array on the GPU that is the output (the 9th argument) of magma_zgeqrf_gpu.
 [in] nb INTEGER. The blocking size that was used in pre-computing dT, e.g., the blocking size used in magma_zgeqrf_gpu.
 [out] info INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_zunmqr_gpu ( magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex_const_ptr dA, magma_int_t ldda, magmaDoubleComplex const * tau, magmaDoubleComplex_ptr dC, magma_int_t lddc, magmaDoubleComplex * hwork, magma_int_t lwork, magmaDoubleComplex_ptr dT, magma_int_t nb, magma_int_t * info )

ZUNMQR_GPU overwrites the general complex M-by-N matrix C with one of the following products:

                           SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:      Q * C               C * Q
TRANS = Magma_ConjTrans:   Q**H * C            C * Q**H


where Q is a complex unitary matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] side magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = Magma_ConjTrans: Conjugate transpose, apply Q**H.
 [in] m INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] dA COMPLEX_16 array on the GPU, dimension (LDDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument dA. dA is modified by the routine but restored on exit.
 [in] ldda INTEGER. The leading dimension of the array dA. If SIDE = MagmaLeft, LDDA >= max(1,M); if SIDE = MagmaRight, LDDA >= max(1,N).
 [in] tau COMPLEX_16 array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF.
 [in,out] dC COMPLEX_16 array on the GPU, dimension (LDDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] lddc INTEGER. The leading dimension of the array dC. LDDC >= max(1,M).
 [out] hwork (workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)). Currently, zgetrs_gpu assumes that on exit, hwork contains the last block of A and C. This will change and should not be relied on!
 [in] lwork INTEGER. The dimension of the array HWORK. LWORK >= (M-K+NB)*(N+NB) + N*NB if SIDE = MagmaLeft, and LWORK >= (N-K+NB)*(M+NB) + M*NB if SIDE = MagmaRight, where NB is the given blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the HWORK array, returns this value as the first entry of the HWORK array, and no error message related to LWORK is issued by XERBLA.
 [in,out] dT COMPLEX_16 array on the GPU that is the output (the 9th argument) of magma_zgeqrf_gpu. Part used as workspace.
 [in] nb INTEGER. The blocking size that was used in pre-computing dT, e.g., the blocking size used in magma_zgeqrf_gpu.
 [out] info INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.

 magma_int_t magma_zunmqr_m ( magma_int_t ngpu, magma_side_t side, magma_trans_t trans, magma_int_t m, magma_int_t n, magma_int_t k, magmaDoubleComplex * A, magma_int_t lda, magmaDoubleComplex * tau, magmaDoubleComplex * C, magma_int_t ldc, magmaDoubleComplex * work, magma_int_t lwork, magma_int_t * info )

ZUNMQR overwrites the general complex M-by-N matrix C with one of the following products:

                            SIDE = MagmaLeft    SIDE = MagmaRight
TRANS = MagmaNoTrans:       Q * C               C * Q
TRANS = Magma_ConjTrans:    Q**H * C            C * Q**H


where Q is a complex unitary matrix defined as the product of k elementary reflectors

  Q = H(1) H(2) . . . H(k)


as returned by ZGEQRF. Q is of order M if SIDE = MagmaLeft and of order N if SIDE = MagmaRight.

Parameters
 [in] ngpu INTEGER. Number of GPUs to use. ngpu > 0.
 [in] side magma_side_t. = MagmaLeft: apply Q or Q**H from the Left; = MagmaRight: apply Q or Q**H from the Right.
 [in] trans magma_trans_t. = MagmaNoTrans: No transpose, apply Q; = Magma_ConjTrans: Conjugate transpose, apply Q**H.
 [in] m INTEGER. The number of rows of the matrix C. M >= 0.
 [in] n INTEGER. The number of columns of the matrix C. N >= 0.
 [in] k INTEGER. The number of elementary reflectors whose product defines the matrix Q. If SIDE = MagmaLeft, M >= K >= 0; if SIDE = MagmaRight, N >= K >= 0.
 [in] A COMPLEX_16 array, dimension (LDA,K). The i-th column must contain the vector which defines the elementary reflector H(i), for i = 1,2,...,k, as returned by ZGEQRF in the first k columns of its array argument A.
 [in] lda INTEGER. The leading dimension of the array A. If SIDE = MagmaLeft, LDA >= max(1,M); if SIDE = MagmaRight, LDA >= max(1,N).
 [in] tau COMPLEX_16 array, dimension (K). TAU(i) must contain the scalar factor of the elementary reflector H(i), as returned by ZGEQRF.
 [in,out] C COMPLEX_16 array, dimension (LDC,N). On entry, the M-by-N matrix C. On exit, C is overwritten by Q*C or Q**H*C or C*Q**H or C*Q.
 [in] ldc INTEGER. The leading dimension of the array C. LDC >= max(1,M).
 [out] work (workspace) COMPLEX_16 array, dimension (MAX(1,LWORK)). On exit, if INFO = 0, WORK[0] returns the optimal LWORK.
 [in] lwork INTEGER. The dimension of the array WORK. If SIDE = MagmaLeft, LWORK >= max(1,N); if SIDE = MagmaRight, LWORK >= max(1,M). For optimum performance LWORK >= N*NB if SIDE = MagmaLeft, and LWORK >= M*NB if SIDE = MagmaRight, where NB is the optimal blocksize. If LWORK = -1, then a workspace query is assumed; the routine only calculates the optimal size of the WORK array, returns this value as the first entry of the WORK array, and no error message related to LWORK is issued by XERBLA.
 [out] info INTEGER. = 0: successful exit; < 0: if INFO = -i, the i-th argument had an illegal value.