MAGMA
2.3.0
Matrix Algebra for GPU and Multicore Architectures

Functions  
magma_int_t  magma_cgeqr2x2_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dtau, magmaFloatComplex_ptr dT, magmaFloatComplex_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info) 
CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t  magma_cgeqr2x3_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dtau, magmaFloatComplex_ptr dT, magmaFloatComplex_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info) 
CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t  magma_cgeqr2x_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dtau, magmaFloatComplex_ptr dT, magmaFloatComplex_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info) 
CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t  magma_dgeqr2x2_gpu (magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dtau, magmaDouble_ptr dT, magmaDouble_ptr ddA, magmaDouble_ptr dwork, magma_int_t *info) 
DGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
magma_int_t  magma_dgeqr2x3_gpu (magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dtau, magmaDouble_ptr dT, magmaDouble_ptr ddA, magmaDouble_ptr dwork, magma_int_t *info) 
DGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
magma_int_t  magma_dgeqr2x_gpu (magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dtau, magmaDouble_ptr dT, magmaDouble_ptr ddA, magmaDouble_ptr dwork, magma_int_t *info) 
DGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
magma_int_t  magma_sgeqr2x2_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dtau, magmaFloat_ptr dT, magmaFloat_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info) 
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
magma_int_t  magma_sgeqr2x3_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dtau, magmaFloat_ptr dT, magmaFloat_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info) 
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
magma_int_t  magma_sgeqr2x_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dtau, magmaFloat_ptr dT, magmaFloat_ptr ddA, magmaFloat_ptr dwork, magma_int_t *info) 
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
magma_int_t  magma_zgeqr2x2_gpu (magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dtau, magmaDoubleComplex_ptr dT, magmaDoubleComplex_ptr ddA, magmaDouble_ptr dwork, magma_int_t *info) 
ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t  magma_zgeqr2x3_gpu (magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dtau, magmaDoubleComplex_ptr dT, magmaDoubleComplex_ptr ddA, magmaDouble_ptr dwork, magma_int_t *info) 
ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t  magma_zgeqr2x_gpu (magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dtau, magmaDoubleComplex_ptr dT, magmaDoubleComplex_ptr ddA, magmaDouble_ptr dwork, magma_int_t *info) 
ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t  magma_cgeqr2_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dtau, magmaFloat_ptr dwork, magma_queue_t queue, magma_int_t *info) 
CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R using the nonblocking Householder QR.
magma_int_t  magma_cgeqr2x4_gpu (magma_int_t m, magma_int_t n, magmaFloatComplex_ptr dA, magma_int_t ldda, magmaFloatComplex_ptr dtau, magmaFloatComplex_ptr dT, magmaFloatComplex_ptr ddA, magmaFloat_ptr dwork, magma_queue_t queue, magma_int_t *info) 
CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t  magma_dgeqr2_gpu (magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dtau, magmaDouble_ptr dwork, magma_queue_t queue, magma_int_t *info) 
DGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R using the nonblocking Householder QR.
magma_int_t  magma_dgeqr2x4_gpu (magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dtau, magmaDouble_ptr dT, magmaDouble_ptr ddA, magmaDouble_ptr dwork, magma_queue_t queue, magma_int_t *info) 
DGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
magma_int_t  magma_sgeqr2_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dtau, magmaFloat_ptr dwork, magma_queue_t queue, magma_int_t *info) 
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R using the nonblocking Householder QR.
magma_int_t  magma_sgeqr2x4_gpu (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dtau, magmaFloat_ptr dT, magmaFloat_ptr ddA, magmaFloat_ptr dwork, magma_queue_t queue, magma_int_t *info) 
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
magma_int_t  magma_zgeqr2_gpu (magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dtau, magmaDouble_ptr dwork, magma_queue_t queue, magma_int_t *info) 
ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R using the nonblocking Householder QR.
magma_int_t  magma_zgeqr2x4_gpu (magma_int_t m, magma_int_t n, magmaDoubleComplex_ptr dA, magma_int_t ldda, magmaDoubleComplex_ptr dtau, magmaDoubleComplex_ptr dT, magmaDoubleComplex_ptr ddA, magmaDouble_ptr dwork, magma_queue_t queue, magma_int_t *info) 
ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
magma_int_t magma_cgeqr2x2_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaFloatComplex_ptr  dA,  
magma_int_t  ldda,  
magmaFloatComplex_ptr  dtau,  
magmaFloatComplex_ptr  dT,  
magmaFloatComplex_ptr  ddA,  
magmaFloat_ptr  dwork,  
magma_int_t *  info  
) 
CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard cgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's cgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  COMPLEX array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array dtau, represent the unitary matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M). 
[out]  dtau  COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  COMPLEX array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  COMPLEX array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) REAL array, dimension (3 N)  
[out]  info  INTEGER
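
The array dimensions listed above can be turned into element counts before allocating device memory. The following is a minimal sketch under stated assumptions: the struct `geqr2x2_sizes` and the function `geqr2x2_buffer_sizes` are hypothetical helpers, not part of MAGMA, and the counts are in elements of each array's own type (COMPLEX for dA, dtau, dT and ddA; REAL for dwork), not in bytes.

```c
#include <stddef.h>

/* Hypothetical helper type: element counts for each array argument. */
typedef struct {
    size_t dA;    /* LDDA * N elements   */
    size_t dtau;  /* min(M, N) elements  */
    size_t dT;    /* N * N elements      */
    size_t ddA;   /* N * N elements      */
    size_t dwork; /* 3 * N real elements */
} geqr2x2_sizes;

/* Hypothetical helper: derive element counts from m, n and ldda,
   following the dimensions given in the parameter list. */
geqr2x2_sizes geqr2x2_buffer_sizes(size_t m, size_t n, size_t ldda)
{
    geqr2x2_sizes s;
    s.dA    = ldda * n;
    s.dtau  = (m < n) ? m : n;
    s.dT    = n * n;
    s.ddA   = n * n;
    s.dwork = 3 * n;
    return s;
}
```

For m = 100, n = 8 and ldda = 128, this gives 1024 elements for dA, 8 for dtau, 64 each for dT and ddA, and 24 for dwork.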

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
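
To illustrate the reflector storage described above, here is a minimal host-side sketch that applies a single H(i) = I - tau * v * v' to a vector. It is written for the real double-precision case for brevity (the complex case would use the conjugate transpose of v), it uses 0-based indices and column-major storage, and the function name `apply_reflector` is hypothetical rather than a MAGMA routine.

```c
#include <stddef.h>

/* Hypothetical helper: apply H = I - tau * v * v' to x (length m) in place.
   v is read from column i of the column-major array A (leading dimension lda):
   v(0:i-1) = 0 and v(i) = 1 are implicit, v(i+1:m-1) is stored in A(i+1:m-1, i). */
void apply_reflector(int m, int i, const double *A, int lda,
                     double tau, double *x)
{
    /* dot = v' * x; rows above i contribute nothing, row i has v(i) = 1 */
    double dot = x[i];
    for (int r = i + 1; r < m; ++r)
        dot += A[r + (size_t)i * lda] * x[r];

    /* x = x - tau * v * (v' * x) */
    double s = tau * dot;
    x[i] -= s;
    for (int r = i + 1; r < m; ++r)
        x[r] -= s * A[r + (size_t)i * lda];
}
```

As a small check, take x = (3, 4) with tau = 1.6 and the stored entry v(1) = 0.5: then v' * x = 5 and the result is (-5, 0), so the reflector zeroes the entry below the diagonal as expected.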
magma_int_t magma_cgeqr2x3_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaFloatComplex_ptr  dA,  
magma_int_t  ldda,  
magmaFloatComplex_ptr  dtau,  
magmaFloatComplex_ptr  dT,  
magmaFloatComplex_ptr  ddA,  
magmaFloat_ptr  dwork,  
magma_int_t *  info  
) 
CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard cgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's cgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
This version adds internal blocking.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  COMPLEX array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array dtau, represent the unitary matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M). 
[out]  dtau  COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  COMPLEX array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  COMPLEX array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) REAL array, dimension (3 N)  
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
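
The dT argument is described as the triangular factor T of the block reflector. As a sketch of what such a factor looks like, the code below builds an upper triangular T satisfying H(1) H(2) ... H(k) = I - V * T * V' using the standard forward, column-wise recurrence. The assumptions are: real double precision, column-major storage, V holding the full reflector vectors (explicit zeros above the diagonal and ones on it), and a hypothetical function name `build_T`. Whether MAGMA's dT uses exactly this convention is not confirmed by the text above.

```c
#include <stddef.h>

/* Hypothetical helper: form the k x k upper triangular T such that
   H(0) H(1) ... H(k-1) = I - V * T * V', with H(j) = I - tau[j] * v_j * v_j'.
   V is m x k (leading dimension ldv), T is k x k (leading dimension ldt). */
void build_T(int m, int k, const double *V, int ldv,
             const double *tau, double *T, int ldt)
{
    for (int j = 0; j < k; ++j) {
        /* T(0:j-1, j) = -tau[j] * V(:, 0:j-1)' * v_j */
        for (int p = 0; p < j; ++p) {
            double dot = 0.0;
            for (int r = 0; r < m; ++r)
                dot += V[r + (size_t)p * ldv] * V[r + (size_t)j * ldv];
            T[p + (size_t)j * ldt] = -tau[j] * dot;
        }
        /* T(0:j-1, j) = T(0:j-1, 0:j-1) * T(0:j-1, j), done in place;
           entry p only reads entries q >= p, which are not yet overwritten */
        for (int p = 0; p < j; ++p) {
            double s = 0.0;
            for (int q = p; q < j; ++q)
                s += T[p + (size_t)q * ldt] * T[q + (size_t)j * ldt];
            T[p + (size_t)j * ldt] = s;
        }
        T[j + (size_t)j * ldt] = tau[j];
        /* the strictly lower triangular part is set to 0 */
        for (int p = j + 1; p < k; ++p)
            T[p + (size_t)j * ldt] = 0.0;
    }
}
```

For two reflectors this reduces to T = [tau1, -tau1 * tau2 * (v1' * v2); 0, tau2], which follows from expanding the product (I - tau1 * v1 * v1') (I - tau2 * v2 * v2').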
magma_int_t magma_cgeqr2x_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaFloatComplex_ptr  dA,  
magma_int_t  ldda,  
magmaFloatComplex_ptr  dtau,  
magmaFloatComplex_ptr  dT,  
magmaFloatComplex_ptr  ddA,  
magmaFloat_ptr  dwork,  
magma_int_t *  info  
) 
CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard cgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's cgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R.
This version implements the right-looking QR. N is subject to a hard-coded limit: N <= min(M, 128). For larger N, use a blocking QR version.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. 0 <= N <= min(M, 128). 
[in,out]  dA  COMPLEX array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array dtau, represent the unitary matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M). 
[out]  dtau  COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  COMPLEX array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  COMPLEX array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) COMPLEX array, dimension (N)  
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
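
Because this variant restricts N to min(M, 128), a caller may want to check the dimensions before invoking it. The following is a minimal sketch of such a check; the function name `geqr2x_dims_ok` is hypothetical and only encodes the constraints stated in the parameter list (M >= 0, 0 <= N <= min(M, 128), LDDA >= max(1, M)). It does not reproduce the routine's own error reporting through info.

```c
#include <stdbool.h>

/* Hypothetical pre-check of the documented dimension constraints. */
bool geqr2x_dims_ok(long m, long n, long ldda)
{
    long n_max   = (m < 128) ? m : 128; /* min(M, 128) */
    long ldd_min = (m > 1) ? m : 1;     /* max(1, M)   */
    return m >= 0 && n >= 0 && n <= n_max && ldda >= ldd_min;
}
```

A 200 x 128 panel with ldda = 200 passes, while a 200 x 129 panel does not and would call for a blocking QR routine instead.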
magma_int_t magma_dgeqr2x2_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaDouble_ptr  dA,  
magma_int_t  ldda,  
magmaDouble_ptr  dtau,  
magmaDouble_ptr  dT,  
magmaDouble_ptr  ddA,  
magmaDouble_ptr  dwork,  
magma_int_t *  info  
) 
DGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard dgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's dgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  DOUBLE PRECISION array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array dtau, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M). 
[out]  dtau  DOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  DOUBLE PRECISION array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  DOUBLE PRECISION array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) DOUBLE PRECISION array, dimension (3 N)  
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_dgeqr2x3_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaDouble_ptr  dA,  
magma_int_t  ldda,  
magmaDouble_ptr  dtau,  
magmaDouble_ptr  dT,  
magmaDouble_ptr  ddA,  
magmaDouble_ptr  dwork,  
magma_int_t *  info  
) 
DGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard dgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's dgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
This version adds internal blocking.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  DOUBLE PRECISION array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array dtau, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M). 
[out]  dtau  DOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  DOUBLE PRECISION array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  DOUBLE PRECISION array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) DOUBLE PRECISION array, dimension (3 N)  
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_dgeqr2x_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaDouble_ptr  dA,  
magma_int_t  ldda,  
magmaDouble_ptr  dtau,  
magmaDouble_ptr  dT,  
magmaDouble_ptr  ddA,  
magmaDouble_ptr  dwork,  
magma_int_t *  info  
) 
DGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard dgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's dgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R.
This version implements the right-looking QR. N is subject to a hard-coded limit: N <= min(M, 128). For larger N, use a blocking QR version.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. 0 <= N <= min(M, 128). 
[in,out]  dA  DOUBLE PRECISION array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array dtau, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M). 
[out]  dtau  DOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  DOUBLE PRECISION array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  DOUBLE PRECISION array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) DOUBLE PRECISION array, dimension (N)  
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqr2x2_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaFloat_ptr  dA,  
magma_int_t  ldda,  
magmaFloat_ptr  dtau,  
magmaFloat_ptr  dT,  
magmaFloat_ptr  ddA,  
magmaFloat_ptr  dwork,  
magma_int_t *  info  
) 
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard sgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's sgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  REAL array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array dtau, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M). 
[out]  dtau  REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  REAL array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  REAL array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) REAL array, dimension (3 N)  
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqr2x3_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaFloat_ptr  dA,  
magma_int_t  ldda,  
magmaFloat_ptr  dtau,  
magmaFloat_ptr  dT,  
magmaFloat_ptr  ddA,  
magmaFloat_ptr  dwork,  
magma_int_t *  info  
) 
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard sgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's sgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
This version adds internal blocking.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  REAL array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array dtau, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M). 
[out]  dtau  REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  REAL array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  REAL array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) REAL array, dimension (3 N)  
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqr2x_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaFloat_ptr  dA,  
magma_int_t  ldda,  
magmaFloat_ptr  dtau,  
magmaFloat_ptr  dT,  
magmaFloat_ptr  ddA,  
magmaFloat_ptr  dwork,  
magma_int_t *  info  
) 
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard sgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's sgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R.
This version implements the right-looking QR. N is subject to a hard-coded limit: N <= min(M, 128). For larger N, use a blocking QR version.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. 0 <= N <= min(M, 128). 
[in,out]  dA  REAL array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array dtau, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M). 
[out]  dtau  REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  REAL array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  REAL array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) REAL array, dimension (N)  
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqr2x2_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaDoubleComplex_ptr  dA,  
magma_int_t  ldda,  
magmaDoubleComplex_ptr  dtau,  
magmaDoubleComplex_ptr  dT,  
magmaDoubleComplex_ptr  ddA,  
magmaDouble_ptr  dwork,  
magma_int_t *  info  
) 
ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard zgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's zgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  COMPLEX_16 array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array dtau, represent the unitary matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M). 
[out]  dtau  COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  COMPLEX_16 array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  COMPLEX_16 array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) DOUBLE PRECISION array, dimension (3 N)  
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqr2x3_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaDoubleComplex_ptr  dA,  
magma_int_t  ldda,  
magmaDoubleComplex_ptr  dtau,  
magmaDoubleComplex_ptr  dT,  
magmaDoubleComplex_ptr  ddA,  
magmaDouble_ptr  dwork,  
magma_int_t *  info  
) 
ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard zgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's zgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
This version adds internal blocking.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  COMPLEX_16 array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array dtau, represent the unitary matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array dA. LDDA >= max(1,M). 
[out]  dtau  COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  COMPLEX_16 array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  COMPLEX_16 array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) DOUBLE PRECISION array, dimension (3 N)  
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqr2x_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaDoubleComplex_ptr  dA,  
magma_int_t  ldda,  
magmaDoubleComplex_ptr  dtau,  
magmaDoubleComplex_ptr  dT,  
magmaDoubleComplex_ptr  ddA,  
magmaDouble_ptr  dwork,  
magma_int_t *  info  
) 
ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
This expert routine requires two more arguments than the standard zgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's zgeqr2 routine (see below).
The first, dT, is used to output the triangular n x n factor T of the block reflector used in the factorization. The second, ddA, holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R.
This version implements the right-looking QR. N is subject to a hard-coded limit: N <= min(M, 128). For larger N, use a blocking QR version.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. 0 <= N <= min(M, 128). 
[in,out]  dA  COMPLEX_16 array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array A. LDDA >= max(1,M). 
[out]  dtau  COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  COMPLEX_16 array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  COMPLEX_16 array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) COMPLEX_16 array, dimension (N)  
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v'
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
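The size constraints documented for magma_zgeqr2x_gpu (M >= 0, 0 <= N <= min(M, 128), LDDA >= max(1,M)) can be checked on the host before the call. The following is a minimal sketch in plain C. The helper name geqr2x_check_args is hypothetical and not part of MAGMA, and the negative return codes assume the LAPACK convention of -i for an illegal i-th argument, which this page does not spell out for info.

```c
#include <assert.h>

/* Hypothetical helper, not part of MAGMA: validates the sizes documented
 * for magma_zgeqr2x_gpu. Returns 0 if the sizes are acceptable, or -i if
 * the i-th argument of the routine would be illegal (assumed LAPACK-style
 * numbering: m is argument 1, n is 2, ldda is 4). */
int geqr2x_check_args(int m, int n, int ldda)
{
    int min_m_128 = (m < 128) ? m : 128;
    int max_1_m   = (m > 1) ? m : 1;

    if (m < 0)                  return -1;  /* M >= 0 */
    if (n < 0 || n > min_m_128) return -2;  /* 0 <= N <= min(M, 128) */
    if (ldda < max_1_m)         return -4;  /* LDDA >= max(1, M) */
    return 0;
}
```

For example, geqr2x_check_args(200, 129, 200) rejects n because it exceeds the hard-coded limit of 128 columns.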
magma_int_t magma_cgeqr2_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaFloatComplex_ptr  dA,  
magma_int_t  ldda,  
magmaFloatComplex_ptr  dtau,  
magmaFloat_ptr  dwork,  
magma_queue_t  queue,  
magma_int_t *  info  
) 
CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R using the non-blocked Householder QR.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  COMPLEX array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array A. LDDA >= max(1,M). 
[out]  dtau  COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
dwork  (workspace) REAL array, dimension (N)  
[in]  queue  magma_queue_t Queue to execute in. 
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v**H
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
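To make the reflector formula concrete, the sketch below applies a single H = I - tau * v * v**H to a vector on the host, using C99 complex arithmetic. It is an illustration only: the function name apply_reflector is hypothetical, v is passed in full (leading zeros and the unit entry stored explicitly) rather than in the packed form, and nothing here reflects how MAGMA applies reflectors on the GPU.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Applies H = I - tau * v * v**H to the vector x in place:
 *   x := x - tau * v * (v**H * x).
 * v and x both have length m; v holds its zeros and unit entry explicitly. */
void apply_reflector(size_t m, float complex tau,
                     const float complex *v, float complex *x)
{
    assert(m == 0 || (v != NULL && x != NULL));

    float complex dot = 0.0f;
    for (size_t j = 0; j < m; ++j)
        dot += conjf(v[j]) * x[j];      /* v**H * x (conjugate of v) */

    for (size_t j = 0; j < m; ++j)
        x[j] -= tau * dot * v[j];
}
```

With v = (1, 0.5), tau = 1.6 and x = (3, 4), the result is approximately (-5, 0): the reflector zeroes the entry below the diagonal, which is the job each H(i) does during the factorization.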
magma_int_t magma_cgeqr2x4_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaFloatComplex_ptr  dA,  
magma_int_t  ldda,  
magmaFloatComplex_ptr  dtau,  
magmaFloatComplex_ptr  dT,  
magmaFloatComplex_ptr  ddA,  
magmaFloat_ptr  dwork,  
magma_queue_t  queue,  
magma_int_t *  info  
) 
CGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
This expert routine takes two more arguments than the standard cgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's cgeqr2 routine (see below).
dT outputs the triangular n x n factor T of the block reflector used in the factorization. ddA holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
This version adds internal blocking.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  COMPLEX array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array A. LDDA >= max(1,M). 
[out]  dtau  COMPLEX array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  COMPLEX array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  COMPLEX array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) REAL array, dimension (3 N)  
[in]  queue  magma_queue_t Queue to execute in. 
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v**H
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_dgeqr2_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaDouble_ptr  dA,  
magma_int_t  ldda,  
magmaDouble_ptr  dtau,  
magmaDouble_ptr  dwork,  
magma_queue_t  queue,  
magma_int_t *  info  
) 
DGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R using the non-blocked Householder QR.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  DOUBLE PRECISION array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array A. LDDA >= max(1,M). 
[out]  dtau  DOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
dwork  (workspace) DOUBLE PRECISION array, dimension (N)  
[in]  queue  magma_queue_t Queue to execute in. 
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v**T
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
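The packed representation described above (implicit v(i) = 1, the rest of v below the diagonal of column i, and tau in a separate array) is enough to apply Q without ever forming it. The sketch below computes Q**T * b on the host, in the spirit of LAPACK's dorm2r. It assumes the factored array and tau have already been copied from the device into host memory; the function name apply_qt is hypothetical, and indices are 0-based.

```c
#include <assert.h>
#include <stddef.h>

/* Overwrites b (length m) with Q**T * b, where Q = H(1) H(2) ... H(k) is
 * given in packed form: a is the factored m-by-n column-major array with
 * leading dimension lda, and tau has k = min(m,n) entries. With 0-based
 * indices, reflector i has an implicit 1 in row i and its remaining
 * entries in rows i+1 .. m-1 of column i. */
void apply_qt(size_t m, size_t n, const double *a, size_t lda,
              const double *tau, double *b)
{
    assert(lda >= (m > 1 ? m : 1));
    size_t k = (m < n) ? m : n;

    /* Q**T * b = H(k) ... H(1) * b, so H(1) is applied first. */
    for (size_t i = 0; i < k; ++i) {
        const double *col = a + i * lda;

        double dot = b[i];                /* implicit v(i) = 1 */
        for (size_t r = i + 1; r < m; ++r)
            dot += col[r] * b[r];         /* v**T * b */

        double s = tau[i] * dot;
        b[i] -= s;
        for (size_t r = i + 1; r < m; ++r)
            b[r] -= s * col[r];
    }
}
```

For the 2 by 1 matrix A = (3, 4)**T, a factorization with R = -5, v(2) = 0.5 and tau = 1.6 is consistent with the formulas above, and applying Q**T to b = (3, 4)**T then gives approximately (-5, 0)**T, i.e. the first column of R padded with zeros.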
magma_int_t magma_dgeqr2x4_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaDouble_ptr  dA,  
magma_int_t  ldda,  
magmaDouble_ptr  dtau,  
magmaDouble_ptr  dT,  
magmaDouble_ptr  ddA,  
magmaDouble_ptr  dwork,  
magma_queue_t  queue,  
magma_int_t *  info  
) 
DGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine takes two more arguments than the standard dgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's dgeqr2 routine (see below).
dT outputs the triangular n x n factor T of the block reflector used in the factorization. ddA holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
This version adds internal blocking.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  DOUBLE PRECISION array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array A. LDDA >= max(1,M). 
[out]  dtau  DOUBLE PRECISION array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  DOUBLE PRECISION array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  DOUBLE PRECISION array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) DOUBLE PRECISION array, dimension (3 N)  
[in]  queue  magma_queue_t Queue to execute in. 
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v**T
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
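The dT output holds the triangular factor T of the block reflector, i.e. the matrix for which H(1) H(2) ... H(k) = I - V * T * V**T, where the columns of V are the vectors v. As a host-side illustration of what that factor is, the sketch below builds T from V and tau with the forward, column-wise recurrence used by LAPACK's dlarft. The function name build_t is hypothetical, V is stored densely with its zeros and unit diagonal written out, and this is not a description of how the MAGMA kernels compute dT.

```c
#include <assert.h>
#include <stddef.h>

/* Builds the k-by-k upper triangular factor T such that
 *   H(1) H(2) ... H(k) = I - V * T * V**T,
 * using the recurrence (0-based indices)
 *   T(i,i)     = tau(i)
 *   T(0:i-1,i) = -tau(i) * T(0:i-1,0:i-1) * V(:,0:i-1)**T * V(:,i).
 * v is m-by-k column-major (leading dimension ldv) with zeros and unit
 * diagonal stored explicitly; t is k-by-k column-major (leading dimension
 * ldt). The strictly lower triangle of t is set to 0. */
void build_t(size_t m, size_t k, const double *v, size_t ldv,
             const double *tau, double *t, size_t ldt)
{
    assert(ldv >= m && ldt >= k);

    for (size_t i = 0; i < k; ++i) {
        /* w(j) = -tau(i) * V(:,j)**T * V(:,i), kept in T(j,i) for j < i */
        for (size_t j = 0; j < i; ++j) {
            double dot = 0.0;
            for (size_t r = 0; r < m; ++r)
                dot += v[r + j * ldv] * v[r + i * ldv];
            t[j + i * ldt] = -tau[i] * dot;
        }
        /* T(0:i-1,i) := T(0:i-1,0:i-1) * w, in place. Row j of the upper
         * triangular product reads only w(j..i-1), so a top-down sweep
         * never reads an entry that was already overwritten. */
        for (size_t j = 0; j < i; ++j) {
            double sum = 0.0;
            for (size_t c = j; c < i; ++c)
                sum += t[j + c * ldt] * t[c + i * ldt];
            t[j + i * ldt] = sum;
        }
        t[i + i * ldt] = tau[i];
        for (size_t j = i + 1; j < k; ++j)
            t[j + i * ldt] = 0.0;          /* below the diagonal */
    }
}
```

The zero lower triangle matches the dT description ("The lower triangular part is 0").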
magma_int_t magma_sgeqr2_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaFloat_ptr  dA,  
magma_int_t  ldda,  
magmaFloat_ptr  dtau,  
magmaFloat_ptr  dwork,  
magma_queue_t  queue,  
magma_int_t *  info  
) 
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R using the non-blocked Householder QR.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  REAL array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array A. LDDA >= max(1,M). 
[out]  dtau  REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
dwork  (workspace) REAL array, dimension (N)  
[in]  queue  magma_queue_t Queue to execute in. 
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v**T
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_sgeqr2x4_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaFloat_ptr  dA,  
magma_int_t  ldda,  
magmaFloat_ptr  dtau,  
magmaFloat_ptr  dT,  
magmaFloat_ptr  ddA,  
magmaFloat_ptr  dwork,  
magma_queue_t  queue,  
magma_int_t *  info  
) 
SGEQR2 computes a QR factorization of a real m by n matrix A: A = Q * R.
This expert routine takes two more arguments than the standard sgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's sgeqr2 routine (see below).
dT outputs the triangular n x n factor T of the block reflector used in the factorization. ddA holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
This version adds internal blocking.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  REAL array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the orthogonal matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array A. LDDA >= max(1,M). 
[out]  dtau  REAL array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  REAL array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  REAL array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) REAL array, dimension (3 N)  
[in]  queue  magma_queue_t Queue to execute in. 
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v**T
where tau is a real scalar, and v is a real vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
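The ddA argument is described as an N x N array holding the upper diagonal block, with zeros below the diagonal, where LAPACK would keep those entries inside A itself. The sketch below shows that layout on host arrays: it copies the upper triangle of the leading n by n block of a column-major array into a separate n by n array. The function name copy_upper_block is hypothetical, and the code illustrates the storage convention only, not the device-side implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Copies the upper triangle of the leading n-by-n block of the
 * column-major array a (leading dimension lda) into the n-by-n
 * column-major array r (leading dimension n), writing 0 below the
 * diagonal. */
void copy_upper_block(size_t n, const float *a, size_t lda, float *r)
{
    assert(lda >= n);
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < n; ++i)
            r[i + j * n] = (i <= j) ? a[i + j * lda] : 0.0f;
}
```

For a 3 by 2 array with columns (1, 2, 3) and (4, 5, 6), the 2 by 2 result has columns (1, 0) and (4, 5).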
magma_int_t magma_zgeqr2_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaDoubleComplex_ptr  dA,  
magma_int_t  ldda,  
magmaDoubleComplex_ptr  dtau,  
magmaDouble_ptr  dwork,  
magma_queue_t  queue,  
magma_int_t *  info  
) 
ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R using the non-blocked Householder QR.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  COMPLEX*16 array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array A. LDDA >= max(1,M). 
[out]  dtau  COMPLEX*16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
dwork  (workspace) DOUBLE PRECISION array, dimension (N)  
[in]  queue  magma_queue_t Queue to execute in. 
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v**H
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).
magma_int_t magma_zgeqr2x4_gpu  (  magma_int_t  m, 
magma_int_t  n,  
magmaDoubleComplex_ptr  dA,  
magma_int_t  ldda,  
magmaDoubleComplex_ptr  dtau,  
magmaDoubleComplex_ptr  dT,  
magmaDoubleComplex_ptr  ddA,  
magmaDouble_ptr  dwork,  
magma_queue_t  queue,  
magma_int_t *  info  
) 
ZGEQR2 computes a QR factorization of a complex m by n matrix A: A = Q * R.
This expert routine takes two more arguments than the standard zgeqr2, namely dT and ddA, explained below. The storage for A also differs from that of LAPACK's zgeqr2 routine (see below).
dT outputs the triangular n x n factor T of the block reflector used in the factorization. ddA holds the diagonal n x n blocks of A, i.e., the diagonal submatrices of R. This routine implements the left-looking QR.
This version adds internal blocking.
[in]  m  INTEGER The number of rows of the matrix A. M >= 0. 
[in]  n  INTEGER The number of columns of the matrix A. N >= 0. 
[in,out]  dA  COMPLEX_16 array, dimension (LDDA,N) On entry, the m by n matrix A. On exit, the elements on and above the diagonal of the array contain the min(m,n) by n upper trapezoidal matrix R (R is upper triangular if m >= n); the elements below the diagonal, with the array TAU, represent the unitary matrix Q as a product of elementary reflectors (see Further Details). 
[in]  ldda  INTEGER The leading dimension of the array A. LDDA >= max(1,M). 
[out]  dtau  COMPLEX_16 array, dimension (min(M,N)) The scalar factors of the elementary reflectors (see Further Details). 
[out]  dT  COMPLEX_16 array, dimension N x N. Stores the triangular N x N factor T of the block reflector used in the factorization. The lower triangular part is 0. 
[out]  ddA  COMPLEX_16 array, dimension N x N. Stores the elements of the upper N x N diagonal block of A. LAPACK stores this array in A. There are 0s below the diagonal. 
dwork  (workspace) DOUBLE PRECISION array, dimension (3 N)  
[in]  queue  magma_queue_t Queue to execute in. 
[out]  info  INTEGER

The matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - tau * v * v**H
where tau is a complex scalar, and v is a complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is stored on exit in A(i+1:m,i), and tau in TAU(i).