MAGMA  2.3.0 Matrix Algebra for GPU and Multicore Architectures
gesdd: SVD using divide-and-conquer

## Functions

magma_int_t magma_cgesdd (magma_vec_t jobz, magma_int_t m, magma_int_t n, magmaFloatComplex *A, magma_int_t lda, float *s, magmaFloatComplex *U, magma_int_t ldu, magmaFloatComplex *VT, magma_int_t ldvt, magmaFloatComplex *work, magma_int_t lwork, float *rwork, magma_int_t *iwork, magma_int_t *info)
CGESDD computes the singular value decomposition (SVD) of a complex M-by-N matrix A, optionally computing the left and right singular vectors, by using divide-and-conquer method. More...

magma_int_t magma_dgesdd (magma_vec_t jobz, magma_int_t m, magma_int_t n, double *A, magma_int_t lda, double *s, double *U, magma_int_t ldu, double *VT, magma_int_t ldvt, double *work, magma_int_t lwork, magma_int_t *iwork, magma_int_t *info)
DGESDD computes the singular value decomposition (SVD) of a real M-by-N matrix A, optionally computing the left and right singular vectors, by using divide-and-conquer method. More...

magma_int_t magma_sgesdd (magma_vec_t jobz, magma_int_t m, magma_int_t n, float *A, magma_int_t lda, float *s, float *U, magma_int_t ldu, float *VT, magma_int_t ldvt, float *work, magma_int_t lwork, magma_int_t *iwork, magma_int_t *info)
SGESDD computes the singular value decomposition (SVD) of a real M-by-N matrix A, optionally computing the left and right singular vectors, by using divide-and-conquer method. More...

magma_int_t magma_zgesdd (magma_vec_t jobz, magma_int_t m, magma_int_t n, magmaDoubleComplex *A, magma_int_t lda, double *s, magmaDoubleComplex *U, magma_int_t ldu, magmaDoubleComplex *VT, magma_int_t ldvt, magmaDoubleComplex *work, magma_int_t lwork, double *rwork, magma_int_t *iwork, magma_int_t *info)
ZGESDD computes the singular value decomposition (SVD) of a complex M-by-N matrix A, optionally computing the left and right singular vectors, by using divide-and-conquer method. More...

## Function Documentation

 magma_int_t magma_cgesdd ( magma_vec_t jobz, magma_int_t m, magma_int_t n, magmaFloatComplex * A, magma_int_t lda, float * s, magmaFloatComplex * U, magma_int_t ldu, magmaFloatComplex * VT, magma_int_t ldvt, magmaFloatComplex * work, magma_int_t lwork, float * rwork, magma_int_t * iwork, magma_int_t * info )

CGESDD computes the singular value decomposition (SVD) of a complex M-by-N matrix A, optionally computing the left and right singular vectors, by using divide-and-conquer method.

The SVD is written

A = U * SIGMA * conjugate-transpose(V)

where SIGMA is an M-by-N matrix which is zero except for its min(m,n) diagonal elements, U is an M-by-M unitary matrix, and V is an N-by-N unitary matrix. The diagonal elements of SIGMA are the singular values of A; they are real and non-negative, and are returned in descending order. The first min(m,n) columns of U and V are the left and right singular vectors of A.

Note that the routine returns VT = V**H, not V.

The divide and conquer algorithm makes very mild assumptions about floating point arithmetic. It will work on machines with a guard digit in add/subtract, or on those binary machines without guard digits which subtract like the Cray X-MP, Cray Y-MP, Cray C-90, or Cray-2. It could conceivably fail on hexadecimal or decimal machines without guard digits, but we know of none.

Parameters
 [in] jobz magma_vec_t Specifies options for computing all or part of the matrix U: = MagmaAllVec: all M columns of U and all N rows of V**H are returned in the arrays U and VT; = MagmaSomeVec: the first min(M,N) columns of U and the first min(M,N) rows of V**H are returned in the arrays U and VT; = MagmaOverwriteVec: If M >= N, the first N columns of U are overwritten on the array A and all rows of V**H are returned in the array VT; otherwise, all columns of U are returned in the array U and the first M rows of V**H are overwritten on the array A; = MagmaNoVec: no columns of U or rows of V**H are computed. [in] m INTEGER The number of rows of the input matrix A. M >= 0. [in] n INTEGER The number of columns of the input matrix A. N >= 0. [in,out] A COMPLEX array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, if JOBZ = MagmaOverwriteVec, if M >= N, A is overwritten with the first N columns of U (the left singular vectors, stored columnwise); otherwise, A is overwritten with the first M rows of V**H (the right singular vectors, stored rowwise). if JOBZ != MagmaOverwriteVec, the contents of A are destroyed. [in] lda INTEGER The leading dimension of the array A. LDA >= max(1,M). [out] s REAL array, dimension (min(M,N)) The singular values of A, sorted so that S(i) >= S(i+1). [out] U COMPLEX array, dimension (LDU,UCOL) UCOL = M if JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N; UCOL = min(M,N) if JOBZ = MagmaSomeVec. If JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N, U contains the M-by-M unitary matrix U; if JOBZ = MagmaSomeVec, U contains the first min(M,N) columns of U (the left singular vectors, stored columnwise); if JOBZ = MagmaOverwriteVec and M >= N, or JOBZ = MagmaNoVec, U is not referenced. [in] ldu INTEGER The leading dimension of the array U. LDU >= 1; if JOBZ = MagmaSomeVec or MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N, LDU >= M. [out] VT COMPLEX array, dimension (LDVT,N) If JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M >= N, VT contains the N-by-N unitary matrix V**H; if JOBZ = MagmaSomeVec, VT contains the first min(M,N) rows of V**H (the right singular vectors, stored rowwise); if JOBZ = MagmaOverwriteVec and M < N, or JOBZ = MagmaNoVec, VT is not referenced. [in] ldvt INTEGER The leading dimension of the array VT. LDVT >= 1; if JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M >= N, LDVT >= N; if JOBZ = MagmaSomeVec, LDVT >= min(M,N). [out] work (workspace) COMPLEX array, dimension (MAX(1,lwork)) On exit, if INFO = 0, WORK[0] returns the optimal lwork. [in] lwork INTEGER The dimension of the array WORK. If lwork = -1, a workspace query is assumed. The optimal size for the WORK array is calculated and stored in WORK[0], and no other work except argument checking is performed. Let mx = max(M,N) and mn = min(M,N). The threshold for mx >> mn is currently mx >= mn*17/9. For job: N=None, O=Overwrite, S=Some, A=All. Because of varying nb for different subroutines, formulas below are an upper bound. Querying gives an exact number. The optimal block size nb can be obtained through magma_get_cgesvd_nb(M,N). Optimal lwork (required in MAGMA) for mx >> mn: Path 1: jobz=N 2*mn + 2*mn*nb Path 2: jobz=O 2*mn*mn + 2*mn + 2*mn*nb or mx*mn + mn*mn + 2*mn + 2*mn*nb [marginally faster?] Path 3: jobz=S mn*mn + 2*mn + 2*mn*nb Path 4: jobz=A mn*mn + max( 2*mn + 2*mn*nb, mn + mx*nb ) for mx >= mn, but not mx >> mn: Path 5,6: jobz=N 2*mn + (mx + mn)*nb jobz=O mx*mn + 2*mn + (mx + mn)*nb [faster algorithm] or mn*mn + 2*mn + (mx + mn)*nb [slower algorithm] jobz=S 2*mn + (mx + mn)*nb jobz=A 2*mn + (mx + mn)*nb MAGMA requires the optimal sizes above, while LAPACK has the same optimal sizes but the minimum sizes below. LAPACK minimum lwork for mx >> mn: Path 1: jobz=N 3*mn Path 2: jobz=O 2*mn*mn + 3*mn Path 3: jobz=S mn*mn + 3*mn Path 4: jobz=A mn*mn + 2*mn + mx # LAPACK's overestimate or mn*mn + max( m + n, 3*n ) # correct minimum for mx >= mn, but not mx >> mn: Path 5,6: jobz=N 2*mn + mx jobz=O mn*mn + 2*mn + mx jobz=S 2*mn + mx jobz=A 2*mn + mx rwork (workspace) REAL array, dimension (MAX(1,LRWORK)) Let mx = max(M,N) and mn = min(M,N). These sizes should work for both MAGMA and LAPACK. If JOBZ = MagmaNoVec, LRWORK >= 5*mn. # LAPACK <= 3.6 had bug requiring 7*mn If JOBZ != MagmaNoVec, if mx >> mn, LRWORK >= 5*mn*mn + 5*mn; otherwise, LRWORK >= max( 5*mn*mn + 5*mn, 2*mx*mn + 2*mn*mn + mn ). For JOBZ = MagmaNoVec, some implementations seem to have a bug requiring LRWORK >= 7*mn in some cases. iwork (workspace) INTEGER array, dimension (8*min(M,N)) [out] info INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value. > 0: The updating process of SBDSDC did not converge.

## Further Details

Based on contributions by Ming Gu and Huan Ren, Computer Science Division, University of California at Berkeley, USA

 magma_int_t magma_dgesdd ( magma_vec_t jobz, magma_int_t m, magma_int_t n, double * A, magma_int_t lda, double * s, double * U, magma_int_t ldu, double * VT, magma_int_t ldvt, double * work, magma_int_t lwork, magma_int_t * iwork, magma_int_t * info )

DGESDD computes the singular value decomposition (SVD) of a real M-by-N matrix A, optionally computing the left and right singular vectors, by using divide-and-conquer method.

The SVD is written

A = U * SIGMA * transpose(V)

where SIGMA is an M-by-N matrix which is zero except for its min(m,n) diagonal elements, U is an M-by-M orthogonal matrix, and V is an N-by-N orthogonal matrix. The diagonal elements of SIGMA are the singular values of A; they are real and non-negative, and are returned in descending order. The first min(m,n) columns of U and V are the left and right singular vectors of A.

Note that the routine returns VT = V**T, not V.

The divide and conquer algorithm makes very mild assumptions about floating point arithmetic. It will work on machines with a guard digit in add/subtract, or on those binary machines without guard digits which subtract like the Cray X-MP, Cray Y-MP, Cray C-90, or Cray-2. It could conceivably fail on hexadecimal or decimal machines without guard digits, but we know of none.

Parameters
 [in] jobz magma_vec_t Specifies options for computing all or part of the matrix U: = MagmaAllVec: all M columns of U and all N rows of V**T are returned in the arrays U and VT; = MagmaSomeVec: the first min(M,N) columns of U and the first min(M,N) rows of V**T are returned in the arrays U and VT; = MagmaOverwriteVec: If M >= N, the first N columns of U are overwritten on the array A and all rows of V**T are returned in the array VT; otherwise, all columns of U are returned in the array U and the first M rows of V**T are overwritten on the array A; = MagmaNoVec: no columns of U or rows of V**T are computed. [in] m INTEGER The number of rows of the input matrix A. M >= 0. [in] n INTEGER The number of columns of the input matrix A. N >= 0. [in,out] A DOUBLE PRECISION array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, if JOBZ = MagmaOverwriteVec, if M >= N, A is overwritten with the first N columns of U (the left singular vectors, stored columnwise); otherwise, A is overwritten with the first M rows of V**T (the right singular vectors, stored rowwise). if JOBZ != MagmaOverwriteVec, the contents of A are destroyed. [in] lda INTEGER The leading dimension of the array A. LDA >= max(1,M). [out] s DOUBLE PRECISION array, dimension (min(M,N)) The singular values of A, sorted so that S(i) >= S(i + 1). [out] U DOUBLE PRECISION array, dimension (LDU,UCOL) UCOL = M if JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N; UCOL = min(M,N) if JOBZ = MagmaSomeVec. If JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N, U contains the M-by-M orthogonal matrix U; if JOBZ = MagmaSomeVec, U contains the first min(M,N) columns of U (the left singular vectors, stored columnwise); if JOBZ = MagmaOverwriteVec and M >= N, or JOBZ = MagmaNoVec, U is not referenced. [in] ldu INTEGER The leading dimension of the array U. LDU >= 1; if JOBZ = MagmaSomeVec or MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N, LDU >= M. [out] VT DOUBLE PRECISION array, dimension (LDVT,N) If JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M >= N, VT contains the N-by-N orthogonal matrix V**T; if JOBZ = MagmaSomeVec, VT contains the first min(M,N) rows of V**T (the right singular vectors, stored rowwise); if JOBZ = MagmaOverwriteVec and M < N, or JOBZ = MagmaNoVec, VT is not referenced. [in] ldvt INTEGER The leading dimension of the array VT. LDVT >= 1; if JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M >= N, LDVT >= N; if JOBZ = MagmaSomeVec, LDVT >= min(M,N). [out] work (workspace) DOUBLE PRECISION array, dimension (MAX(1,lwork)) On exit, if INFO = 0, WORK[0] returns the optimal lwork. [in] lwork INTEGER The dimension of the array WORK. If lwork = -1, a workspace query is assumed. The optimal size for the WORK array is calculated and stored in WORK[0], and no other work except argument checking is performed. Let mx = max(M,N) and mn = min(M,N). The threshold for mx >> mn is currently mx >= mn*11/6. For job: N=None, O=Overwrite, S=Some, A=All. Because of varying nb for different subroutines, formulas below are an upper bound. Querying gives an exact number. The optimal block size nb can be obtained through magma_get_dgesvd_nb(M,N). Optimal lwork (required in MAGMA) for mx >> mn: Path 1: jobz=N 3*mn + 2*mn*nb Path 2: jobz=O 2*mn*mn + 3*mn + max( 3*mn*mn + 4*mn, 2*mn*nb ) or mx*mn + mn*mn + 3*mn + max( 3*mn*mn + 4*mn, 2*mn*nb ) [marginally faster?] Path 3: jobz=S mn*mn + 3*mn + max( 3*mn*mn + 4*mn, 2*mn*nb ) Path 4: jobz=A mn*mn + max( 3*mn*mn + 7*mn, 3*mn + 2*mn*nb, mn + mx*nb ) for mx >= mn, but not mx >> mn: Path 5: jobz=N 3*mn + max( 4*mn, (mx + mn)*nb ) jobz=O mx*mn + 3*mn + max( 3*mn*mn + 4*mn, (mx + mn)*nb ) [faster algorithm] or mn*mn + 3*mn + max( 3*mn*mn + 4*mn, (mx + mn)*nb ) [slower algorithm] jobz=S 3*mn + max( 3*mn*mn + 4*mn, (mx + mn)*nb ) jobz=A 3*mn + max( 3*mn*mn + 4*mn, (mx + mn)*nb ) MAGMA requires the optimal sizes above, while LAPACK has the same optimal sizes but the minimum sizes below. LAPACK minimum lwork for mx >> mn: Path 1: jobz=N 8*mn Path 2: jobz=O 2*mn*mn + 3*mn + (3*mn*mn + 4*mn) Path 3: jobz=S mn*mn + 3*mn + (3*mn*mn + 4*mn) Path 4: jobz=A mn*mn + 2*mn + mx + (3*mn*mn + 4*mn) # LAPACK's overestimate or mn*mn + max( 3*mn*mn + 7*mn, mn + mx ) # correct minimum for mx >= mn, but not mx >> mn: Path 5: jobz=N 3*mn + max( 7*mn, mx ) jobz=O 3*mn + max( 4*mn*mn + 4*mn, mx ) jobz=S 3*mn + max( 3*mn*mn + 4*mn, mx ) jobz=A 3*mn + max( 3*mn*mn + 4*mn, mx ) iwork (workspace) INTEGER array, dimension (8*min(M,N)) [out] info INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value. > 0: The updating process of DBDSDC did not converge.

## Further Details

Based on contributions by Ming Gu and Huan Ren, Computer Science Division, University of California at Berkeley, USA

 magma_int_t magma_sgesdd ( magma_vec_t jobz, magma_int_t m, magma_int_t n, float * A, magma_int_t lda, float * s, float * U, magma_int_t ldu, float * VT, magma_int_t ldvt, float * work, magma_int_t lwork, magma_int_t * iwork, magma_int_t * info )

SGESDD computes the singular value decomposition (SVD) of a real M-by-N matrix A, optionally computing the left and right singular vectors, by using divide-and-conquer method.

The SVD is written

A = U * SIGMA * transpose(V)

where SIGMA is an M-by-N matrix which is zero except for its min(m,n) diagonal elements, U is an M-by-M orthogonal matrix, and V is an N-by-N orthogonal matrix. The diagonal elements of SIGMA are the singular values of A; they are real and non-negative, and are returned in descending order. The first min(m,n) columns of U and V are the left and right singular vectors of A.

Note that the routine returns VT = V**T, not V.

The divide and conquer algorithm makes very mild assumptions about floating point arithmetic. It will work on machines with a guard digit in add/subtract, or on those binary machines without guard digits which subtract like the Cray X-MP, Cray Y-MP, Cray C-90, or Cray-2. It could conceivably fail on hexadecimal or decimal machines without guard digits, but we know of none.

Parameters
 [in] jobz magma_vec_t Specifies options for computing all or part of the matrix U: = MagmaAllVec: all M columns of U and all N rows of V**T are returned in the arrays U and VT; = MagmaSomeVec: the first min(M,N) columns of U and the first min(M,N) rows of V**T are returned in the arrays U and VT; = MagmaOverwriteVec: If M >= N, the first N columns of U are overwritten on the array A and all rows of V**T are returned in the array VT; otherwise, all columns of U are returned in the array U and the first M rows of V**T are overwritten on the array A; = MagmaNoVec: no columns of U or rows of V**T are computed. [in] m INTEGER The number of rows of the input matrix A. M >= 0. [in] n INTEGER The number of columns of the input matrix A. N >= 0. [in,out] A REAL array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, if JOBZ = MagmaOverwriteVec, if M >= N, A is overwritten with the first N columns of U (the left singular vectors, stored columnwise); otherwise, A is overwritten with the first M rows of V**T (the right singular vectors, stored rowwise). if JOBZ != MagmaOverwriteVec, the contents of A are destroyed. [in] lda INTEGER The leading dimension of the array A. LDA >= max(1,M). [out] s REAL array, dimension (min(M,N)) The singular values of A, sorted so that S(i) >= S(i + 1). [out] U REAL array, dimension (LDU,UCOL) UCOL = M if JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N; UCOL = min(M,N) if JOBZ = MagmaSomeVec. If JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N, U contains the M-by-M orthogonal matrix U; if JOBZ = MagmaSomeVec, U contains the first min(M,N) columns of U (the left singular vectors, stored columnwise); if JOBZ = MagmaOverwriteVec and M >= N, or JOBZ = MagmaNoVec, U is not referenced. [in] ldu INTEGER The leading dimension of the array U. LDU >= 1; if JOBZ = MagmaSomeVec or MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N, LDU >= M. [out] VT REAL array, dimension (LDVT,N) If JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M >= N, VT contains the N-by-N orthogonal matrix V**T; if JOBZ = MagmaSomeVec, VT contains the first min(M,N) rows of V**T (the right singular vectors, stored rowwise); if JOBZ = MagmaOverwriteVec and M < N, or JOBZ = MagmaNoVec, VT is not referenced. [in] ldvt INTEGER The leading dimension of the array VT. LDVT >= 1; if JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M >= N, LDVT >= N; if JOBZ = MagmaSomeVec, LDVT >= min(M,N). [out] work (workspace) REAL array, dimension (MAX(1,lwork)) On exit, if INFO = 0, WORK[0] returns the optimal lwork. [in] lwork INTEGER The dimension of the array WORK. If lwork = -1, a workspace query is assumed. The optimal size for the WORK array is calculated and stored in WORK[0], and no other work except argument checking is performed. Let mx = max(M,N) and mn = min(M,N). The threshold for mx >> mn is currently mx >= mn*11/6. For job: N=None, O=Overwrite, S=Some, A=All. Because of varying nb for different subroutines, formulas below are an upper bound. Querying gives an exact number. The optimal block size nb can be obtained through magma_get_sgesvd_nb(M,N). Optimal lwork (required in MAGMA) for mx >> mn: Path 1: jobz=N 3*mn + 2*mn*nb Path 2: jobz=O 2*mn*mn + 3*mn + max( 3*mn*mn + 4*mn, 2*mn*nb ) or mx*mn + mn*mn + 3*mn + max( 3*mn*mn + 4*mn, 2*mn*nb ) [marginally faster?] Path 3: jobz=S mn*mn + 3*mn + max( 3*mn*mn + 4*mn, 2*mn*nb ) Path 4: jobz=A mn*mn + max( 3*mn*mn + 7*mn, 3*mn + 2*mn*nb, mn + mx*nb ) for mx >= mn, but not mx >> mn: Path 5: jobz=N 3*mn + max( 4*mn, (mx + mn)*nb ) jobz=O mx*mn + 3*mn + max( 3*mn*mn + 4*mn, (mx + mn)*nb ) [faster algorithm] or mn*mn + 3*mn + max( 3*mn*mn + 4*mn, (mx + mn)*nb ) [slower algorithm] jobz=S 3*mn + max( 3*mn*mn + 4*mn, (mx + mn)*nb ) jobz=A 3*mn + max( 3*mn*mn + 4*mn, (mx + mn)*nb ) MAGMA requires the optimal sizes above, while LAPACK has the same optimal sizes but the minimum sizes below. LAPACK minimum lwork for mx >> mn: Path 1: jobz=N 8*mn Path 2: jobz=O 2*mn*mn + 3*mn + (3*mn*mn + 4*mn) Path 3: jobz=S mn*mn + 3*mn + (3*mn*mn + 4*mn) Path 4: jobz=A mn*mn + 2*mn + mx + (3*mn*mn + 4*mn) # LAPACK's overestimate or mn*mn + max( 3*mn*mn + 7*mn, mn + mx ) # correct minimum for mx >= mn, but not mx >> mn: Path 5: jobz=N 3*mn + max( 7*mn, mx ) jobz=O 3*mn + max( 4*mn*mn + 4*mn, mx ) jobz=S 3*mn + max( 3*mn*mn + 4*mn, mx ) jobz=A 3*mn + max( 3*mn*mn + 4*mn, mx ) iwork (workspace) INTEGER array, dimension (8*min(M,N)) [out] info INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value. > 0: The updating process of SBDSDC did not converge.

## Further Details

Based on contributions by Ming Gu and Huan Ren, Computer Science Division, University of California at Berkeley, USA

 magma_int_t magma_zgesdd ( magma_vec_t jobz, magma_int_t m, magma_int_t n, magmaDoubleComplex * A, magma_int_t lda, double * s, magmaDoubleComplex * U, magma_int_t ldu, magmaDoubleComplex * VT, magma_int_t ldvt, magmaDoubleComplex * work, magma_int_t lwork, double * rwork, magma_int_t * iwork, magma_int_t * info )

ZGESDD computes the singular value decomposition (SVD) of a complex M-by-N matrix A, optionally computing the left and right singular vectors, by using divide-and-conquer method.

The SVD is written

A = U * SIGMA * conjugate-transpose(V)

where SIGMA is an M-by-N matrix which is zero except for its min(m,n) diagonal elements, U is an M-by-M unitary matrix, and V is an N-by-N unitary matrix. The diagonal elements of SIGMA are the singular values of A; they are real and non-negative, and are returned in descending order. The first min(m,n) columns of U and V are the left and right singular vectors of A.

Note that the routine returns VT = V**H, not V.

The divide and conquer algorithm makes very mild assumptions about floating point arithmetic. It will work on machines with a guard digit in add/subtract, or on those binary machines without guard digits which subtract like the Cray X-MP, Cray Y-MP, Cray C-90, or Cray-2. It could conceivably fail on hexadecimal or decimal machines without guard digits, but we know of none.

Parameters
 [in] jobz magma_vec_t Specifies options for computing all or part of the matrix U: = MagmaAllVec: all M columns of U and all N rows of V**H are returned in the arrays U and VT; = MagmaSomeVec: the first min(M,N) columns of U and the first min(M,N) rows of V**H are returned in the arrays U and VT; = MagmaOverwriteVec: If M >= N, the first N columns of U are overwritten on the array A and all rows of V**H are returned in the array VT; otherwise, all columns of U are returned in the array U and the first M rows of V**H are overwritten on the array A; = MagmaNoVec: no columns of U or rows of V**H are computed. [in] m INTEGER The number of rows of the input matrix A. M >= 0. [in] n INTEGER The number of columns of the input matrix A. N >= 0. [in,out] A COMPLEX_16 array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, if JOBZ = MagmaOverwriteVec, if M >= N, A is overwritten with the first N columns of U (the left singular vectors, stored columnwise); otherwise, A is overwritten with the first M rows of V**H (the right singular vectors, stored rowwise). if JOBZ != MagmaOverwriteVec, the contents of A are destroyed. [in] lda INTEGER The leading dimension of the array A. LDA >= max(1,M). [out] s DOUBLE PRECISION array, dimension (min(M,N)) The singular values of A, sorted so that S(i) >= S(i+1). [out] U COMPLEX_16 array, dimension (LDU,UCOL) UCOL = M if JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N; UCOL = min(M,N) if JOBZ = MagmaSomeVec. If JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N, U contains the M-by-M unitary matrix U; if JOBZ = MagmaSomeVec, U contains the first min(M,N) columns of U (the left singular vectors, stored columnwise); if JOBZ = MagmaOverwriteVec and M >= N, or JOBZ = MagmaNoVec, U is not referenced. [in] ldu INTEGER The leading dimension of the array U. LDU >= 1; if JOBZ = MagmaSomeVec or MagmaAllVec or JOBZ = MagmaOverwriteVec and M < N, LDU >= M. [out] VT COMPLEX_16 array, dimension (LDVT,N) If JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M >= N, VT contains the N-by-N unitary matrix V**H; if JOBZ = MagmaSomeVec, VT contains the first min(M,N) rows of V**H (the right singular vectors, stored rowwise); if JOBZ = MagmaOverwriteVec and M < N, or JOBZ = MagmaNoVec, VT is not referenced. [in] ldvt INTEGER The leading dimension of the array VT. LDVT >= 1; if JOBZ = MagmaAllVec or JOBZ = MagmaOverwriteVec and M >= N, LDVT >= N; if JOBZ = MagmaSomeVec, LDVT >= min(M,N). [out] work (workspace) COMPLEX_16 array, dimension (MAX(1,lwork)) On exit, if INFO = 0, WORK[0] returns the optimal lwork. [in] lwork INTEGER The dimension of the array WORK. If lwork = -1, a workspace query is assumed. The optimal size for the WORK array is calculated and stored in WORK[0], and no other work except argument checking is performed. Let mx = max(M,N) and mn = min(M,N). The threshold for mx >> mn is currently mx >= mn*17/9. For job: N=None, O=Overwrite, S=Some, A=All. Because of varying nb for different subroutines, formulas below are an upper bound. Querying gives an exact number. The optimal block size nb can be obtained through magma_get_zgesvd_nb(M,N). Optimal lwork (required in MAGMA) for mx >> mn: Path 1: jobz=N 2*mn + 2*mn*nb Path 2: jobz=O 2*mn*mn + 2*mn + 2*mn*nb or mx*mn + mn*mn + 2*mn + 2*mn*nb [marginally faster?] Path 3: jobz=S mn*mn + 2*mn + 2*mn*nb Path 4: jobz=A mn*mn + max( 2*mn + 2*mn*nb, mn + mx*nb ) for mx >= mn, but not mx >> mn: Path 5,6: jobz=N 2*mn + (mx + mn)*nb jobz=O mx*mn + 2*mn + (mx + mn)*nb [faster algorithm] or mn*mn + 2*mn + (mx + mn)*nb [slower algorithm] jobz=S 2*mn + (mx + mn)*nb jobz=A 2*mn + (mx + mn)*nb MAGMA requires the optimal sizes above, while LAPACK has the same optimal sizes but the minimum sizes below. LAPACK minimum lwork for mx >> mn: Path 1: jobz=N 3*mn Path 2: jobz=O 2*mn*mn + 3*mn Path 3: jobz=S mn*mn + 3*mn Path 4: jobz=A mn*mn + 2*mn + mx # LAPACK's overestimate or mn*mn + max( m + n, 3*n ) # correct minimum for mx >= mn, but not mx >> mn: Path 5,6: jobz=N 2*mn + mx jobz=O mn*mn + 2*mn + mx jobz=S 2*mn + mx jobz=A 2*mn + mx rwork (workspace) DOUBLE PRECISION array, dimension (MAX(1,LRWORK)) Let mx = max(M,N) and mn = min(M,N). These sizes should work for both MAGMA and LAPACK. If JOBZ = MagmaNoVec, LRWORK >= 5*mn. # LAPACK <= 3.6 had bug requiring 7*mn If JOBZ != MagmaNoVec, if mx >> mn, LRWORK >= 5*mn*mn + 5*mn; otherwise, LRWORK >= max( 5*mn*mn + 5*mn, 2*mx*mn + 2*mn*mn + mn ). For JOBZ = MagmaNoVec, some implementations seem to have a bug requiring LRWORK >= 7*mn in some cases. iwork (workspace) INTEGER array, dimension (8*min(M,N)) [out] info INTEGER = 0: successful exit. < 0: if INFO = -i, the i-th argument had an illegal value. > 0: The updating process of DBDSDC did not converge.

## Further Details

Based on contributions by Ming Gu and Huan Ren, Computer Science Division, University of California at Berkeley, USA