MAGMA  2.3.0 Matrix Algebra for GPU and Multicore Architectures

Functions

magma_int_t magma_dbajac_csr (magma_int_t localiters, magma_d_matrix D, magma_d_matrix R, magma_d_matrix b, magma_d_matrix *x, magma_queue_t queue)
This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block. More...

magma_int_t magma_dbajac_csr_overlap (magma_int_t localiters, magma_int_t matrices, magma_int_t overlap, magma_d_matrix *D, magma_d_matrix *R, magma_d_matrix b, magma_d_matrix *x, magma_queue_t queue)
This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing s local Jacobi-updates within the block. More...

magma_int_t magma_dcompact (magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dnorms, double tol, magmaInt_ptr active, magmaInt_ptr cBlock, magma_queue_t queue)
DCOMPACT takes a set of n vectors of size m (in dA) and their norms and compacts them into the cBlock <= n vectors whose norms are > tol. More...

magma_int_t magma_dmprepare_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_d_matrix L, magma_d_matrix LC, magma_index_t *sizes, magma_index_t *locations, double *trisystems, double *rhs, magma_queue_t queue)
This routine prepares the batch of small triangular systems that need to be solved for computing the ISAI preconditioner. More...

magma_int_t magma_djacobisetup_vector_gpu (magma_int_t num_rows, magma_d_matrix b, magma_d_matrix d, magma_d_matrix c, magma_d_matrix *x, magma_queue_t queue)
Prepares the Jacobi iteration according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, i.e. x^(k+1) = c - M * x^k. More...

magma_int_t magma_dlobpcg_maxpy (magma_int_t num_rows, magma_int_t num_vecs, magmaDouble_ptr X, magmaDouble_ptr Y, magma_queue_t queue)
This routine computes an axpy for an m x n matrix: More...

magma_int_t magma_dbicgstab_1 (magma_int_t num_rows, magma_int_t num_cols, double beta, double omega, magmaDouble_ptr r, magmaDouble_ptr v, magmaDouble_ptr p, magma_queue_t queue)
Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges the first SpmV using CSR with the dot product and the computation of alpha. More...

Merges the second SpmV using CSR with the dot product and the computation of omega. More...

Merges the second SpmV using CSR with the dot product and the computation of omega. More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

magma_int_t magma_dbicgmerge4 (magma_int_t type, magmaDouble_ptr skp, magma_queue_t queue)
Performs some parameter operations for the BiCGSTAB with scalars on GPU. More...

Merges multiple operations into one kernel: More...

Merges the first SpmV using different formats with the dot product and the computation of rho. More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

magma_int_t magma_didr_smoothing_2 (magma_int_t num_rows, magma_int_t num_cols, double omega, magmaDouble_ptr dx, magmaDouble_ptr dxs, magma_queue_t queue)
Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

Merges multiple operations into one kernel: More...

magma_int_t magma_dparic_csr (magma_d_matrix A, magma_d_matrix A_CSR, magma_queue_t queue)
This routine iteratively computes an incomplete Cholesky factorization. More...

magma_int_t magma_dparilu_csr (magma_d_matrix A, magma_d_matrix L, magma_d_matrix U, magma_queue_t queue)
This routine iteratively computes an incomplete LU factorization. More...

 magma_int_t magma_dbajac_csr ( magma_int_t localiters, magma_d_matrix D, magma_d_matrix R, magma_d_matrix b, magma_d_matrix * x, magma_queue_t queue )

This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block.

Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.

Parameters
 [in] localiters magma_int_t number of local Jacobi-like updates
 [in] D magma_d_matrix input matrix with diagonal blocks
 [in] R magma_d_matrix input matrix with non-diagonal parts
 [in] b magma_d_matrix RHS
 [in,out] x magma_d_matrix* iterate/solution
 [in] queue magma_queue_t Queue to execute in.
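The splitting into a diagonal-block matrix D and a remainder R can be illustrated with a small sequential CPU sketch. It is not the MAGMA kernel: the `csr_t` container, the function name `bajac_sweep`, and the block size parameter `bs` are assumptions made for illustration, and the blocks are visited one after another, which is only one of the orderings an asynchronous GPU execution could produce.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal CSR container; NOT the MAGMA magma_d_matrix type. */
typedef struct {
    int n;             /* number of rows */
    const int *rowptr; /* n+1 row offsets */
    const int *col;    /* column indices */
    const double *val; /* nonzero values */
} csr_t;

/* One sweep over all blocks of bs consecutive rows (bs is an assumed
 * block size). D holds the diagonal blocks, R everything else. */
void bajac_sweep(int localiters, int bs, csr_t D, csr_t R,
                 const double *b, double *x)
{
    assert(bs > 0 && D.n == R.n);
    double *rhs  = malloc((size_t)bs * sizeof *rhs);
    double *xnew = malloc((size_t)bs * sizeof *xnew);
    assert(rhs && xnew);
    for (int start = 0; start < D.n; start += bs) {
        int len = (start + bs <= D.n) ? bs : D.n - start;
        /* Freeze the coupling to the other blocks: rhs = b - R*x. */
        for (int i = 0; i < len; ++i) {
            double s = b[start + i];
            for (int k = R.rowptr[start + i]; k < R.rowptr[start + i + 1]; ++k)
                s -= R.val[k] * x[R.col[k]];
            rhs[i] = s;
        }
        /* localiters Jacobi updates that only use the diagonal block. */
        for (int it = 0; it < localiters; ++it) {
            for (int i = 0; i < len; ++i) {
                double s = rhs[i], diag = 1.0;
                for (int k = D.rowptr[start + i]; k < D.rowptr[start + i + 1]; ++k) {
                    if (D.col[k] == start + i) diag = D.val[k];
                    else                       s -= D.val[k] * x[D.col[k]];
                }
                xnew[i] = s / diag;
            }
            memcpy(x + start, xnew, (size_t)len * sizeof *xnew);
        }
    }
    free(rhs);
    free(xnew);
}
```

Calling `bajac_sweep` repeatedly plays the role of the outer iteration; for a strictly diagonally dominant matrix the iterate should approach the solution of (D + R) x = b.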
 magma_int_t magma_dbajac_csr_overlap ( magma_int_t localiters, magma_int_t matrices, magma_int_t overlap, magma_d_matrix * D, magma_d_matrix * R, magma_d_matrix b, magma_d_matrix * x, magma_queue_t queue )

This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing s local Jacobi-updates within the block.

Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.

Parameters
 [in] localiters magma_int_t number of local Jacobi-like updates
 [in] matrices magma_int_t number of sub-matrices
 [in] overlap magma_int_t size of the overlap
 [in] D magma_d_matrix* set of matrices with diagonal blocks
 [in] R magma_d_matrix* set of matrices with non-diagonal parts
 [in] b magma_d_matrix RHS
 [in,out] x magma_d_matrix* iterate/solution
 [in] queue magma_queue_t Queue to execute in.
 magma_int_t magma_dcompact ( magma_int_t m, magma_int_t n, magmaDouble_ptr dA, magma_int_t ldda, magmaDouble_ptr dnorms, double tol, magmaInt_ptr active, magmaInt_ptr cBlock, magma_queue_t queue )

DCOMPACT takes a set of n vectors of size m (in dA) and their norms and compacts them into the cBlock <= n vectors whose norms are > tol.

The active mask array contains 1 or 0 for each vector, showing whether it remained in the compacted set of vectors or was removed.

Parameters
 [in] m INTEGER The number of rows of the matrix dA. M >= 0.
 [in] n INTEGER The number of columns of the matrix dA. N >= 0.
 [in,out] dA DOUBLE PRECISION array, dimension (LDDA,N) The m by n matrix dA.
 [in] ldda INTEGER The leading dimension of the array dA. LDDA >= max(1,M).
 [in] dnorms DOUBLE PRECISION array, dimension N The norms of the N vectors in dA
 [in] tol DOUBLE PRECISION The tolerance value used in the criterion to compact or not.
 [in,out] active INTEGER array, dimension N A mask of 1s and 0s showing if a vector remains or has been removed
 [in,out] cBlock magmaInt_ptr The number of vectors that remain in dA (i.e., with norms > tol).
 [in] queue magma_queue_t Queue to execute in.
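The compaction described above can be sketched on the CPU as follows. This is a minimal illustration, not the GPU routine: the name `compact_columns` is hypothetical, the count is returned instead of being written through a device pointer, and the sketch assumes that the mask is recomputed from the norms for every column.

```c
#include <assert.h>
#include <string.h>

/* Keeps the columns of the column-major m-by-n array A whose norm exceeds
 * tol, packing them to the front in their original order. Returns the
 * number of kept columns (the role cBlock plays in the signature above). */
int compact_columns(int m, int n, double *A, int lda,
                    const double *norms, double tol, int *active)
{
    assert(m >= 0 && n >= 0 && lda >= (m > 1 ? m : 1));
    int kept = 0;
    for (int j = 0; j < n; ++j) {
        active[j] = norms[j] > tol;   /* 1 = column stays, 0 = removed */
        if (!active[j]) continue;
        if (kept != j)                /* distinct columns never overlap */
            memcpy(A + (size_t)kept * lda, A + (size_t)j * lda,
                   (size_t)m * sizeof *A);
        ++kept;
    }
    return kept;
}
```

For three columns with norms 5.0, 0.1 and 7.0 and tol = 1.0, the first and third columns end up in the first two column slots and the mask becomes 1, 0, 1.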
 magma_int_t magma_dmprepare_batched_gpu ( magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_d_matrix L, magma_d_matrix LC, magma_index_t * sizes, magma_index_t * locations, double * trisystems, double * rhs, magma_queue_t queue )

This routine prepares the batch of small triangular systems that need to be solved for computing the ISAI preconditioner.

Parameters
 [in] uplotype magma_uplo_t input matrix
 [in] transtype magma_trans_t input matrix
 [in] diagtype magma_diag_t input matrix
 [in] L magma_d_matrix triangular factor for which the ISAI matrix is computed. Col-Major CSR storage.
 [in] LC magma_d_matrix sparsity pattern of the ISAI matrix. Col-Major CSR storage.
 [in,out] sizes magma_index_t* array containing the sizes of the small triangular systems
 [in,out] locations magma_index_t* array containing the locations in the respective column of L
 [in,out] trisystems double* batch of generated small triangular systems. All systems are embedded in uniform memory blocks of size BLOCKSIZE x BLOCKSIZE
 [in,out] rhs double* RHS of the small triangular systems
 [in] queue magma_queue_t Queue to execute in.
 magma_int_t magma_djacobisetup_vector_gpu ( magma_int_t num_rows, magma_d_matrix b, magma_d_matrix d, magma_d_matrix c, magma_d_matrix * x, magma_queue_t queue )

Prepares the Jacobi iteration according to

x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k
x^(k+1) = c - M * x^k.

Returns the vector c. It calls a GPU kernel.

Parameters
 [in] num_rows magma_int_t number of rows
 [in] b magma_d_matrix RHS b
 [in] d magma_d_matrix vector with diagonal entries
 [out] c magma_d_matrix c = D^(-1) * b
 [out] x magma_d_matrix* iteration vector
 [in] queue magma_queue_t Queue to execute in.
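The setup step c = D^(-1) * b reduces to one division per row when D is stored as the vector of diagonal entries d. The sketch below uses plain arrays instead of magma_d_matrix, and the name `jacobi_setup_vector` is hypothetical. It also assumes that the iteration vector x is initialised to c, which corresponds to one Jacobi step from a zero initial guess; the source only states that x is an output.

```c
#include <assert.h>

/* c = D^(-1) * b for a diagonal D stored as the vector d.
 * Assumption: the iterate x is initialised to c. */
void jacobi_setup_vector(int num_rows, const double *b, const double *d,
                         double *c, double *x)
{
    assert(num_rows >= 0);
    for (int i = 0; i < num_rows; ++i) {
        c[i] = b[i] / d[i];  /* d[i] must be nonzero */
        x[i] = c[i];
    }
}
```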
 magma_int_t magma_dlobpcg_maxpy ( magma_int_t num_rows, magma_int_t num_vecs, magmaDouble_ptr X, magmaDouble_ptr Y, magma_queue_t queue )

This routine computes an axpy for an m x n matrix:

Y = X + Y


It replaces: magma_daxpy(m*n, c_one, Y, 1, X, 1);

    / x1[0] x2[0] x3[0] \
    | x1[1] x2[1] x3[1] |
X = | x1[2] x2[2] x3[2] | = x1[0] x1[1] x1[2] x1[3] x1[4] x2[0] x2[1] .
    | x1[3] x2[3] x3[3] |
    \ x1[4] x2[4] x3[4] /

Parameters
 [in] num_rows magma_int_t number of rows
 [in] num_vecs magma_int_t number of vectors
 [in] X magmaDouble_ptr input vector X
 [in,out] Y magmaDouble_ptr input/output vector Y
 [in] queue magma_queue_t Queue to execute in.
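Because the block of vectors is stored contiguously as shown in the layout above, Y = X + Y is a single pass over num_rows * num_vecs entries. A minimal CPU sketch, with the hypothetical name `lobpcg_maxpy_ref` and host arrays in place of device pointers:

```c
#include <assert.h>
#include <stddef.h>

/* Y = X + Y for num_vecs vectors of length num_rows stored back to back. */
void lobpcg_maxpy_ref(int num_rows, int num_vecs, const double *X, double *Y)
{
    assert(num_rows >= 0 && num_vecs >= 0);
    size_t total = (size_t)num_rows * (size_t)num_vecs;
    for (size_t i = 0; i < total; ++i)
        Y[i] += X[i];
}
```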
 magma_int_t magma_dbicgstab_1 ( magma_int_t num_rows, magma_int_t num_cols, double beta, double omega, magmaDouble_ptr r, magmaDouble_ptr v, magmaDouble_ptr p, magma_queue_t queue )

Merges multiple operations into one kernel:

p = r + beta * ( p - omega * v )

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] beta double scalar
 [in] omega double scalar
 [in] r magmaDouble_ptr vector
 [in] v magmaDouble_ptr vector
 [in,out] p magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
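As a reference for what the fused kernel computes, the update can be written as one loop that reads and writes each entry of p exactly once. The name `bicgstab_1_ref` is hypothetical, and the sketch assumes the vectors are contiguous arrays of num_rows * num_cols entries.

```c
#include <assert.h>
#include <stddef.h>

/* p = r + beta * (p - omega * v), entry by entry. */
void bicgstab_1_ref(int num_rows, int num_cols, double beta, double omega,
                    const double *r, const double *v, double *p)
{
    assert(num_rows >= 0 && num_cols >= 0);
    size_t total = (size_t)num_rows * (size_t)num_cols;
    for (size_t i = 0; i < total; ++i)
        p[i] = r[i] + beta * (p[i] - omega * v[i]);
}
```

Compared with separate scal and axpy calls, the single pass avoids loading and storing p several times, which is the motivation for merging the operations.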
 magma_int_t magma_dbicgstab_2 ( magma_int_t num_rows, magma_int_t num_cols, double alpha, magmaDouble_ptr r, magmaDouble_ptr v, magmaDouble_ptr s, magma_queue_t queue )

Merges multiple operations into one kernel:

s = r - alpha * v

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] alpha double scalar
 [in] r magmaDouble_ptr vector
 [in] v magmaDouble_ptr vector
 [in,out] s magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

x = x + alpha * p + omega * s
r = s - omega * t

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] alpha double scalar
 [in] omega double scalar
 [in] p magmaDouble_ptr vector
 [in] s magmaDouble_ptr vector
 [in] t magmaDouble_ptr vector
 [in,out] x magmaDouble_ptr vector
 [in,out] r magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
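The two updates share the inputs s and omega, so a fused version can produce x and r in the same loop. The following CPU sketch uses the hypothetical name `bicgstab_update_xr_ref` and assumes contiguous arrays of num_rows * num_cols entries.

```c
#include <assert.h>
#include <stddef.h>

/* x = x + alpha * p + omega * s  and  r = s - omega * t in one pass. */
void bicgstab_update_xr_ref(int num_rows, int num_cols,
                            double alpha, double omega,
                            const double *p, const double *s, const double *t,
                            double *x, double *r)
{
    assert(num_rows >= 0 && num_cols >= 0);
    size_t total = (size_t)num_rows * (size_t)num_cols;
    for (size_t i = 0; i < total; ++i) {
        x[i] = x[i] + alpha * p[i] + omega * s[i];
        r[i] = s[i] - omega * t[i];
    }
}
```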

Merges multiple operations into one kernel:

x = x + alpha * y + omega * z
r = s - omega * t

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] alpha double scalar
 [in] omega double scalar
 [in] y magmaDouble_ptr vector
 [in] z magmaDouble_ptr vector
 [in] s magmaDouble_ptr vector
 [in] t magmaDouble_ptr vector
 [in,out] x magmaDouble_ptr vector
 [in,out] r magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.

Merges the first SpmV using CSR with the dot product and the computation of alpha.

Parameters
 [in] A magma_d_matrix system matrix
 [in] d1 magmaDouble_ptr temporary vector
 [in] d2 magmaDouble_ptr temporary vector
 [in] dp magmaDouble_ptr input vector p
 [in] dr magmaDouble_ptr input vector r
 [out] dv magmaDouble_ptr output vector v
 [in,out] skp magmaDouble_ptr array for parameters ( skp[0]=alpha )
 [in] queue magma_queue_t Queue to execute in.

Merges the second SpmV using CSR with the dot product and the computation of omega.

Parameters
 [in] A magma_d_matrix input matrix
 [in] d1 magmaDouble_ptr temporary vector
 [in] d2 magmaDouble_ptr temporary vector
 [in] ds magmaDouble_ptr input vector s
 [out] dt magmaDouble_ptr output vector t
 [in,out] skp magmaDouble_ptr array for parameters
 [in] queue magma_queue_t Queue to execute in.

Merges the second SpmV using CSR with the dot product and the computation of omega.

Parameters
 [in] n int dimension n
 [in] d1 magmaDouble_ptr temporary vector
 [in] d2 magmaDouble_ptr temporary vector
 [in] rr magmaDouble_ptr input vector rr
 [in,out] r magmaDouble_ptr input/output vector r
 [in] p magmaDouble_ptr input vector p
 [in] s magmaDouble_ptr input vector s
 [in] t magmaDouble_ptr input vector t
 [out] x magmaDouble_ptr output vector x
 [in] skp magmaDouble_ptr array for parameters
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

p = beta*p
p = p-omega*beta*v
p = p+r

-> p = r + beta * ( p - omega * v )

Parameters
 [in] n int dimension n
 [in] skp magmaDouble_ptr set of scalar parameters
 [in] v magmaDouble_ptr input vector v
 [in] r magmaDouble_ptr input vector r
 [in,out] p magmaDouble_ptr input/output vector p
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

s=r
s=s-alpha*v

-> s = r - alpha * v

Parameters
 [in] n int dimension n
 [in] skp magmaDouble_ptr set of scalar parameters
 [in] r magmaDouble_ptr input vector r
 [in] v magmaDouble_ptr input vector v
 [out] s magmaDouble_ptr output vector s
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

x=x+alpha*p
x=x+omega*s
r=s
r=r-omega*t

-> x = x + alpha * p + omega * s
-> r = s - omega * t

Parameters
 [in] n int dimension n
 [in] skp magmaDouble_ptr set of scalar parameters
 [in] p magmaDouble_ptr input p
 [in] s magmaDouble_ptr input s
 [in] t magmaDouble_ptr input t
 [in,out] x magmaDouble_ptr input/output x
 [in,out] r magmaDouble_ptr input/output r
 [in] queue magma_queue_t Queue to execute in.
 magma_int_t magma_dbicgmerge4 ( magma_int_t type, magmaDouble_ptr skp, magma_queue_t queue )

Performs some parameter operations for the BiCGSTAB with scalars on GPU.

Parameters
 [in] type int kernel type
 [in,out] skp magmaDouble_ptr vector with parameters
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

v = y / rho
y = y / rho
w = wt / psi
z = z / psi

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] alpha magmaDouble_ptr matrix containing all SKP
 [in] p magmaDouble_ptr search directions
 [in,out] x magmaDouble_ptr approximation vector
 [in] queue magma_queue_t Queue to execute in.

Merges the first SpmV using different formats with the dot product and the computation of rho.

Parameters
 [in] A magma_d_matrix input matrix
 [in] d1 magmaDouble_ptr temporary vector
 [in] d2 magmaDouble_ptr temporary vector
 [in] dd magmaDouble_ptr input vector d
 [out] dz magmaDouble_ptr output vector z
 [out] skp magmaDouble_ptr array for parameters ( skp[3]=rho )
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

u = r + beta * q
p = u + beta * (q + beta * p)

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] beta double scalar
 [in] r magmaDouble_ptr vector
 [in] q magmaDouble_ptr vector
 [in,out] u magmaDouble_ptr vector
 [in,out] p magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
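The second update depends on the freshly computed u, so a fused loop has to evaluate the two lines in the order given. A CPU sketch with the hypothetical name `cgs_update_up_ref`, assuming contiguous arrays of num_rows * num_cols entries:

```c
#include <assert.h>
#include <stddef.h>

/* u = r + beta * q, then p = u + beta * (q + beta * p) using the new u. */
void cgs_update_up_ref(int num_rows, int num_cols, double beta,
                       const double *r, const double *q,
                       double *u, double *p)
{
    assert(num_rows >= 0 && num_cols >= 0);
    size_t total = (size_t)num_rows * (size_t)num_cols;
    for (size_t i = 0; i < total; ++i) {
        u[i] = r[i] + beta * q[i];
        p[i] = u[i] + beta * (q[i] + beta * p[i]);
    }
}
```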

Merges multiple operations into one kernel:

u = r
p = r

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] r magmaDouble_ptr vector
 [in,out] u magmaDouble_ptr vector
 [in,out] p magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

q = u - alpha * v_hat
t = u + q

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] alpha double scalar
 [in] v_hat magmaDouble_ptr vector
 [in] u magmaDouble_ptr vector
 [in,out] q magmaDouble_ptr vector
 [in,out] t magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

x = x + alpha * u_hat
r = r - alpha * A * u_hat = r - alpha * t

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] alpha double scalar
 [in] u_hat magmaDouble_ptr vector
 [in] t magmaDouble_ptr vector
 [in,out] x magmaDouble_ptr vector
 [in,out] r magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

dt = drs - dr

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] drs magmaDouble_ptr vector
 [in] dr magmaDouble_ptr vector
 [in,out] dt magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
 magma_int_t magma_didr_smoothing_2 ( magma_int_t num_rows, magma_int_t num_cols, double omega, magmaDouble_ptr dx, magmaDouble_ptr dxs, magma_queue_t queue )

Merges multiple operations into one kernel:

dxs = dxs - omega * (dxs - dx)

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] omega double scalar
 [in] dx magmaDouble_ptr vector
 [in,out] dxs magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
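The smoothing update moves dxs towards dx by a fraction given by the scalar. The sketch below assumes that the scalar in the formula is the `omega` argument of the signature, uses the hypothetical name `idr_smoothing_2_ref`, and operates on contiguous host arrays.

```c
#include <assert.h>
#include <stddef.h>

/* dxs = dxs - omega * (dxs - dx); omega = 1 replaces dxs by dx. */
void idr_smoothing_2_ref(int num_rows, int num_cols, double omega,
                         const double *dx, double *dxs)
{
    assert(num_rows >= 0 && num_cols >= 0);
    size_t total = (size_t)num_rows * (size_t)num_cols;
    for (size_t i = 0; i < total; ++i)
        dxs[i] = dxs[i] - omega * (dxs[i] - dx[i]);
}
```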

Merges multiple operations into one kernel:

v = y / rho
y = y / rho
w = wt / psi
z = z / psi

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] rho double scalar
 [in] psi double scalar
 [in,out] y magmaDouble_ptr vector
 [in,out] z magmaDouble_ptr vector
 [in,out] v magmaDouble_ptr vector
 [in,out] w magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

p = y - pde * p
q = z - rde * q

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] pde double scalar
 [in] rde double scalar
 [in] y magmaDouble_ptr vector
 [in] z magmaDouble_ptr vector
 [in,out] p magmaDouble_ptr vector
 [in,out] q magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
 magma_int_t magma_dqmr_3 ( magma_int_t num_rows, magma_int_t num_cols, double beta, magmaDouble_ptr pt, magmaDouble_ptr v, magmaDouble_ptr y, magma_queue_t queue )

Merges multiple operations into one kernel:

v = pt - beta * v
y = v

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] beta double scalar
 [in] pt magmaDouble_ptr vector
 [in,out] v magmaDouble_ptr vector
 [in,out] y magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

d = eta * p;
s = eta * pt;
x = x + d;
r = r - s;

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] eta double scalar
 [in] p magmaDouble_ptr vector
 [in] pt magmaDouble_ptr vector
 [in,out] d magmaDouble_ptr vector
 [in,out] s magmaDouble_ptr vector
 [in,out] x magmaDouble_ptr vector
 [in,out] r magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
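The four statements form a short dependency chain: d and s are computed first and then used to update x and r. A fused CPU sketch, under the assumption of contiguous arrays of num_rows * num_cols entries and with the hypothetical name `qmr_update_ref`:

```c
#include <assert.h>
#include <stddef.h>

/* d = eta * p; s = eta * pt; x = x + d; r = r - s, entry by entry. */
void qmr_update_ref(int num_rows, int num_cols, double eta,
                    const double *p, const double *pt,
                    double *d, double *s, double *x, double *r)
{
    assert(num_rows >= 0 && num_cols >= 0);
    size_t total = (size_t)num_rows * (size_t)num_cols;
    for (size_t i = 0; i < total; ++i) {
        d[i] = eta * p[i];
        s[i] = eta * pt[i];
        x[i] = x[i] + d[i];
        r[i] = r[i] - s[i];
    }
}
```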

Merges multiple operations into one kernel:

d = eta * p + pds * d;
s = eta * pt + pds * s;
x = x + d;
r = r - s;

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] eta double scalar
 [in] pds double scalar
 [in] p magmaDouble_ptr vector
 [in] pt magmaDouble_ptr vector
 [in,out] d magmaDouble_ptr vector
 [in,out] s magmaDouble_ptr vector
 [in,out] x magmaDouble_ptr vector
 [in,out] r magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

wt = wt - conj(beta) * w
v = y / rho
y = y / rho
w = wt / psi
z = wt / psi

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] beta double scalar
 [in] rho double scalar
 [in] psi double scalar
 [in,out] y magmaDouble_ptr vector
 [in,out] z magmaDouble_ptr vector
 [in,out] v magmaDouble_ptr vector
 [in,out] w magmaDouble_ptr vector
 [in,out] wt magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
 magma_int_t magma_dqmr_7 ( magma_int_t num_rows, magma_int_t num_cols, double beta, magmaDouble_ptr pt, magmaDouble_ptr v, magmaDouble_ptr vt, magma_queue_t queue )

Merges multiple operations into one kernel:

vt = pt - beta * v

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] beta double scalar
 [in] pt magmaDouble_ptr vector
 [in,out] v magmaDouble_ptr vector
 [in,out] vt magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

v = y / rho
y = y / rho
w = wt / psi
z = z / psi

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] rho double scalar
 [in] psi double scalar
 [in] vt magmaDouble_ptr vector
 [in] wt magmaDouble_ptr vector
 [in,out] y magmaDouble_ptr vector
 [in,out] z magmaDouble_ptr vector
 [in,out] v magmaDouble_ptr vector
 [in,out] w magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

u_mp1 = u_mp1 - alpha*v;
w = w - alpha*Au;
d = pu_m + sigma*d;
Ad = Au + sigma*Ad;

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] alpha double scalar
 [in] sigma double scalar
 [in] v magmaDouble_ptr vector
 [in] Au magmaDouble_ptr vector
 [in,out] u_m magmaDouble_ptr vector
 [in,out] pu_m magmaDouble_ptr vector
 [in,out] u_mp1 magmaDouble_ptr vector
 [in,out] w magmaDouble_ptr vector
 [in,out] d magmaDouble_ptr vector
 [in,out] Ad magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.

Merges multiple operations into one kernel:

x = x + eta * d
r = r - eta * Ad

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] eta double scalar
 [in] d magmaDouble_ptr vector
 [in] Ad magmaDouble_ptr vector
 [in,out] x magmaDouble_ptr vector
 [in,out] r magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
 magma_int_t magma_dtfqmr_3 ( magma_int_t num_rows, magma_int_t num_cols, double beta, magmaDouble_ptr w, magmaDouble_ptr u_m, magmaDouble_ptr u_mp1, magma_queue_t queue )

Merges multiple operations into one kernel:

u_mp1 = w + beta*u_mp1

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] beta double scalar
 [in] w magmaDouble_ptr vector
 [in] u_m magmaDouble_ptr vector
 [in,out] u_mp1 magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
 magma_int_t magma_dtfqmr_4 ( magma_int_t num_rows, magma_int_t num_cols, double beta, magmaDouble_ptr Au_new, magmaDouble_ptr v, magmaDouble_ptr Au, magma_queue_t queue )

Merges multiple operations into one kernel:

v = Au_new + beta*(Au+beta*v); Au = Au_new

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] beta double scalar
 [in] Au_new magmaDouble_ptr vector
 [in,out] v magmaDouble_ptr vector
 [in,out] Au magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
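The order of the two statements matters here: v has to be formed from the old Au before Au is overwritten with Au_new. A CPU sketch that keeps this order, with the hypothetical name `tfqmr_4_ref` and contiguous host arrays assumed:

```c
#include <assert.h>
#include <stddef.h>

/* v = Au_new + beta * (Au + beta * v), then Au = Au_new. */
void tfqmr_4_ref(int num_rows, int num_cols, double beta,
                 const double *Au_new, double *v, double *Au)
{
    assert(num_rows >= 0 && num_cols >= 0);
    size_t total = (size_t)num_rows * (size_t)num_cols;
    for (size_t i = 0; i < total; ++i) {
        v[i]  = Au_new[i] + beta * (Au[i] + beta * v[i]); /* reads old Au */
        Au[i] = Au_new[i];
    }
}
```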

Merges multiple operations into one kernel:

w = w - alpha*Au;
d = pu_m + sigma*d;
Ad = Au + sigma*Ad;

Parameters
 [in] num_rows magma_int_t dimension m
 [in] num_cols magma_int_t dimension n
 [in] alpha double scalar
 [in] sigma double scalar
 [in] v magmaDouble_ptr vector
 [in] Au magmaDouble_ptr vector
 [in,out] u_mp1 magmaDouble_ptr vector
 [in,out] w magmaDouble_ptr vector
 [in,out] d magmaDouble_ptr vector
 [in,out] Ad magmaDouble_ptr vector
 [in] queue magma_queue_t Queue to execute in.
 magma_int_t magma_dparic_csr ( magma_d_matrix A, magma_d_matrix A_CSR, magma_queue_t queue )

This routine iteratively computes an incomplete Cholesky factorization.

The idea is according to Edmond Chow's presentation at SIAM 2014. This routine was used in the ISC 2015 paper: E. Chow et al.: 'Study of an Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs'

The input format of the initial guess matrix A is Magma_CSRCOO, A_CSR is CSR or CSRCOO format.

Parameters
 [in] A magma_d_matrix input matrix A - initial guess (lower triangular)
 [in,out] A_CSR magma_d_matrix input/output matrix containing the IC approximation
 [in] queue magma_queue_t Queue to execute in.
 magma_int_t magma_dparilu_csr ( magma_d_matrix A, magma_d_matrix L, magma_d_matrix U, magma_queue_t queue )

This routine iteratively computes an incomplete LU factorization.

The idea is according to Edmond Chow's presentation at SIAM 2014. This routine was used in the ISC 2015 paper: E. Chow et al.: 'Study of an Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs'

The input format of the matrix is Magma_CSRCOO for the upper and lower triangular parts. Note, however, that the column and row indices are flipped for the U part. Every component of L and U is handled by one thread.

Parameters
 [in] A magma_d_matrix input matrix A determining initial guess & processing order
 [in,out] L magma_d_matrix input/output matrix L containing the ILU approximation
 [in,out] U magma_d_matrix input/output matrix U containing the ILU approximation
 [in] queue magma_queue_t Queue to execute in.
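The iterative factorization can be pictured as repeated sweeps in which every entry of L and U inside the sparsity pattern is recomputed from the current values of the other entries. The sketch below is a sequential CPU illustration of such a sweep, not the MAGMA kernel: it uses dense row-major storage for readability, treats zero entries of A as lying outside the pattern, assumes a unit diagonal in L, and the name `parilu_sweep` is hypothetical. Updating the entries in place and in row order is only one of the orderings that a one-thread-per-component GPU execution could produce.

```c
#include <assert.h>

/* One fixed-point sweep over the nonzero pattern of the n-by-n matrix A.
 *   i >  j:  L(i,j) = (A(i,j) - sum_{k<j} L(i,k) * U(k,j)) / U(j,j)
 *   i <= j:  U(i,j) =  A(i,j) - sum_{k<i} L(i,k) * U(k,j)
 * The caller provides the initial guess and sets the diagonal of L to 1. */
void parilu_sweep(int n, const double *A, double *L, double *U)
{
    assert(n > 0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (A[i * n + j] == 0.0) continue;  /* outside the pattern */
            int kmax = (i < j) ? i : j;
            double s = A[i * n + j];
            for (int k = 0; k < kmax; ++k)
                s -= L[i * n + k] * U[k * n + j];
            if (i > j) L[i * n + j] = s / U[j * n + j];
            else       U[i * n + j] = s;
        }
    }
}
```

A typical initial guess takes the strictly lower triangle of A (plus a unit diagonal) for L and the upper triangle of A for U, then applies a small number of sweeps.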