PLASMA
2.4.5
PLASMA - Parallel Linear Algebra for Scalable Multi-core Architectures
Functions
void CORE_dtrdalg (PLASMA_enum uplo, int N, int NB, PLASMA_desc *pA, double *V, double *TAU, int i, int j, int m, int grsiz)
void QUARK_CORE_dtrdalg (Quark *quark, Quark_Task_Flags *task_flags, int uplo, int N, int NB, PLASMA_desc *A, double *V, double *TAU, int i, int j, int m, int grsiz, int BAND, int *PCOL, int *ACOL, int *MCOL)
void CORE_dtrdalg_quark (Quark *quark)
PLASMA core_blas kernel. PLASMA is a software package provided by Univ. of Tennessee, Univ. of California Berkeley and Univ. of Colorado Denver.
Definition in file core_dtrdalg.c.
void CORE_dtrdalg (PLASMA_enum uplo, int N, int NB, PLASMA_desc *pA, double *V, double *TAU, int i, int j, int m, int grsiz)
CORE_dtrdalg is part of the tridiagonal reduction algorithm (bulge chasing). It corresponds to a local driver of the kernels that should be executed on a single core.
[in] | uplo |
[in] | N | The order of the matrix A. N >= 0. |
[in] | NB | The bandwidth of the matrix A, which corresponds to the tile size. NB >= 0. |
[in] | pA | A pointer to the descriptor of the matrix A. |
[out] | V | double array, dimension (N). The elementary reflectors are written to this array, so it serves as a workspace for V at each step of the bulge chasing algorithm. |
[out] | TAU | double array, dimension (N). The scalar factors of the elementary reflectors are written to this array, so it serves as a workspace for TAU at each step of the bulge chasing algorithm. |
[in] | i | Integer that refers to the current sweep (outer loop). |
[in] | j | Integer that refers to the sweep to chase (inner loop). |
[in] | m | Integer that refers to a sweep step, used to enforce ordering dependencies. |
[in] | grsiz | Integer that refers to the size of a group. A group is the number of kernels that should be executed sequentially on the same core. The group size is a trade-off between locality (cache reuse) and parallelism: a small group size increases parallelism, while a large group size increases cache reuse. |
PLASMA_SUCCESS | successful exit |
<0 | if -i, the i-th argument had an illegal value |
Definition at line 82 of file core_dtrdalg.c.
References A, CORE_dhbelr(), CORE_dhblrx(), CORE_dhbrce(), plasma_desc_t::dtyp, min, and plasma_element_size().
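The grsiz trade-off described above can be made concrete with a small sketch. It is not taken from core_dtrdalg.c: the helpers num_groups and group_range and the step count nsteps are hypothetical names, used only to show how a group size splits a sequence of kernel steps into chunks that would each run sequentially on one core.

```c
#include <assert.h>

/* Hypothetical helper (not part of PLASMA): number of groups needed to
 * cover nsteps kernel steps when each group holds up to grsiz steps. */
static int num_groups(int nsteps, int grsiz)
{
    assert(nsteps >= 0 && grsiz > 0);
    return (nsteps + grsiz - 1) / grsiz;   /* ceiling division */
}

/* Hypothetical helper: half-open step range [*first, *last) of group g.
 * Steps inside one group would run back to back on the same core. */
static void group_range(int g, int nsteps, int grsiz, int *first, int *last)
{
    assert(g >= 0 && g < num_groups(nsteps, grsiz));
    *first = g * grsiz;
    *last  = *first + grsiz;
    if (*last > nsteps)
        *last = nsteps;                    /* the final group may be short */
}
```

For 10 steps, grsiz = 1 yields 10 independent groups (most parallelism), grsiz = 4 yields 3 groups with the last one covering steps 8 and 9, and grsiz = 10 yields a single group (most cache reuse).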
void CORE_dtrdalg_quark (Quark *quark)
Definition at line 160 of file core_dtrdalg.c.
References CORE_dtrdalg(), quark_unpack_args_10, TAU, uplo, and V.
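The references above suggest that this wrapper unpacks ten task arguments and forwards them to CORE_dtrdalg(). The following is a minimal sketch of that unpack-and-forward pattern in general; it does not use the QUARK API, and demo_args, demo_kernel and demo_kernel_task are hypothetical names with a reduced argument list chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical argument record: NOT the QUARK or PLASMA layout. */
typedef struct {
    int N, NB;
    double *V, *TAU;
    int i, j, m, grsiz;
} demo_args;

/* Stand-in for the computational kernel. It only records its inputs
 * so that the forwarding can be observed by the caller. */
static void demo_kernel(int N, int NB, double *V, double *TAU,
                        int i, int j, int m, int grsiz)
{
    assert(N > 0 && NB > 0);
    V[0]   = (double)(i + j + m);
    TAU[0] = (double)grsiz;
}

/* Task wrapper: copy the argument record out of an opaque buffer
 * ("unpack"), then call the kernel with the individual arguments. */
static void demo_kernel_task(const void *packed)
{
    demo_args a;
    memcpy(&a, packed, sizeof a);
    demo_kernel(a.N, a.NB, a.V, a.TAU, a.i, a.j, a.m, a.grsiz);
}
```

Keeping the kernel free of any scheduler types means it can be called directly for testing, while the thin wrapper is the only piece tied to the runtime's argument format.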
void QUARK_CORE_dtrdalg (Quark *quark, Quark_Task_Flags *task_flags, int uplo, int N, int NB, PLASMA_desc *A, double *V, double *TAU, int i, int j, int m, int grsiz, int BAND, int *PCOL, int *ACOL, int *MCOL)
Definition at line 126 of file core_dtrdalg.c.
References CORE_dtrdalg_quark(), INPUT, LOCALITY, NODEP, OUTPUT, QUARK_Insert_Task(), and VALUE.