PLASMA  2.4.5
PLASMA - Parallel Linear Algebra for Scalable Multi-core Architectures
core_zparfb.c File Reference
#include <cblas.h>
#include <lapacke.h>
#include "common.h"

Functions

int CORE_zparfb (int side, int trans, int direct, int storev, int M1, int N1, int M2, int N2, int K, int L, PLASMA_Complex64_t *A1, int LDA1, PLASMA_Complex64_t *A2, int LDA2, PLASMA_Complex64_t *V, int LDV, PLASMA_Complex64_t *T, int LDT, PLASMA_Complex64_t *WORK, int LDWORK)

Detailed Description

PLASMA core_blas kernel. PLASMA is a software package provided by Univ. of Tennessee, Univ. of California Berkeley and Univ. of Colorado Denver.

Version:
2.4.5
Author:
Dulceneia Becker
Date:
2011-06-14
Precisions:
normal z -> c d s

Definition in file core_zparfb.c.


Function Documentation

int CORE_zparfb ( int  side,
int  trans,
int  direct,
int  storev,
int  M1,
int  N1,
int  M2,
int  N2,
int  K,
int  L,
PLASMA_Complex64_t * A1,
int  LDA1,
PLASMA_Complex64_t * A2,
int  LDA2,
PLASMA_Complex64_t * V,
int  LDV,
PLASMA_Complex64_t * T,
int  LDT,
PLASMA_Complex64_t * WORK,
int  LDWORK 
)

CORE_zparfb applies a complex upper triangular block reflector H or its conjugate transpose H' to a complex rectangular matrix formed by coupling two tiles A1 and A2. The matrix V is stored as follows:

        COLUMNWISE                          ROWWISE

       |     K     |                 |      N2-L     |   L  |
    __ _____________ __           __ _________________        __
       |    |      |                 |               | \
       |    |      |                 |               |   \    L
  M2-L |    |      |              K  |_______________|_____\  __
       |    |      | M2              |                      |
    __ |____|      |                 |                      | K-L
       \    |      |              __ |______________________| __
     L   \  |      |
    __     \|______| __              |          N2          |

       | L  | K-L  |
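To make the columnwise layout concrete, the following is a minimal sketch of a membership test for the stored region of V. It assumes one reading of the diagram: V is an M2-by-K array whose top M2-L rows are full and whose bottom L rows keep only the entries on or to the right of the slanted edge. The helper name v_colwise_is_stored is hypothetical and is not part of PLASMA; indices are 0-based.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of PLASMA): returns true when entry (i, j)
 * of a columnwise M2-by-K matrix V lies inside the stored pentagon.
 * Assumption: the top M2-L rows are full and the bottom L rows are upper
 * trapezoidal, as read off the diagram. Indices are 0-based. */
static bool v_colwise_is_stored(int i, int j, int M2, int K, int L)
{
    if (i < 0 || i >= M2 || j < 0 || j >= K)
        return false;        /* outside the M2-by-K array */

    int top = M2 - L;        /* number of rows in the rectangular part */
    if (i < top)
        return true;         /* rectangular block: every entry is stored */

    return j >= i - top;     /* trapezoidal block: on or right of the edge */
}
```

With L = 0 every entry of the array is stored; with L = M2 = K the test reduces to the usual upper triangular condition j >= i.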

Parameters:
[in] side
  • PlasmaLeft : apply Q or Q**H from the Left;
  • PlasmaRight : apply Q or Q**H from the Right.
[in] trans
  • PlasmaNoTrans : No transpose, apply Q;
  • PlasmaConjTrans : ConjTranspose, apply Q**H.
[in] direct: Indicates how H is formed from a product of elementary reflectors:
  • PlasmaForward : H = H(1) H(2) . . . H(k) (Forward);
  • PlasmaBackward : H = H(k) . . . H(2) H(1) (Backward).
[in] storev: Indicates how the vectors which define the elementary reflectors are stored:
  • PlasmaColumnwise
  • PlasmaRowwise
[in] M1: The number of rows of the tile A1. M1 >= 0.
[in] N1: The number of columns of the tile A1. N1 >= 0.
[in] M2: The number of rows of the tile A2. M2 >= 0. If side = PlasmaRight, M2 must equal M1.
[in] N2: The number of columns of the tile A2. N2 >= 0. If side = PlasmaLeft, N2 must equal N1.
[in] K: The order of the matrix T (= the number of elementary reflectors whose product defines the block reflector). K >= 0.
[in] L: The size of the triangular part of V.
[in,out] A1: On entry, the M1-by-N1 tile A1. On exit, A1 is overwritten by the application of Q.
[in] LDA1: The leading dimension of the array A1. LDA1 >= max(1,M1).
[in,out] A2: On entry, the M2-by-N2 tile A2. On exit, A2 is overwritten by the application of Q.
[in] LDA2: The leading dimension of the array A2. LDA2 >= max(1,M2).
[in] V: The matrix V, of dimension (LDV,K) if storev = PlasmaColumnwise; (LDV,M2) if storev = PlasmaRowwise and side = PlasmaLeft; (LDV,N2) if storev = PlasmaRowwise and side = PlasmaRight.
[in] LDV: The leading dimension of the array V. If storev = PlasmaColumnwise and side = PlasmaLeft, LDV >= max(1,M2); if storev = PlasmaColumnwise and side = PlasmaRight, LDV >= max(1,N2); if storev = PlasmaRowwise, LDV >= K.
[in] T: The triangular K-by-K matrix T in the representation of the block reflector. T is upper triangular by block (economic storage); the rest of the array is not referenced.
[in] LDT: The leading dimension of the array T. LDT >= K.
[in,out] WORK: Workspace array; its contents are overwritten.
[in] LDWORK: The leading dimension of the array WORK.
Return values:
  • PLASMA_SUCCESS : successful exit;
  • < 0 : if -i, the i-th argument had an illegal value.
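The "-i means the i-th argument" convention can be turned into a readable diagnostic on the caller's side. The sketch below is an illustration only: the table parfb_args and the helper parfb_bad_argument are hypothetical names, and the argument order is copied from the CORE_zparfb signature above. Return values outside -20..-1 (including any other PLASMA error code) are simply reported as NULL here.

```c
#include <stddef.h>

/* Argument names in the order of the CORE_zparfb signature. */
static const char *const parfb_args[] = {
    "side", "trans", "direct", "storev", "M1", "N1", "M2", "N2", "K", "L",
    "A1", "LDA1", "A2", "LDA2", "V", "LDV", "T", "LDT", "WORK", "LDWORK"
};

/* Hypothetical helper (not part of PLASMA): maps a return value -i to the
 * name of the i-th argument. Returns NULL when info is not in -20..-1,
 * for example on success or for an unrelated error code. */
static const char *parfb_bad_argument(int info)
{
    int count = (int)(sizeof parfb_args / sizeof parfb_args[0]);

    if (info >= 0 || info < -count)
        return NULL;
    return parfb_args[-info - 1];
}
```

A caller could print the returned name next to the numeric code when the kernel reports a negative value.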

Definition at line 131 of file core_zparfb.c.

References CBLAS_SADDR, cblas_zaxpy(), cblas_ztrmm(), CblasColMajor, CblasLeft, CblasNonUnit, CblasRight, CblasUpper, CORE_zpamm(), coreblas_error, PLASMA_ERR_NOT_SUPPORTED, PLASMA_SUCCESS, PlasmaA2, PlasmaBackward, PlasmaColumnwise, PlasmaConjTrans, PlasmaForward, PlasmaLeft, PlasmaNoTrans, PlasmaRight, PlasmaRowwise, and PlasmaW.

{
    static PLASMA_Complex64_t zone  =  1.0;
    static PLASMA_Complex64_t mzone = -1.0;

    int j;

    /* Check input arguments */
    if ((side != PlasmaLeft) && (side != PlasmaRight)) {
        coreblas_error(1, "Illegal value of side");
        return -1;
    }
    if ((trans != PlasmaNoTrans) && (trans != PlasmaConjTrans)) {
        coreblas_error(2, "Illegal value of trans");
        return -2;
    }
    if ((direct != PlasmaForward) && (direct != PlasmaBackward)) {
        coreblas_error(3, "Illegal value of direct");
        return -3;
    }
    if ((storev != PlasmaColumnwise) && (storev != PlasmaRowwise)) {
        coreblas_error(4, "Illegal value of storev");
        return -4;
    }
    if (M1 < 0) {
        coreblas_error(5, "Illegal value of M1");
        return -5;
    }
    if (N1 < 0) {
        coreblas_error(6, "Illegal value of N1");
        return -6;
    }
    if ((M2 < 0) ||
        ( (side == PlasmaRight) && (M1 != M2) ) ) {
        coreblas_error(7, "Illegal value of M2");
        return -7;
    }
    if ((N2 < 0) ||
        ( (side == PlasmaLeft) && (N1 != N2) ) ) {
        coreblas_error(8, "Illegal value of N2");
        return -8;
    }
    if (K < 0) {
        coreblas_error(9, "Illegal value of K");
        return -9;
    }

    /* Quick return */
    if ((M1 == 0) || (N1 == 0) || (M2 == 0) || (N2 == 0) || (K == 0))
        return PLASMA_SUCCESS;

    if (direct == PlasmaForward) {

        if (side == PlasmaLeft) {
            /*
             * Column or Rowwise / Forward / Left
             * ----------------------------------
             *
             * Form  H * A  or  H' * A  where  A = ( A1 )
             *                                     ( A2 )
             */

            /* W = A1 + op(V) * A2 */
            CORE_zpamm(
                PlasmaW, PlasmaLeft, storev,
                K, N1, M2, L,
                A1, LDA1,
                A2, LDA2,
                V, LDV,
                WORK, LDWORK);

            /* W = op(T) * W */
            cblas_ztrmm(
                CblasColMajor, CblasLeft, CblasUpper,
                (CBLAS_TRANSPOSE)trans, CblasNonUnit, K, N2,
                CBLAS_SADDR(zone), T, LDT, WORK, LDWORK);

            /* A1 = A1 - W */
            for (j = 0; j < N1; j++) {
                cblas_zaxpy(
                    K, CBLAS_SADDR(mzone),
                    &WORK[LDWORK*j], 1,
                    &A1[LDA1*j], 1);
            }

            /* A2 = A2 - op(V) * W */
            /* W also changes: W = V * W, A2 = A2 - W */
            CORE_zpamm(
                PlasmaA2, PlasmaLeft, storev,
                M2, N2, K, L,
                A1, LDA1,
                A2, LDA2,
                V, LDV,
                WORK, LDWORK);
        }
        else {
            /*
             * Column or Rowwise / Forward / Right
             * -----------------------------------
             *
             * Form  A * H  or  A * H'  where  A = ( A1 A2 )
             *
             */

            /* W = A1 + A2 * op(V) */
            CORE_zpamm(
                PlasmaW, PlasmaRight, storev,
                M1, K, N2, L,
                A1, LDA1,
                A2, LDA2,
                V, LDV,
                WORK, LDWORK);

            /* W = W * op(T) */
            cblas_ztrmm(
                CblasColMajor, CblasRight, CblasUpper,
                (CBLAS_TRANSPOSE)trans, CblasNonUnit, M2, K,
                CBLAS_SADDR(zone), T, LDT, WORK, LDWORK);

            /* A1 = A1 - W */
            for (j = 0; j < K; j++) {
                cblas_zaxpy(
                    M1, CBLAS_SADDR(mzone),
                    &WORK[LDWORK*j], 1,
                    &A1[LDA1*j], 1);
            }

            /* A2 = A2 - W * op(V) */
            /* W also changes: W = W * V', A2 = A2 - W */
            CORE_zpamm(
                PlasmaA2, PlasmaRight, storev,
                M2, N2, K, L,
                A1, LDA1,
                A2, LDA2,
                V, LDV,
                WORK, LDWORK);
        }
    }
    else {
        /* Backward direction is not implemented for either side */
        coreblas_error(3, "Not implemented (Backward / Left or Right)");
        return PLASMA_ERR_NOT_SUPPORTED;
    }

    return PLASMA_SUCCESS;
}
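The four steps of the Left / Forward branch (form W, multiply by op(T), update A1, update A2) can be illustrated without BLAS. The following is a simplified sketch under stated assumptions, not the PLASMA implementation: it works on real doubles instead of PLASMA_Complex64_t, treats V as a dense columnwise M2-by-K matrix (so L is ignored), and uses plain loops in place of CORE_zpamm, cblas_ztrmm and cblas_zaxpy. The names AT and parfb_left_sketch are hypothetical.

```c
#include <stddef.h>

/* Column-major element access: row i, column j, leading dimension ld. */
#define AT(a, i, j, ld) ((a)[(size_t)(i) + (size_t)(j) * (size_t)(ld)])

/* Hypothetical real-valued sketch (not PLASMA code) of the Left / Forward /
 * Columnwise path. V is a dense M2-by-K matrix, A1 supplies K rows, A2
 * supplies M2 rows, and both tiles have N columns. W is a K-by-N workspace.
 * transpose != 0 applies H' instead of H. */
static void parfb_left_sketch(int transpose, int N, int M2, int K,
                              double *A1, int lda1,
                              double *A2, int lda2,
                              const double *V, int ldv,
                              const double *T, int ldt,
                              double *W, int ldw)
{
    for (int j = 0; j < N; j++) {
        /* W(:,j) = A1(0:K,j) + V' * A2(:,j) */
        for (int i = 0; i < K; i++) {
            double s = AT(A1, i, j, lda1);
            for (int p = 0; p < M2; p++)
                s += AT(V, p, i, ldv) * AT(A2, p, j, lda2);
            AT(W, i, j, ldw) = s;
        }

        /* W(:,j) = op(T) * W(:,j), T upper triangular, computed in place */
        if (!transpose) {
            /* Row i of T*W reads rows >= i, so go top to bottom. */
            for (int i = 0; i < K; i++) {
                double s = 0.0;
                for (int p = i; p < K; p++)
                    s += AT(T, i, p, ldt) * AT(W, p, j, ldw);
                AT(W, i, j, ldw) = s;
            }
        } else {
            /* Row i of T'*W reads rows <= i, so go bottom to top. */
            for (int i = K - 1; i >= 0; i--) {
                double s = 0.0;
                for (int p = 0; p <= i; p++)
                    s += AT(T, p, i, ldt) * AT(W, p, j, ldw);
                AT(W, i, j, ldw) = s;
            }
        }

        /* A1(0:K,j) = A1(0:K,j) - W(:,j) */
        for (int i = 0; i < K; i++)
            AT(A1, i, j, lda1) -= AT(W, i, j, ldw);

        /* A2(:,j) = A2(:,j) - V * W(:,j) */
        for (int p = 0; p < M2; p++) {
            double s = 0.0;
            for (int i = 0; i < K; i++)
                s += AT(V, p, i, ldv) * AT(W, i, j, ldw);
            AT(A2, p, j, lda2) -= s;
        }
    }
}
```

Unlike the kernel, which overwrites WORK again during the A2 update, this sketch leaves op(T) * W in the workspace. In the complex case the transposes above become conjugate transposes.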
