PLASMA  2.4.5
PLASMA - Parallel Linear Algebra for Scalable Multi-core Architectures
 All Data Structures Namespaces Files Functions Variables Typedefs Enumerations Enumerator Macros Groups
core_dgetrf_rectil.c File Reference
#include <math.h>
#include <cblas.h>
#include <lapacke.h>
#include "common.h"
Include dependency graph for core_dgetrf_rectil.c:

Go to the source code of this file.

Macros

#define A(m, n)   BLKADDR(A, double, m, n)
#define AMAX1BUF_SIZE   (48 << 1)

Functions

void CORE_dgetrf_rectil_init (void)
int CORE_dgetrf_rectil (const PLASMA_desc A, int *IPIV, int *info)
void QUARK_CORE_dgetrf_rectil (Quark *quark, Quark_Task_Flags *task_flags, PLASMA_desc A, double *Amn, int size, int *IPIV, PLASMA_sequence *sequence, PLASMA_request *request, PLASMA_bool check_info, int iinfo, int nbthread)
void CORE_dgetrf_rectil_quark (Quark *quark)

Detailed Description

PLASMA core_blas kernel PLASMA is a software package provided by Univ. of Tennessee, Univ. of California Berkeley and Univ. of Colorado Denver

Version:
2.4.5
Author:
Hatem Ltaief
Mathieu Faverge
Piotr Luszczek
Date:
2009-11-15

d Tue Nov 22 14:35:21 2011

Definition in file core_dgetrf_rectil.c.


Macro Definition Documentation

#define A (   m,
 
)    BLKADDR(A, double, m, n)

Definition at line 24 of file core_dgetrf_rectil.c.

#define AMAX1BUF_SIZE   (48 << 1)

CORE_dgetf2 computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.

WARNING: You cannot call this kernel on different matrices at the same time

The factorization has the form

A = P * L * U

where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking LAPACK Level 2 BLAS version of the algorithm.

Parameters:
[in]MThe number of rows of the matrix A. M >= 0.
[in]NThe number of columns of the matrix A. N >= 0.
[in,out]AOn entry, the m by n matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]LDAThe leading dimension of the array A. LDA >= max(1,M).
[out]IPIVThe pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]INFO= k if U(k,k) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
Returns:
Return values:
PLASMA_SUCCESSsuccessful exit
-k,thek-th argument had an illegal value

Definition at line 85 of file core_dgetrf_rectil.c.


Function Documentation

int CORE_dgetrf_rectil ( const PLASMA_desc  A,
int *  IPIV,
int *  info 
)

Definition at line 653 of file core_dgetrf_rectil.c.

References coreblas_error, plasma_desc_t::m, min, plasma_desc_t::mt, plasma_desc_t::n, and plasma_desc_t::nt.

{
int ft, lt;
int thidx = info[1];
int thcnt = min( info[2], A.mt );
int minMN = min( A.m, A.n );
double pivot;
if ( A.nt > 1 ) {
coreblas_error(1, "Illegal value of A.nt");
return -1;
}
if ( thidx >= thcnt )
return 0;
int q = A.mt / thcnt;
int r = A.mt % thcnt;
if (thidx < r) {
q++;
ft = thidx * q;
lt = ft + q;
} else {
ft = r * (q + 1) + (thidx - r) * q;
lt = ft + q;
lt = min( lt, A.mt );
}
info[0] = 0;
CORE_dgetrf_rectil_rec( A, IPIV, info, &pivot,
thidx, thcnt, 0, minMN, ft, lt);
if ( A.n > minMN ) {
CORE_dgetrf_rectil_update( A, IPIV,
0, minMN, A.n-minMN,
thidx, thcnt,
ft, lt);
}
return info[0];
}

Here is the caller graph for this function:

void CORE_dgetrf_rectil_init ( void  )

Definition at line 92 of file core_dgetrf_rectil.c.

References AMAX1BUF_SIZE.

{
int i;
for (i = 0; i < AMAX1BUF_SIZE; ++i) CORE_damax1buf[i] = -1.0;
sfmin = LAPACKE_dlamch_work('S');
}

Here is the caller graph for this function:

void CORE_dgetrf_rectil_quark ( Quark quark)

Definition at line 726 of file core_dgetrf_rectil.c.

References A, CORE_dgetrf_rectil(), IPIV, plasma_sequence_flush(), PLASMA_SUCCESS, QUARK_Get_RankInTask(), and quark_unpack_args_8.

{
double *Amn;
int *IPIV;
PLASMA_sequence *sequence;
PLASMA_request *request;
PLASMA_bool check_info;
int iinfo;
int info[3];
int maxthreads;
quark_unpack_args_8(quark, A, Amn, IPIV, sequence, request,
check_info, iinfo, maxthreads );
info[1] = QUARK_Get_RankInTask(quark);
info[2] = maxthreads;
CORE_dgetrf_rectil( A, IPIV, info );
if (info[1] == 0 && info[0] != PLASMA_SUCCESS && check_info)
plasma_sequence_flush(quark, sequence, request, iinfo + info[0] );
}

Here is the call graph for this function:

Here is the caller graph for this function:

void QUARK_CORE_dgetrf_rectil ( Quark quark,
Quark_Task_Flags task_flags,
PLASMA_desc  A,
double *  Amn,
int  size,
int *  IPIV,
PLASMA_sequence sequence,
PLASMA_request request,
PLASMA_bool  check_info,
int  iinfo,
int  nbthread 
)

Definition at line 699 of file core_dgetrf_rectil.c.

References CORE_dgetrf_rectil_quark(), DAG_CORE_GETRF, INOUT, plasma_desc_t::n, OUTPUT, QUARK_Insert_Task(), and VALUE.

{
sizeof(PLASMA_desc), &A, VALUE,
sizeof(double)*size, Amn, INOUT,
sizeof(int)*A.n, IPIV, OUTPUT,
sizeof(PLASMA_sequence*), &sequence, VALUE,
sizeof(PLASMA_request*), &request, VALUE,
sizeof(PLASMA_bool), &check_info, VALUE,
sizeof(int), &iinfo, VALUE,
sizeof(int), &nbthread, VALUE,
0);
}

Here is the call graph for this function:

Here is the caller graph for this function: