## matrix vector multiplication

Open discussion regarding features, bugs, issues, vendors, etc.

### matrix vector multiplication

Hi,
I'm looking for the function that performs matrix-vector multiplication in place, meaning the result should be of the form x <-- A*x, where x is a vector and A a matrix.
DGEMV requires the result of the multiplication to go into a separate vector (y <-- alpha*A*x + beta*y).
Thanks!
haydiamet

Posts: 1
Joined: Thu Feb 05, 2015 12:58 pm

### Re: matrix vector multiplication

That does not exist in BLAS.
You have to create it yourself (or use dgemv with another vector).
zerothi

Posts: 16
Joined: Thu Apr 30, 2015 8:42 am

### Re: matrix vector multiplication

zerothi wrote:That does not exist in BLAS.
You have to create it yourself (or use dgemv with another vector).

Hello,
How would I create it myself? I'm not sure what you mean by dgemv "with another vector".
franki

Posts: 1
Joined: Thu Aug 20, 2015 2:33 am

### Re: matrix vector multiplication

Use dgemv; it is by far the easiest and fastest solution.
zerothi

Posts: 16
Joined: Thu Apr 30, 2015 8:42 am

### Re: matrix vector multiplication

Quick explanation of why there is no subroutine in the BLAS that computes x <- A*x
for a dense matrix A.

Observation: we cannot do the operation x <- A*x in place. We have to copy x into a temporary
vector y and then perform x <- A*y. This is because, once we have computed the output value x(i) (for any i), we can no longer access the input value x(i),
which is still needed to compute the other outputs x(j) (for j != i). So we cannot do x <- A*x in place.

So this leaves us with two choices when designing a BLAS subroutine. (1) Either stick
with the idea that we need a subroutine such that x <- A*x; this
means that the BLAS routine would have to allocate a temporary
workspace for y, copy y <- x, call the existing and useful x <- A*y,
then free y. (2) Or only provide x <- A*y and leave the tasks of
allocation, copy, and free to the user. The choice was (2) over (1), since it is easy
enough to build (1) from (2).

Comment #1: If A is triangular, then it is possible to implement x <- A*x in
place. If A is upper triangular, you first compute x(1) using x(1), x(2), ...,
x(n). And this is OK since x(1) is not needed to compute x(2), ..., x(n). Then
you compute x(2), then x(3), etc. This explains why the TRMV routine works in
place by design.
Comment #2: This explanation for GEMV and TRMV also holds for GEMM and TRMM. There
is no subroutine in the BLAS to perform A <- A*B; there is only GEMM of the
type A <- B*C. But there is TRMM, which computes A <- A*B or A <- B*A when B is
triangular. TRMM is in place by design.

Comment #3: If you really are short on memory and have to implement, say,
A <- A*B where A is m-by-n, B is n-by-n, and n << m, you can perform the LU
factorization of B, B = P*L*U, and then compute A <- ((A*P)*L)*U. Each of these
three products (a column permutation, then two triangular multiplications) can
be done in place, so you do not need to allocate an extra m-by-n workspace for
a temporary C: you can actually do A <- A*B in place. This is an extreme trick.
I did use it a few times in some iterative solvers where memory was scarce. For
example, with n of the order of 10 and m of the order of 10^9, the trade-off is
either performing the LU factorization of the 10-by-10 matrix B or allocating a
10^9-by-10 temporary array. Yes, you may worry a little about stability with
LU, in which case you can use a more stable factorization for B, such as the QR
factorization. That works similarly.

Julien
Julien Langou

Posts: 835
Joined: Thu Dec 09, 2004 12:32 pm
Location: Denver, CO, USA