Page **1** of **1**

### Complex version of ?sfrk

Posted:

**Tue Mar 03, 2015 10:46 am**
by **jdegreef**

Hi,

I am looking for a complex version of the rank k update ?sfrk, like dsfrk is available for doubles. I would have assumed this would be available as zsfrk, but it is not.

I did find zhfrk, but this performs a Hermatian operation which is distinctly different from a traditional transpose. I understand that dhfrk does not exist due to the banality of conjugating doubles, but zsfrk (if it existed) would have a distinct mathematical sense and application.

I could simply use the full matrix format zsyrk, but then I miss out on fully packed formats. I just think it's weird that the type-independent uniform interface was not respected for this routine, so I wanted to ask about it. Maybe I am missing something after all.

Moreover, if possible I was looking into a rank 2k update with two different alpha's (no longer making it a rank 2k update but anyway).

C = alpha x A' x B + beta x B' x A + gamma x C

If this formulation is in some way available through other function calls or combinations, I would be very grateful

Jonas

### Re: Complex version of ?sfrk

Posted:

**Fri Mar 20, 2015 1:46 pm**
by **Julien Langou**

Hi Jonas,

I am looking for a complex version of the rank k update ?sfrk, like dsfrk is available for doubles. I would have assumed this would be available as zsfrk, but it is not.

It is great to hear about interest in the RFP format subroutines!

We added RFP in LAPACK 3.2 back in 2008. It was not clear there would be any traction at the time. Initially we planned on only releasing Cholesky in RFP format, so five subroutines: CPFTRF/CPFTRI/CPFTRS/CFTSM/CTFTRI. We ended up needed a bunch of auxilary routines (6 conversion subroutines and 1 subroutine for norm computation). And, since I felt it could be useful, I wrote ZHFRK and DSRFK and added these to LAPACK as well.

I never went down to write ZSRFK. Would that be useful for you? What would be the application? (I can see quite a few, I would like to know your scope of applications.) We can definitely think about adding this subroutine in LAPACK. The thing is that we do not have any support for complex symmetric matrix in RFP format (ZSF does not exist at all!), so it is not clear to me why we would have a stand-alone subroutine that outputs a complex symmetric matrix in RFP format (ZSFxxx). (Since we have nothing for processing its output.) It seems to me like it is not just one subroutine that you need. Can you develop some more? (We do not have ZSTRYF/ZSYTRS/ZSYTRI in RFP format, so why ZSFRK?)

On a similar note, we do not have DSTRYF/DSYTRS/DSYTRI in RFP format either.

Moreover, if possible I was looking into a rank 2k update with two different alpha's (no longer making it a rank 2k update but anyway).

C = alpha x A' x B + beta x B' x A + gamma x C

(1) Why is this not a rank 2k update? It is a rank 2k, isn't it?

(2) Is C Hermitian / Symmetric in input? Is beta the conjugate of alpha? The thing is that if C is Hermitian in input, we need beta, conjugate of alpha, to preserve Hermitian-ity. This is what ZHER2K does.

(3) You can clearly do this with two ZGEMMs. But I guess this is not answering your question.

It would be great if you can follow up on these two items.

Cheers,

Julien.

### Re: Complex version of ?sfrk

Posted:

**Mon Mar 23, 2015 5:39 am**
by **jdegreef**

Hi Julien,

The application is the creation of elementary matrices for a FEM system. Depending on the complexity of the degrees of freedom of the elements you want to solve, we found that -given an increasing complexity- we are rapidly approaching a threshold where LAPACK routines provide a very significant speedup compared to simply performing the matrix operations explicitly.

Due to the nature of our system, these matrices are often complex. Moreover, we compute such matrices in parallel (since they are independent), so we need to be careful about memory. RFP provides us both with a very fast as well as a "as condense as possible" format. In essence, it allows us to compute twice as many elements in parallel on a memory-bound machine.

We do not particularly need other RFP routines, since after the assembly step, the rest of the process is handled by a dedicated solver for these FEM systems.

As for your other questions:

1) I thought it wasn't a rank 2k update, since it requires different scalars, namely alpha and beta. I might be wrong though, but I thought for a rank 2k we needed alpha = beta. In my particular case, alpha = -beta.

2) C is therefore symmetric up to a real sign, so not Hermetian. Since the actual cost is in computing the values (and the sign is a given symmetry aspect), we currently calculate it as if it were a symmetric matrix.

3) ZGEMM has the obvious issue that it computes the full matrix instead of just half.

Jonas

Edit: This forum is flagging this message as spam?

### Re: Complex version of ?sfrk

Posted:

**Wed Mar 25, 2015 12:38 pm**
by **Julien Langou**

[Jonas wrote] we found that -given an increasing complexity- we are rapidly approaching a threshold where LAPACK routines provide a very significant speedup compared to simply performing the matrix operations explicitly.

Out of curiosity: which LAPACK routines are you using?

[Jonas wrote] I thought

C = alpha * A^T * B + beta * B^T * A + gamma * C

wasn't a rank 2k update, since it requires different scalars, namely alpha and beta. I might be wrong though, but I thought for a rank 2k we needed alpha = beta. In my particular case, alpha = -beta.

If C is N-by-N, A and B are K-by-N, and if alpha and beta are scalars, then an update of the C of the form

C = alpha * A^T * B + beta * B^T * A + gamma * C

is at most of rank 2K. The rank of [ alpha * A^T * B ] is at most K. The rank of [ beta * B^T * A ] is at most K. The rank of [ alpha * A^T * B + beta * B^T * A ] is at most 2K.

[Jonas wrote] In my particular case, alpha = -beta.

So here how I understand it.

[ ZSYR2K ] If C is complex symmetric (C = C^T) in input and, if alpha and beta are complex scalars, and if we perform

C = alpha * A^T * B + alpha * B^T * A + gamma * C

then C is complex symmetric in output. Complex symmetric in input, complex symmetric in output: makes sense.

[ ZHER2K ] If C is Hermitian (C = C^H) in input and, if alpha is a complex scalar, and, if beta is a real scalar, and if we perform

C = alpha * A^H *B + conjg( alpha ) * B^H * A + beta * C,

then C is Hermitian in output. Hermitian in input, Hermitian in output: makes sense.

So, in your case, I understand that you want to do:

C = alpha * A^T * B - alpha * B^T * A + gamma * C

(Is that correct?). So that would makes sense for C being complex antisymmetric (C = -C^T) for me. (Is that correct?)

[Jonas wrote] C is therefore symmetric up to a real sign, so not Hermitian.

"By symmetric up to a real sign," do you mean complex antisymmetric (C = -C^T) then?

Are the diagonal entries of C zeros? I am asking because a complex antisymmetric (C = -C^T) would have a zero diagonal.

[Jonas wrote] Since the actual cost is in computing the values (and the sign is a given symmetry aspect), we currently calculate it as if it were a symmetric matrix.

This makes sense.

[Jonas wrote] This forum is flagging this message as spam.

The SPAM filter is on the sensitive side!

Julien.

### Re: Complex version of ?sfrk

Posted:

**Wed Mar 25, 2015 12:51 pm**
by **Julien Langou**

Hi Jonas,

I would like to make sure I understand your request correctly. If you can answer the question in the previous post, that would be great and help me get the context of all this.

I am trying to understand your request. So you would like the following subroutine:

- Code: Select all
` SUBROUTINE ZSFRK( TRANSR, UPLO, TRANS, N, K, ALPHA, A, LDA, BETA, C )`

Where TRANSR is either

'N': The Normal Form of RFP C is stored;

'T': The Transpose Form of RFP C is stored.

Where UPLO is either 'U' or 'L'

Where TRANS is either 'N' or 'T'

Where we understand what N, K, ALPHA, A, LDA and BETA are

Where complex symmetric C is a matrix in RFP format

ZSFRK performs one of the symmetric rank K operations

either C = alpha * A * A^T + beta * C,

or C = alpha * A^T * A + beta * C.

Writing such a subroutine should not be too hard. We can start from ZHFRK and we would replace

- Code: Select all
` all ZGEMM( 'C', 'N', . . . ) by ZGEMM( 'T', 'N', . . . )`

all ZGEMM( 'N', 'C', . . . ) by ZGEMM( 'N', 'T', . . . )

all ZHERK( '?', 'C', . . . ) by ZSYRK( '?', 'T', . . . )

all ZHERK( '?', 'N', . . . ) by ZSYRK( '?', 'N', . . . )

and I think that would do it.

Please let me know if the interface makes sense to you and if this is what you would need.

Cheers,

Julien.

### Re: Complex version of ?sfrk

Posted:

**Mon Mar 30, 2015 4:04 am**
by **jdegreef**

Julien Langou wrote: Out of curiosity: which LAPACK routines are you using?

Right now, only the dsfrk for this part of the application. For the rest we tend to use some ?gemm routines and others for the more basic matrix handling.

Julien Langou wrote: So, in your case, I understand that you want to do:

C = alpha * A^T * B - alpha * B^T * A + gamma * C

(Is that correct?). So that would makes sense for C being complex antisymmetric (C = -C^T) for me. (Is that correct?)

Yes, that is correct, except that unfortunately the diagonal terms are typically non-zero in our case. So the matrix is neither symmetric nor anti-symmetric, yet we can take advantage of its form by storing only half of it. If there is any mathematical term for that, I am not aware of it

The interface you provided for zsfrk is exactly what I am looking for. It would be great if we could combine the speed together with a halved memory requirement!

Jonas