## Plasma sgetrf LU factorization

Open forum for general discussions relating to PLASMA.

### Plasma sgetrf LU factorization

Hi,

Here is a newbie question.

I am trying to understand how Plasma LU factorization exactly works.

For the following input data:
0.1206 0.6438 0.0623 0.4903 0.3061 0.8164 0.9972 0.4246 0.7675 0.8468
0.1681 0.4045 0.3025 0.7730 0.3156 0.8355 0.3103 0.4922 0.0378 0.6989
0.1704 0.4167 0.1199 0.2274 0.3527 0.1086 0.8734 0.9629 0.5332 0.4056
0.8503 0.1604 0.7382 0.8833 0.3093 0.4463 0.0403 0.9273 0.7538 0.5861
0.7008 0.9463 0.4652 0.3890 0.4388 0.3014 0.8448 0.0802 0.6985 0.4762
0.1508 0.5055 0.8133 0.1878 0.3805 0.5890 0.6836 0.0463 0.0702 0.4994
0.2996 0.6929 0.0510 0.6763 0.8922 0.9749 0.0469 0.5502 0.8524 0.7054
0.6977 0.5114 0.4439 0.1681 0.3435 0.1548 0.7888 0.1991 0.8604 0.4674
0.1261 0.0402 0.0741 0.5236 0.8497 0.0292 0.8196 0.2044 0.6902 0.5695
0.5102 0.9229 0.4151 0.9063 0.0368 0.4884 0.8241 0.3960 0.7545 0.9193

the PLASMA_sgetrf_incpiv function returns Y=
0.9972 0.4917 0.4258 0.7696 0.1210 0.0625 0.3069 0.6457 0.8187 0.8492
0.3103 0.6204 0.5804 -0.3240 0.2105 0.4562 0.3552 0.3291 0.9371 0.7018
0.8734 -0.2020 0.7083 -0.2886 0.1514 0.2223 0.2207 -0.1140 -0.5890 -0.2744
0.0403 0.8635 0.4089 1.1207 0.5370 0.2238 -0.0893 -0.0920 -0.1383 0.0518
0.8448 -0.0263 -0.2643 -0.0365 0.6638 0.7402 0.3675 0.5665 -0.7928 -0.4420
0.6836 -0.1483 -0.1587 -0.5498 0.4186 0.6868 0.0807 -0.2808 0.4815 0.2810
0.0469 0.6533 0.1511 1.0716 -0.4419 -0.1963 0.8862 0.8560 0.3449 0.0596
0.7888 -0.2197 -0.0093 0.1793 0.5536 0.0470 -0.0098 -0.2022 -0.7608 -0.8501
0.8196 0.1206 -0.2146 0.0365 0.0145 -0.0033 0.6009 -1.0733 -1.8870 0.6428
0.8241 0.5011 -0.2458 0.2117 0.2286 -0.0270 -0.4028 0.4250 -0.1148 0.3572

and pivot indices
[7 4 8 9 7 8 7 9 9 10]

Which means L=
1.0000 0 0 0 0 0 0 0 0 0
0.3103 1.0000 0 0 0 0 0 0 0 0
0.8734 -0.2020 1.0000 0 0 0 0 0 0 0
0.0403 0.8635 0.4089 1.0000 0 0 0 0 0 0
0.8448 -0.0263 -0.2643 -0.0365 1.0000 0 0 0 0 0
0.6836 -0.1483 -0.1587 -0.5498 0.4186 1.0000 0 0 0 0
0.0469 0.6533 0.1511 1.0716 -0.4419 -0.1963 1.0000 0 0 0
0.7888 -0.2197 -0.0093 0.1793 0.5536 0.0470 -0.0098 1.0000 0 0
0.8196 0.1206 -0.2146 0.0365 0.0145 -0.0033 0.6009 -1.0733 1.0000 0
0.8241 0.5011 -0.2458 0.2117 0.2286 -0.0270 -0.4028 0.4250 -0.1148 1.0000U=

and U=
0.9972 0.4917 0.4258 0.7696 0.1210 0.0625 0.3069 0.6457 0.8187 0.8492
0 0.6204 0.5804 -0.3240 0.2105 0.4562 0.3552 0.3291 0.9371 0.7018
0 0 0.7083 -0.2886 0.1514 0.2223 0.2207 -0.1140 -0.5890 -0.2744
0 0 0 1.1207 0.5370 0.2238 -0.0893 -0.0920 -0.1383 0.0518
0 0 0 0 0.6638 0.7402 0.3675 0.5665 -0.7928 -0.4420
0 0 0 0 0 0.6868 0.0807 -0.2808 0.4815 0.2810
0 0 0 0 0 0 0.8862 0.8560 0.3449 0.0596
0 0 0 0 0 0 0 -0.2022 -0.7608 -0.8501
0 0 0 0 0 0 0 0 -1.8870 0.6428
0 0 0 0 0 0 0 0 0 0.3572

and if I am calculating P correctly it should be:
0 0 0 0 0 0 1 0 0 0
0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 1 0
1 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1

This result does not give PA=LU or A=PLU

The same input in matlab returns Y=
0.8503 0.1604 0.7382 0.8833 0.3093 0.4463 0.0403 0.9273 0.7538 0.5861
0.6000 0.8266 -0.0279 0.3763 -0.1488 0.2206 0.7999 -0.1604 0.3022 0.5676
0.1773 0.5772 0.6985 -0.1860 0.4115 0.3826 0.2148 -0.0256 -0.2379 0.0679
0.8205 0.4594 -0.2133 -0.7692 0.2458 -0.2312 0.4341 -0.4936 0.0523 -0.2598
0.3523 0.7699 -0.2686 -0.0331 1.0163 0.7429 -0.5110 0.3238 0.2920 0.0715
0.1483 0.0198 -0.0499 -0.4886 0.9322 -0.8278 1.4970 -0.4742 0.3138 0.2811
0.1419 0.7514 -0.0307 -0.0995 0.4044 -0.3330 1.1453 0.0748 0.4178 0.3781
0.2004 0.4652 -0.0215 0.1672 0.3224 0.3334 0.0793 0.9815 -0.0042 -0.0777
0.1977 0.4510 0.2421 -0.6158 0.3673 -0.1690 0.5217 -0.1575 -0.4306 -0.0377
0.8242 0.9848 -0.1658 0.9625 0.1594 0.1403 -0.4252 -0.0087 0.5180 -0.1758

which gives L=
1.0000 0 0 0 0 0 0 0 0 0
0.6000 1.0000 0 0 0 0 0 0 0 0
0.1773 0.5772 1.0000 0 0 0 0 0 0 0
0.8205 0.4594 -0.2133 1.0000 0 0 0 0 0 0
0.3523 0.7699 -0.2686 -0.0331 1.0000 0 0 0 0 0
0.1483 0.0198 -0.0499 -0.4886 0.9322 1.0000 0 0 0 0
0.1419 0.7514 -0.0307 -0.0995 0.4044 -0.3330 1.0000 0 0 0
0.2004 0.4652 -0.0215 0.1672 0.3224 0.3334 0.0793 1.0000 0 0
0.1977 0.4510 0.2421 -0.6158 0.3673 -0.1690 0.5217 -0.1575 1.0000 0
0.8242 0.9848 -0.1658 0.9625 0.1594 0.1403 -0.4252 -0.0087 0.5180 1.0000

U=
0.8503 0.1604 0.7382 0.8833 0.3093 0.4463 0.0403 0.9273 0.7538 0.5861
0 0.8266 -0.0279 0.3763 -0.1488 0.2206 0.7999 -0.1604 0.3022 0.5676
0 0 0.6985 -0.1860 0.4115 0.3826 0.2148 -0.0256 -0.2379 0.0679
0 0 0 -0.7692 0.2458 -0.2312 0.4341 -0.4936 0.0523 -0.2598
0 0 0 0 1.0163 0.7429 -0.5110 0.3238 0.2920 0.0715
0 0 0 0 0 -0.8278 1.4970 -0.4742 0.3138 0.2811
0 0 0 0 0 0 1.1453 0.0748 0.4178 0.3781
0 0 0 0 0 0 0 0.9815 -0.0042 -0.0777
0 0 0 0 0 0 0 0 -0.4306 -0.0377
0 0 0 0 0 0 0 0 0 -0.1758

P=
0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0
1 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0

which indeed does return PA=LU

What am I doing wrong here? Can someone please explain me the results. Thanks
russell_anam

Posts: 3
Joined: Mon Apr 02, 2012 10:47 am

### Re: Plasma sgetrf LU factorization

I would recommend that you use the dgetrf routine, rather than the dgetrf_incpiv routine.
It will give you the textbook A=PLU factorization and pivot vector identical to LAPACK.
If you want to understand how exactly the dgetrf_incpiv routine works,
probably the best thing you can do is read appropriate LAPACK working notes, e.g.:
http://www.netlib.org/lapack/lawnspdf/lawn191.pdf
http://www.netlib.org/lapack/lawnspdf/lawn213.pdf
Best,
Jakub

Posts: 84
Joined: Wed May 13, 2009 1:27 pm

### Re: Plasma sgetrf LU factorization

Thank you very much for your reply. I also thought about trying the dgetrf routine but I think they have different parameters than the "dgetrf_incpiv" one as just replacing the function name does not work (the provided examples in plasma 2.4.5 still use dgetrf_incpiv and not dgetrf) and did not compile. The routine documentation (http://icl.cs.utk.edu/projectsfiles/pla ... utine.html) still points to the earlier parameters. Should I assume that all the calling parameters are exactly the same as LAPACK dgetrf? Thanks
russell_anam

Posts: 3
Joined: Mon Apr 02, 2012 10:47 am

### Re: Plasma sgetrf LU factorization

Thanks for pointing out this problem. We have to update the online documentation. But if you want to know what are the parameters of getrf function, they are the same than in Lapack.

For the incpiv version, the pivoting is not done in the same way:
- getrf is using partial pivoting: search the maximum on each column and pivots.
- incpiv is searching for a pivot only inside the diagonal tile first, and then when it applies the update on the next tile of the panel, if it finds a better choice for the pivot, update the U and store the new row for L in the extra parameter L that you have to give to the function. In this case, the pivot is not only an array, but a lower triangular matrix to store all updates done to the U. And then, it will be different from what you are doing with matlab which is partial pivoting.

Mathieu
mateo70

Posts: 108
Joined: Fri May 07, 2010 3:48 pm

### Re: Plasma sgetrf LU factorization

Hi Mathieu,

Thank you very much. Is there anyway I can get L, U and P from sgetrf or to rephrase how can I extract L, U and P from the results returned by sgetrf. Is there any code, function or any working plasma notes that can help me to achieve this? I ask because I want to use sgetrf as a standalone LU decomposition function and not feed the results to a solver. Any help is appreciated.

Thanks

Russell
russell_anam

Posts: 3
Joined: Mon Apr 02, 2012 10:47 am

### Re: Plasma sgetrf LU factorization

Hello,

The function PLASMA_sgetrf has exactly the Sam interface and comportment as the sgetrf function from LAPACK. Since our documentation is unfortunately out of date, you can look at the header in the LAPACK file :
http://www.netlib.org/lapack/single/sgetrf.f

Mathieu
mateo70

Posts: 108
Joined: Fri May 07, 2010 3:48 pm

### Re: Plasma sgetrf LU factorization

Hello,

A better link to what I gave you:
[utl]http://www.netlib.org/lapack/explore-html/d8/ddc/group__real_g_ecomputational.html#gaff540c229377c5a167d4ec4e011234b9[/url]

Mathieu
mateo70

Posts: 108
Joined: Fri May 07, 2010 3:48 pm