## Sequential SVD computation for Big Data using MAGMA

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)
Boxed Cylon
Posts: 36
Joined: Sat Nov 21, 2009 6:03 pm

### Sequential SVD computation for Big Data using MAGMA

I've been away from this forum for a while! I have a new GTX 1080 Ti and am thinking of putting it through its paces with a large SVD computation. So large, in fact, that it will likely need a sequential (out-of-memory) strategy, if that is possible. I noted the recent paper "Out of Memory SVD Solver for Big Data" (Sept. 2017) on the icl.cs.utk.edu website.

My question: what is the present state of a practical implementation on a device like mine? The platform is an i7 6850K (6 cores) with 32 GB RAM. The matrices whose eigenvectors/eigenvalues I have in mind to compute are 10M x 10M, say (speaking ambitiously). They are covariance matrices, hence real, square, and symmetric, and they are stored in single precision. I really only need a small subset of the eigenvectors (1000?): those with the largest eigenvalues. (I would probably start with smaller, more modest matrices.)
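To put the sizes mentioned above in perspective, here is a rough back-of-the-envelope sketch in Python. The helper name `array_bytes` is made up for this illustration, and decimal units (1 GB = 10^9 bytes) are assumed. It only counts dense storage and says nothing about workspace or run time.

```python
def array_bytes(rows, cols, bytes_per_element=4):
    """Bytes needed for a dense rows x cols array (4 bytes = single precision)."""
    return rows * cols * bytes_per_element

n = 10_000_000           # matrix dimension from the post ("10M")
k = 1_000                # number of eigenvectors wanted
host_ram = 32 * 10**9    # 32 GB of host RAM, decimal units assumed

full_matrix = array_bytes(n, n)            # the whole n x n matrix
packed_half = array_bytes(n, n + 1) // 2   # one triangle plus diagonal: n(n+1)/2 elements
eigvecs = array_bytes(n, k)                # the k requested eigenvectors

print(f"full matrix      : {full_matrix / 1e12:.0f} TB")
print(f"symmetric, packed: {packed_half / 1e12:.0f} TB")
print(f"{k} eigenvectors : {eigvecs / 1e9:.0f} GB "
      f"({eigvecs / host_ram:.2f} x host RAM)")
```

Under these assumptions the full matrix is about 400 TB, and even the 1000 requested eigenvectors (about 40 GB) exceed the 32 GB of RAM, so both the input and the output would have to live on disk.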

Any comments/suggestions on how to get started in computing such a problem? Perhaps not yet possible?

Thx,
B-C

haidar
Posts: 22
Joined: Fri Sep 19, 2014 3:43 pm

### Re: Sequential SVD computation for Big Data using MAGMA

Dear B-C,
The paper you refer to describes an experimental code for multicore CPUs; it does not use the GPU. Since your matrix is square, you might look at this paper. The paper includes a formula for the expected time of a calculation, so you can first plug in your numbers and see how many hours/days an out-of-memory SVD will cost you. If the number is reasonable, then we can work together to let you use the experimental code.

Azzam

Boxed Cylon
Posts: 36
Joined: Sat Nov 21, 2009 6:03 pm

### Re: Sequential SVD computation for Big Data using MAGMA

Thanks for the reply and offer - I've been thinking about it.

I had reached the same conclusion. That is, computing an SVD in parallel is still an active area of research, computing it sequentially is even more of a research topic, and implementing either on a GPU is further out still.

I've fussed with my own situation, and I find my new 1080 Ti can compute the SVD of a 25000 x 25000 matrix in a few minutes. That's about the maximum size (perhaps a little bigger is possible, but 30000 x 30000 crashes). A 25000 x 25000 single-precision matrix takes 2.5 GB of memory. My particular problem involves ocean modeling, and I can perhaps just barely get by with state vectors of size 25000, but that means greatly reducing the spatial resolution I can use.
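A possible reading of the size limit, sketched as arithmetic: a full SVD returns U and V in addition to the input, so at least three n x n arrays may need to sit in GPU memory at once. The count of three arrays is an assumption here, as is the nominal 11 GB for the 1080 Ti in decimal units; the actual workspace used by the MATLAB routine is not known from this thread, and the function name `svd_resident_bytes` is hypothetical.

```python
def svd_resident_bytes(n, bytes_per_element=4, n_matrices=3):
    """Bytes for n_matrices dense n x n single-precision arrays.

    n_matrices=3 (input, U, V) is an assumption; real solvers also need workspace.
    """
    return n_matrices * n * n * bytes_per_element

gpu_memory = 11 * 10**9   # nominal 1080 Ti memory, decimal units assumed

for n in (25_000, 30_000):
    one = svd_resident_bytes(n, n_matrices=1)
    three = svd_resident_bytes(n)
    print(f"n={n}: one matrix {one / 1e9:.1f} GB, "
          f"three matrices {three / 1e9:.1f} GB, "
          f"{three / gpu_memory:.0%} of nominal GPU memory")
```

If that assumption holds, three matrices at n = 30000 already come to 10.8 GB, which leaves almost nothing for workspace and might explain the crash. This is only a guess from the arithmetic.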

Your offer is interesting, but I see it is serious (cf. http://www.netlib.org/utk/people/JackDo ... papers.htm ; I took a look at the paper you mentioned), and I doubt I could keep up with you, even if I had the time to devote to the problem. My knowledge of coding in C amounts to a primitive "monkey see, monkey do". The SVD above was computed using an older MATLAB GPU SVD routine. What I want to do next is write a mex file to use the more recent MAGMA SVD, which I am sure I can do. Technically, I would only have to compute the SVD once, assuming I get the right answer (there is only one ocean), so if a larger state vector took longer to compute (days, a week), it would not bother me. From the numbers in the paper, an SVD of a 1M x 1M matrix would actually seem manageable in computation time, with patience.

Perhaps I can let the problem simmer on the back burner and make progress here and there. I gather that computing a sequential SVD means implementing an algorithm built from a sequence of data transfers and tiled matrix manipulations. I've done simpler sets of computations using MATLAB mex files and MAGMA (pre MATLAB parallel toolbox). So perhaps I could cobble together a sequential SVD that way, with your guidance. It occurs to me that it might even be possible to compute a sequential SVD using just what is available in MATLAB (?), if one were clever. That would make it easy.
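As an illustration of what "a sequence of data transfers and tiled matrix manipulations" can look like when only the leading eigenvectors of a symmetric matrix are wanted, here is a minimal sketch using block subspace (orthogonal) iteration with NumPy. This is not the algorithm from the paper discussed above and it does not use MAGMA. The function names, the `get_row_tile` callback, and the parameters `tile_rows`, `n_iter`, and `oversample` are all made up for this example. The matrix is touched only through row tiles, which could come from disk (for instance via `numpy.memmap`) instead of the in-memory array used in the demo.

```python
import numpy as np

def tiled_matmul(get_row_tile, n, tile_rows, Q):
    """Compute A @ Q while holding only one row tile of A in memory at a time."""
    Y = np.empty((n, Q.shape[1]), dtype=Q.dtype)
    for start in range(0, n, tile_rows):
        stop = min(start + tile_rows, n)
        Y[start:stop] = get_row_tile(start, stop) @ Q   # one "data transfer" per tile
    return Y

def top_k_eig_tiled(get_row_tile, n, k, tile_rows, n_iter=50, oversample=10, seed=0):
    """Approximate the k largest eigenpairs of a symmetric PSD matrix A.

    get_row_tile(start, stop) must return rows start..stop-1 of A.
    """
    rng = np.random.default_rng(seed)
    p = min(n, k + oversample)                       # a few extra columns speed convergence
    Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
    for _ in range(n_iter):
        # Multiply by A tile by tile, then re-orthonormalize the block.
        Q, _ = np.linalg.qr(tiled_matmul(get_row_tile, n, tile_rows, Q))
    # Rayleigh-Ritz step: solve the small p x p eigenproblem in the subspace.
    B = Q.T @ tiled_matmul(get_row_tile, n, tile_rows, Q)
    w, V = np.linalg.eigh((B + B.T) / 2)
    order = np.argsort(w)[::-1][:k]                  # largest eigenvalues first
    return w[order], Q @ V[:, order]

# Small demo: the in-memory array stands in for a matrix stored on disk.
rng = np.random.default_rng(42)
X = rng.standard_normal((500, 120))
C = (X.T @ X) / (X.shape[0] - 1)                     # 120 x 120 covariance-like matrix
vals, vecs = top_k_eig_tiled(lambda a, b: C[a:b], n=120, k=3, tile_rows=32)
print(vals)
```

Each iteration reads the whole matrix once, so the cost is dominated by data movement, and convergence depends on how quickly the eigenvalues decay. For a real problem the iteration count and oversampling would need tuning, and the inner product `tile @ Q` is the part that could plausibly be offloaded to a GPU.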

Anyways, you see the issue - it would be nice to be able to compute the large-sized SVDs, but I am reluctant (and likely unable) to get too involved on the research side of things. It bears thinking about, however. There is an obvious general need for the ability to compute SVDs of extra-large matrices.