sgels on "narrow" A matrix

Postby eyew » Mon Aug 01, 2011 8:44 am

Hi guys,

I'm working on a problem where it's necessary to repeatedly (thousands of times) solve a least squares problem A*x = y, where A has dimensions 28,000,000 x 32 and y is a column vector of length 28,000,000. I've been solving this in chunks, where each chunk of A is 500,000 x 32. Each of these chunks takes about 3-4 seconds using the numpy least squares routine.

I wasn't expecting much speedup from the CULA sgels on a matrix of this weird shape, and it indeed performs 4-10x slower than numpy. I was thinking I could write a custom solver in CUDA that solves many "chunks" of A in parallel and then reduces to a final answer, but before embarking on that, I'm wondering if you have any ideas for how I might transform the problem to best take advantage of how CULA accelerates the computation.

Also, the A matrix chunks are around 128MB each, so transferring them back and forth might be a significant contributor to the slowdown. However, the A matrix is static across iterations. If there were a way to avoid overwriting the matrices in place, I'd potentially only need to transfer the column vector to the GPU on each iteration.
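One way to exploit a static A, sketched here in NumPy on the CPU rather than with CULA, is to form the 32 x 32 Gram matrix A^T A once by summing over chunks, factor it once, and then on each iteration only accumulate A^T y and do two small triangular solves. The function names `gram_cholesky` and `solve_with_factor` are hypothetical, and the sketch assumes A has full column rank. Note that the normal equations square the condition number of A, so the accumulation is done in float64 here; for an ill-conditioned A this approach may not be accurate enough.

```python
import numpy as np

def gram_cholesky(chunks):
    """Accumulate A^T A over row chunks of A and return its lower Cholesky factor."""
    n = chunks[0].shape[1]
    gram = np.zeros((n, n), dtype=np.float64)
    for a in chunks:
        a64 = a.astype(np.float64)
        gram += a64.T @ a64          # each chunk contributes an n x n block sum
    return np.linalg.cholesky(gram)  # gram = chol @ chol.T

def solve_with_factor(chol, chunks, y_chunks):
    """Solve min ||A x - y|| for a new y, reusing the factor of A^T A."""
    rhs = np.zeros(chol.shape[0], dtype=np.float64)
    for a, y in zip(chunks, y_chunks):
        rhs += a.astype(np.float64).T @ y.astype(np.float64)  # A^T y, chunk by chunk
    z = np.linalg.solve(chol, rhs)    # forward solve with the lower factor
    return np.linalg.solve(chol.T, z) # back solve with its transpose
```

In this arrangement A is only read, never overwritten, so on a GPU it could stay resident in device memory while only y changes between iterations.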

Any ideas would be greatly appreciated, thanks for your time.
Posts: 1
Joined: Wed Jul 27, 2011 10:31 am

Re: sgels on "narrow" A matrix

Postby kyle » Mon Aug 01, 2011 5:58 pm

You are correct that the skinny matrix will not see any acceleration with CULA's current algorithm. However, we are working on alternate algorithms for tall and skinny matrices using communication-avoiding techniques that will certainly show a speedup.
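For readers unfamiliar with the term, one well-known communication-avoiding scheme for tall, skinny matrices is TSQR: take a QR factorization of each row chunk independently, stack the small R factors, and factor that stack again. The post does not say which technique CULA plans to use, so the following is only an illustrative NumPy sketch of a single-level TSQR least squares solve; the name `tsqr_lstsq` is hypothetical and full column rank is assumed.

```python
import numpy as np

def tsqr_lstsq(chunks, y_chunks):
    """Least squares via one-level TSQR over row chunks of A and y."""
    r_blocks, z_blocks = [], []
    for a, y in zip(chunks, y_chunks):
        q, r = np.linalg.qr(a, mode="reduced")  # local QR, independent per chunk
        r_blocks.append(r)                      # small n x n triangular factor
        z_blocks.append(q.T @ y)                # local projection of y
    # Reduction step: QR of the stacked R factors gives the R factor of all of A.
    q2, r_final = np.linalg.qr(np.vstack(r_blocks), mode="reduced")
    rhs = q2.T @ np.concatenate(z_blocks)       # equals Q^T y for the full A
    return np.linalg.solve(r_final, rhs)        # solve R x = Q^T y
```

The per-chunk factorizations do not depend on each other, which is what makes this shape of algorithm attractive for parallel hardware; only the small R factors and projected vectors need to be combined.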
Posts: 301
Joined: Fri Jun 12, 2009 7:47 pm

Re: sgels on "narrow" A matrix

Postby ikku100 » Wed Sep 11, 2013 2:14 am

Has this been improved already, perhaps?
Posts: 4
Joined: Mon Jun 18, 2012 6:45 am
