## geqrf basic question

### geqrf basic question

Hi, I'm new to CULA, can someone explain to me the parameters on Sgeqrf?

m - # of rows in A
n - # of columns in A
a - pointer to matrix A
lda - leading dimension of A, so lda = m?
tau - scalar factors of elementary reflectors??

So if I call culaSgeqrf(...), where do the output Q and R matrices get stored?

Another question, what's the difference between culaSgeqrf and culaDeviceSgeqrf? If I use culaDeviceSgeqrf, do I have to allocate memory space/copy memory to GPU, like CUDA?

Sorry for the silly questions, the CULA programming guide wasn't very beginner-friendly.

jinyan

Posts: 4
Joined: Mon Jul 19, 2010 9:07 am

### Re: geqrf basic question

CULA's QR decomposition is implemented using Householder reflections. Check out the Wikipedia page for some decent information if you are unfamiliar with the algorithm.

jinyan wrote: m - # of rows in A
n - # of columns in A
a - pointer to matrix A
lda - leading dimension of A, so lda = m?
tau - scalar factors of elementary reflectors??

LDA is typically M, but it doesn't have to be; it only needs to be at least M. TAU holds the scalar factors of the Householder reflectors and is used to construct (or multiply by) the Q matrix after xGEQRF.
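
To make the role of LDA concrete, here is a minimal sketch in plain C of how an element is addressed in the column-major layout that LAPACK-style routines expect. The helper names `colmajor_index` and `sum_matrix` are made up for illustration and are not part of CULA.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (i, j) in a column-major array with leading dimension lda.
   Hypothetical helper, not a CULA function. */
size_t colmajor_index(int i, int j, int lda)
{
    assert(i >= 0 && j >= 0 && lda > 0);
    return (size_t)i + (size_t)j * (size_t)lda;
}

/* Sum the entries of an m x n matrix stored column-major with leading
   dimension lda. Rows m..lda-1 of each column are padding and are skipped. */
float sum_matrix(int m, int n, const float *a, int lda)
{
    assert(lda >= m);
    float s = 0.0f;
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            s += a[colmajor_index(i, j, lda)];
    return s;
}
```

With a 3 x 2 matrix stored in a buffer whose columns are 4 floats apart, you would pass m = 3, n = 2, lda = 4; the fourth slot of each column is simply never touched.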

jinyan wrote: So if I call culaSgeqrf(...), where do the output Q and R matrices get stored?

After xGEQRF, R is stored in the upper triangular portion of A. Q can then be generated using xORGQR with A and TAU as the inputs. Alternatively, Q can be multiplied directly with another matrix using xORMQR with A, TAU, and C as the inputs. This two-step method is typical and will be seen in any package based on the LAPACK interface. When you call QR in MATLAB, for example, it calls both of these functions behind the scenes. Also, it's worth noting that xORGQR and xORMQR are only available in CULA Premium.
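
For small test cases it can help to unpack the result on the host. The sketch below is plain C, not CULA code, and assumes the output follows the standard LAPACK xGEQRF storage convention: R on and above the diagonal, and the Householder vectors below the diagonal with an implicit leading 1. The function names `extract_r` and `form_q` are hypothetical.

```c
#include <assert.h>

/* Copy R out of the factored m x n column-major array a (leading dimension
   lda). r receives an m x n matrix (leading dimension ldr) with zeros below
   the diagonal. */
void extract_r(int m, int n, const float *a, int lda, float *r, int ldr)
{
    assert(lda >= m && ldr >= m);
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            r[i + j * ldr] = (i <= j) ? a[i + j * lda] : 0.0f;
}

/* Form the full m x m matrix Q = H(0) H(1) ... H(k-1), with k = min(m, n) and
   H(p) = I - tau[p] * v * v^T, where v[0..p-1] = 0, v[p] = 1, and
   v[p+1..m-1] is stored below the diagonal in column p of a. */
void form_q(int m, int k, const float *a, int lda, const float *tau,
            float *q, int ldq)
{
    assert(lda >= m && ldq >= m && k <= m);

    /* Start from the identity. */
    for (int j = 0; j < m; ++j)
        for (int i = 0; i < m; ++i)
            q[i + j * ldq] = (i == j) ? 1.0f : 0.0f;

    /* Apply the reflectors from the left, last one first, so the final
       product is H(0) H(1) ... H(k-1). */
    for (int p = k - 1; p >= 0; --p) {
        for (int j = 0; j < m; ++j) {
            /* w = tau[p] * (v^T * q(:, j)), using the implicit v[p] = 1. */
            float w = q[p + j * ldq];
            for (int i = p + 1; i < m; ++i)
                w += a[i + p * lda] * q[i + j * ldq];
            w *= tau[p];

            /* q(:, j) -= w * v */
            q[p + j * ldq] -= w;
            for (int i = p + 1; i < m; ++i)
                q[i + j * ldq] -= w * a[i + p * lda];
        }
    }
}
```

This is only a slow reference for checking results on small matrices; for real work, xORGQR or xORMQR does the same job on the GPU.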

jinyan wrote: Another question, what's the difference between culaSgeqrf and culaDeviceSgeqrf? If I use culaDeviceSgeqrf, do I have to allocate memory space/copy memory to GPU, like CUDA?

culaSgeqrf expects host memory and culaDeviceSgeqrf expects device memory. We always recommend the standard host interface because it's simpler and uses special allocation methods that maximize memory throughput.

jinyan wrote: Sorry for the silly questions, the CULA programming guide wasn't very beginner-friendly.

Not a problem! The guide is aimed more at developers who are familiar with LAPACK notation but new to GPU computing. Let us know if you have any other questions.
kyle

Posts: 301
Joined: Fri Jun 12, 2009 7:47 pm

### Re: geqrf basic question

Thanks for the help, Kyle!

One more question: I'm getting this error while trying to run the executable:

error while loading shared libraries: libcula.so: cannot open shared object file: No such file or directory

I saw your reply on the other post, "make sure to include...in LD_LIBRARY_PATH". I tried looking around on Google and still couldn't figure out how to change LD_LIBRARY_PATH. Can you help? I'm fairly new to Linux too.
jinyan

### Re: geqrf basic question

`export LD_LIBRARY_PATH=/usr/local/cula/lib64:$LD_LIBRARY_PATH`

Change `lib64` to `lib` if you are on a 32-bit system.
kyle

### Re: geqrf basic question

You can add that to your .bashrc if you want to make it permanent.
john

Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

### Re: geqrf basic question

Woohoo! Okay, the program is working, but only for square matrices right now. The code I wrote for the CPU calculation is row-major, so CULA's column-major layout is causing errors with rectangular matrices. I'll fix that later.

Another question: when I try to include cutil.h to use the cut...Timer functions, the compiler says cutil.h cannot be found. How can I tell it to look for the header in the CUDA SDK folder?
jinyan

### Re: geqrf basic question

Check your compiler's documentation for how to add additional include directories to its search path.

In GCC it's:

`-I dir`

If you are using cutil.h, remember you'll have to link against the cutil library as well.
kyle

### Re: geqrf basic question

If you would prefer to avoid cutil, you'll find some simple timing code in the CULA examples/benchmark folder.
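
If all you need is coarse timing with the standard C library, a minimal sketch might look like the following. It is not the code from the CULA examples folder, and `cpu_seconds` and `example_work` are hypothetical names. Keep in mind that, per the C standard, `clock()` reports processor time used by the program rather than wall-clock time, so the two can differ when the program is waiting on something else.

```c
#include <assert.h>
#include <time.h>

/* Call work(arg) once and return the processor time it used, in seconds. */
double cpu_seconds(void (*work)(void *), void *arg)
{
    assert(work != 0);
    clock_t start = clock();
    work(arg);
    clock_t stop = clock();
    return (double)(stop - start) / (double)CLOCKS_PER_SEC;
}

/* Placeholder workload: sums the integers below *arg.
   Replace it with a wrapper around the call you want to time. */
void example_work(void *arg)
{
    long count = *(long *)arg;
    volatile double sum = 0.0;
    for (long i = 0; i < count; ++i)
        sum += (double)i;
}
```

Passing the workload as a function pointer keeps the timing code separate from whatever is being measured, so the same helper can wrap a CPU routine and a GPU call for comparison.
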
john

### Re: geqrf basic question

Thanks again. I ended up using clock(); it seems to be the more precise of the two (clock() vs time()).

Is there a function in CULA that will allow me to efficiently transpose a matrix? I couldn't find one in the manual.
jinyan

### Re: geqrf basic question

I'm afraid there isn't. If your matrix is square or if you are willing to do an out-of-place transpose, the CUDA code is pretty simple and very fast. The NVIDIA GPU SDK has an okay implementation.
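
For reference, here is a host-side sketch of an out-of-place transpose in plain C. It is not the CUDA kernel from the SDK, and the name `transpose_colmajor` is made up for illustration. It also covers the row-major case: a row-major m x n array has the same memory layout as a column-major n x m array holding the transpose, so transposing that view gives the column-major m x n matrix.

```c
#include <assert.h>

/* Out-of-place transpose of a rows x cols column-major matrix `in`
   (leading dimension ldin) into the cols x rows column-major matrix `out`
   (leading dimension ldout). `in` and `out` must not overlap. */
void transpose_colmajor(int rows, int cols, const float *in, int ldin,
                        float *out, int ldout)
{
    assert(ldin >= rows && ldout >= cols);
    for (int j = 0; j < cols; ++j)
        for (int i = 0; i < rows; ++i)
            out[j + i * ldout] = in[i + j * ldin];
}
```

To convert a row-major m x n buffer to column-major, you would call `transpose_colmajor(n, m, row_major, n, col_major, m)`.
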
john