geqrf basic question
10 posts
• Page 1 of 1
geqrf basic question
Hi, I'm new to CULA, can someone explain to me the parameters on Sgeqrf?
m - # of rows in A
n - # of columns in A
a - pointer to matrix A
lda - leading dimension of A, so lda = m?
tau - scalar factors of elementary reflectors??
So if I call culaSgeqrf(...), where do the output Q and R matrices get stored?
Another question, what's the difference between culaSgeqrf and culaDeviceSgeqrf? If I use culaDeviceSgeqrf, do I have to allocate memory space/copy memory to GPU, like CUDA?
Sorry for the silly questions, the CULA programming guide wasn't very beginner-friendly.
Thanks in advance!
- jinyan
- Posts: 4
- Joined: Mon Jul 19, 2010 9:07 am
Re: geqrf basic question
CULA's QR decomposition is implemented using Householder reflections. Check out the Wikipedia page for some decent information if you are unfamiliar with the algorithm.
jinyan wrote:m - # of rows in A
n - # of columns in A
a - pointer to matrix A
lda - leading dimension of A, so lda = m?
tau - scalar factors of elementary reflectors??
LDA is typically M, but doesn't have to be. TAU is used to construct (or multiply) the Q matrix after xGEQRF.
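To illustrate why LDA need not equal M, here is a minimal sketch of column-major indexing as used by LAPACK-style interfaces. The helper names `colmajor_index` and `colmajor_get` are hypothetical and are not part of CULA; the only assumption is the usual convention that element (i, j) lives at offset i + j*lda.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (i, j) in a column-major array whose columns are
   lda elements apart. Hypothetical helper, not a CULA function. */
static size_t colmajor_index(int i, int j, int lda)
{
    assert(i >= 0 && j >= 0 && lda > i);
    return (size_t)i + (size_t)j * (size_t)lda;
}

/* Read A(i, j) from an m-by-n column-major matrix stored with leading
   dimension lda. lda may exceed m, e.g. when A is a sub-block of a
   larger array or when columns are padded. */
static float colmajor_get(const float *a, int m, int lda, int i, int j)
{
    assert(lda >= m && i < m);
    return a[colmajor_index(i, j, lda)];
}
```

With lda = m the columns are packed back to back; with lda > m the extra entries at the bottom of each column are simply skipped.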
jinyan wrote:So if I call culaSgeqrf(...), where do the output Q and R matrices get stored?
After xGEQRF, R is stored in the upper triangular portion of A. Q can then be generated using xORGQR with A and TAU as the inputs. Alternatively, Q can be multiplied directly with another matrix using xORMQR with A, TAU, and C as the inputs. This two-step method is typical and will be seen in any package based on the LAPACK interface. When you call QR in MATLAB, for example, it calls both of these functions behind the scenes. Also, it's worth noting that xORGQR and xORMQR are only available in CULA Premium.
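As a rough sketch of the first step, the helper below copies R out of a matrix that has already been factored in place. The name `extract_r` is hypothetical and not part of CULA. It assumes column-major storage and that the factorization was done beforehand with a call along the lines of culaSgeqrf(m, n, a, lda, tau), using the parameter order listed in the question. The entries below the diagonal hold reflector data, so they are zeroed in the copy.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the upper triangle (or trapezoid) of a factored m-by-n
   column-major matrix `a` into `r`, which has min(m, n) rows and n
   columns with leading dimension ldr. Hypothetical helper. */
static void extract_r(int m, int n, const float *a, int lda,
                      float *r, int ldr)
{
    int rows = (m < n) ? m : n;   /* R has min(m, n) rows */
    assert(lda >= m && ldr >= rows);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < rows; ++i) {
            /* On or above the diagonal: an entry of R.
               Below the diagonal: reflector data, so store zero. */
            r[i + (size_t)j * ldr] =
                (i <= j) ? a[i + (size_t)j * lda] : 0.0f;
        }
    }
}
```

Copy R out before calling xORGQR if you need both factors, since generating Q overwrites A.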
jinyan wrote:Another question, what's the difference between culaSgeqrf and culaDeviceSgeqrf? If I use culaDeviceSgeqrf, do I have to allocate memory space/copy memory to GPU, like CUDA?
culaSgeqrf expects host memory and culaDeviceSgeqrf expects device memory. We always recommend the standard host interface because it's simpler and uses special allocation methods that maximize memory throughput.
jinyan wrote:Sorry for the silly questions, the CULA programming guide wasn't very beginner-friendly.
Not a problem! The guide is aimed more at developers who are familiar with LAPACK notation but new to GPU computing. Let us know if you have any other questions.
- kyle
- Administrator
- Posts: 301
- Joined: Fri Jun 12, 2009 7:47 pm
Re: geqrf basic question
Thanks for the help, Kyle!
One more question, I'm getting this error while trying to run the executable
error while loading shared libraries: libcula.so: cannot open shared object file: No such file or directory
I saw your reply on the other post "make sure to include...in LD_LIBRARY_PATH". I tried looking around on Google and still couldn't figure out how to change LD_LIBRARY_PATH. Can you help? I'm fairly new to Linux too.
- jinyan
- Posts: 4
- Joined: Mon Jul 19, 2010 9:07 am
Re: geqrf basic question
To add CULA to your library path, try the following command.
export LD_LIBRARY_PATH=/usr/local/cula/lib64:$LD_LIBRARY_PATH
Change to 'lib' if you are on a 32-bit system.
- kyle
- Administrator
- Posts: 301
- Joined: Fri Jun 12, 2009 7:47 pm
Re: geqrf basic question
You can add that to your .bashrc if you want to make it permanent.
- john
- Administrator
- Posts: 587
- Joined: Thu Jul 23, 2009 2:31 pm
Re: geqrf basic question
woohoo! okay program is working, well, only for square matrices right now. the code I wrote for CPU calculation is row-major, so the column-major implementation in CULA is causing some errors in rectangular matrices. I'll fix that later.
Another question, when I try to include cutil.h to use the cut...Timer functions, it says cutil.h cannot be found. How can I tell the program to look for the header in the CUDA SDK folder?
- jinyan
- Posts: 4
- Joined: Mon Jul 19, 2010 9:07 am
Re: geqrf basic question
Check your compiler for documentation on how to search for additional include directories.
In GCC it's:
-I dir
If you are using cutil.h, remember you'll have to link against the cutil library as well.
- kyle
- Administrator
- Posts: 301
- Joined: Fri Jun 12, 2009 7:47 pm
Re: geqrf basic question
If you would prefer to avoid cutil, you'll find some simple timing code in the CULA examples/benchmark folder.
- john
- Administrator
- Posts: 587
- Joined: Thu Jul 23, 2009 2:31 pm
Re: geqrf basic question
Thanks again. I ended up using clock(), it seems to be the more precise out of the two (clock() vs time()).
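For reference, a minimal sketch of timing with clock() might look like the following. The helper name `elapsed_seconds` is hypothetical. Note that in standard C, clock() reports processor time used by the program, which can differ from wall-clock time, so treat the numbers as an approximation.

```c
#include <assert.h>
#include <time.h>

/* Convert two clock() readings into elapsed seconds.
   Hypothetical helper, not taken from CULA or cutil. */
static double elapsed_seconds(clock_t start, clock_t end)
{
    assert(end >= start);
    return (double)(end - start) / (double)CLOCKS_PER_SEC;
}
```

Typical use is to read clock() immediately before and after the call being measured and pass both readings to the helper.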
Is there a function in CULA that will allow me to efficiently transpose a matrix? I couldn't find one in the manual.
- jinyan
- Posts: 4
- Joined: Mon Jul 19, 2010 9:07 am
Re: geqrf basic question
I'm afraid there isn't. If your matrix is square or if you are willing to do an out-of-place transpose, the CUDA code is pretty simple and very fast. The Nvidia GPU SDK has an okay implementation.
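For small problems or for checking results, an out-of-place transpose can also be done on the host. The sketch below is plain C, not the CUDA kernel from the SDK, and the name `rowmajor_to_colmajor` is hypothetical. It converts an m-by-n row-major matrix into the column-major layout that CULA expects, which is the same data movement as a transpose.

```c
#include <assert.h>
#include <stddef.h>

/* Out-of-place conversion of an m-by-n row-major matrix `src` into a
   column-major matrix `dst` with leading dimension ldd (ldd >= m).
   `src` and `dst` must not overlap. Hypothetical helper. */
static void rowmajor_to_colmajor(int m, int n, const float *src,
                                 float *dst, int ldd)
{
    assert(ldd >= m && src != dst);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            /* Row-major element (i, j) is at i*n + j;
               column-major element (i, j) is at i + j*ldd. */
            dst[i + (size_t)j * ldd] = src[(size_t)i * n + j];
        }
    }
}
```

This simple loop makes no attempt at cache blocking, so it is meant as a correctness reference rather than a fast implementation.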
- john
- Administrator
- Posts: 587
- Joined: Thu Jul 23, 2009 2:31 pm