Least Squares with CULA

Postby ambushed » Wed Oct 19, 2011 7:05 am

Hi All,

I am looking for a least squares solver for our CUDA application, and CULA looks quite promising. However, when I try to use the GELS functionality, I get output that is not sensible. I would like to know what I am doing wrong:

Let's say I have input data consisting of 6 pairs of X and Y values that roughly follow a 2nd-degree polynomial. I form an "A" matrix from the X values in the following way:
Code:
1  X1  X1*X1
1  X2  X2*X2
1  X3  X3*X3
1  X4  X4*X4
1  X5  X5*X5
1  X6  X6*X6


Matrix is stored in column-major order.
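
For readers less used to column-major storage, here is a minimal sketch of how such a matrix could be filled for arbitrary X values. The function name `build_design_matrix` and its parameters are made up for illustration and are not part of CULA; the only assumption is the usual LAPACK convention that element (i, j) lives at `A[i + j*lda]`.

```c
#include <stddef.h>

/* Hypothetical helper (not a CULA function).
   Fills a column-major m-by-(degree+1) design matrix for a polynomial fit.
   Column j holds x[i]^j; element (i, j) is stored at A[i + j*lda]. */
void build_design_matrix(int m, int degree, const float *x, float *A, int lda)
{
    for (int i = 0; i < m; ++i) {
        float power = 1.0f;                 /* x[i]^0 */
        for (int j = 0; j <= degree; ++j) {
            A[i + (size_t)j * lda] = power; /* row i, column j */
            power *= x[i];                  /* next power of x[i] */
        }
    }
}
```

With m = 6, degree = 2 and lda = 6 this produces the 6-by-3 layout described above, one full column after another in memory.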

Dependent Variables form vector B with 6 components.

I pass the following arguments to the CULA function
Code:
status = culaSgels('N', M, N, NRHS, A, LDA, B, LDB);

where M = 6 (number of rows), N = 3 (number of columns), NRHS = 1 (number of right-hand sides), A is the matrix above, the leading dimension of A is 6, and the leading dimension of B is 6.

If I read the documentation correctly, the results should be returned in the memory originally allocated for B. The status is culaNoError, but the results are rubbish.

Where is my mistake?

Thanks!

Vlad

Operating system: Windows Server 2008 R2
CUDA version installed: 4.0
GPU model: Tesla C2070
ambushed
 
Posts: 9
Joined: Thu Oct 13, 2011 5:37 am

Re: Least Squares with CULA

Postby john » Wed Oct 19, 2011 9:23 am

Hi, can you post a complete program that loads and runs this data in the exact way you are using it in your program?
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: Least Squares with CULA

Postby ambushed » Wed Oct 19, 2011 12:56 pm

Thanks for your reply!

Code:
    const int VECTOR_SIZE = 6;  /* number of rows (observations) */
    const int NUM_BETAS = 3;    /* number of columns (coefficients) */

    float* h_A = (float*)malloc(sizeof(float) * VECTOR_SIZE * NUM_BETAS);
    float* h_Y = (float*)malloc(sizeof(float) * VECTOR_SIZE);

    /* Fill A in column-major order: column 0 = 1, column 1 = x, column 2 = x*x */
    for (int i = 1; i <= VECTOR_SIZE; i++)
    {
        int idx = i - 1;
        h_A[idx] = 1;
        h_A[idx + VECTOR_SIZE] = i;
        h_A[idx + 2 * VECTOR_SIZE] = i * i;
        h_Y[idx] = 2 * i * i - 4 * i + 2;  /* exact quadratic, no noise */
    }

    int NRHS = 1;
    int LDA = VECTOR_SIZE;
    int LDB = LDA;

    culaStatus status = culaSgels('N', VECTOR_SIZE, NUM_BETAS, NRHS, h_A, LDA, h_Y, LDB);
    checkStatus(status);


I expect to get back exactly the values that I have provided in the input h_Y since there is no error term in the data.

Vlad

Re: Least Squares with CULA

Postby john » Thu Oct 20, 2011 5:53 am

The returned values should differ from the input, if only because the solution vector X has just 3 entries, whereas the B vector has 6.

I get the following:

Code:
A =

     1     1     1
     1     2     4
     1     3     9
     1     4    16
     1     5    25
     1     6    36

B= (input)

     0
     2
     8
    18
    32
    50

X= (output)

     2
    -4
     2
  (then 3 unused entries)

Re: Least Squares with CULA

Postby ambushed » Thu Oct 20, 2011 11:30 am

John

My bad, the output is indeed the polynomial coefficients that one would expect from least squares. Somehow I thought the solution vector would be the vector of fitted Y values.
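
In case anyone does want the fitted Y values, they can be recovered by multiplying the original A by the returned coefficients. The sketch below is plain host-side C and not a CULA call; `fitted_values` is a made-up name. One caveat, stated as an assumption: LAPACK's GELS overwrites A with its factorization, and CULA presumably behaves the same way, so the multiplication should use a copy of A made before the solve.

```c
#include <stddef.h>

/* Hypothetical helper (not a CULA function).
   Computes y = A * x for a column-major m-by-n matrix A with leading
   dimension lda. x holds the n coefficients returned by the solver. */
void fitted_values(int m, int n, const float *A, int lda,
                   const float *x, float *y)
{
    for (int row = 0; row < m; ++row) {
        float sum = 0.0f;
        for (int col = 0; col < n; ++col)
            sum += A[row + (size_t)col * lda] * x[col];
        y[row] = sum;
    }
}
```

For the 6-by-3 example in this thread, the coefficients (2, -4, 2) reproduce the input values 0, 2, 8, 18, 32, 50.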

I hope you don't mind me squeezing in another question. I am now running least squares with the device function and it works great. There is one issue still remaining: when I make a run with 2000 rows, it gives me culaRuntimeError "Invalid configuration argument". With 1000 rows it works just fine. What could be the problem?

Thanks so much for your help!

Vlad

Re: Least Squares with CULA

Postby john » Thu Oct 20, 2011 12:53 pm

Yeah, just keep in mind that CULA is a linear algebra library, so curve fitting isn't something we cover.

I don't quite understand your other question, but if you were to post a code example again, then it would probably be easy enough to spot.

Re: Least Squares with CULA

Postby ambushed » Thu Oct 20, 2011 1:02 pm

For the second question, the source code is essentially the same; the only difference is the number of rows in the A matrix. The original example has 6 rows, while the realistic problem in my domain has around 2000 rows (and this is where I get the error), i.e.
Code:
const int VECTOR_SIZE = 2000;


Least squares is, imho, very much a linear algebra problem. It boils down to matrix multiplication, inversion and transposition.
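
To make the "multiplication, inversion and transposition" view concrete, here is a small sketch that solves the normal equations (A^T A) x = A^T b on the host. This is the textbook formulation only, not what the library does: LAPACK's GELS uses a QR (or LQ) factorization, which is better conditioned than forming A^T A, and CULA's GELS is presumably the same. The function name and the MAX_N limit are made up for this example.

```c
#include <math.h>

#define MAX_N 8  /* illustrative upper bound on the number of unknowns */

/* Hypothetical helper (not a CULA function).
   Minimizes ||A x - b|| by solving (A^T A) x = A^T b.
   A is column-major, m-by-n, with leading dimension lda.
   Returns 0 on success, -1 if n is out of range or A^T A is singular. */
int solve_normal_equations(int m, int n, const double *A, int lda,
                           const double *b, double *x)
{
    double G[MAX_N][MAX_N + 1];  /* augmented system [A^T A | A^T b] */
    if (n < 1 || n > MAX_N) return -1;

    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double s = 0.0;
            for (int k = 0; k < m; ++k)
                s += A[k + i * lda] * A[k + j * lda];
            G[i][j] = s;
        }
        double s = 0.0;
        for (int k = 0; k < m; ++k)
            s += A[k + i * lda] * b[k];
        G[i][n] = s;
    }

    /* Gaussian elimination with partial pivoting */
    for (int c = 0; c < n; ++c) {
        int p = c;
        for (int r = c + 1; r < n; ++r)
            if (fabs(G[r][c]) > fabs(G[p][c])) p = r;
        if (fabs(G[p][c]) < 1e-12) return -1;
        if (p != c)
            for (int j = c; j <= n; ++j) {
                double t = G[c][j]; G[c][j] = G[p][j]; G[p][j] = t;
            }
        for (int r = c + 1; r < n; ++r) {
            double f = G[r][c] / G[c][c];
            for (int j = c; j <= n; ++j)
                G[r][j] -= f * G[c][j];
        }
    }

    /* Back substitution */
    for (int i = n - 1; i >= 0; --i) {
        double s = G[i][n];
        for (int j = i + 1; j < n; ++j)
            s -= G[i][j] * x[j];
        x[i] = s / G[i][i];
    }
    return 0;
}
```

On the 6-row quadratic example from earlier in the thread, this gives coefficients close to (2, -4, 2), up to rounding.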

Re: Least Squares with CULA

Postby john » Thu Oct 20, 2011 2:29 pm

I receive culaNoError when I increase VECTOR_SIZE to 2000. You will need to specify further details, such as GPU type.

Re: Least Squares with CULA

Postby ambushed » Thu Oct 20, 2011 2:46 pm

Operating system: Windows Server 2008 R2
CUDA version installed: 4.0
GPU model: Tesla C2070

Re: Least Squares with CULA

Postby john » Fri Oct 21, 2011 12:25 pm

We've done fairly exhaustive tests using the example code you sent (with VECTOR_SIZE=2000) and haven't turned up any problems. Just to double-check: when you run the example code alone in an executable, do you receive this error?

Re: Least Squares with CULA

Postby ambushed » Fri Oct 21, 2011 1:31 pm

I have two Visual Studio projects, i.e. I run the example code from a Boost unit test by calling a function in a statically linked library. Both are 64-bit. The same problem appears when I increase the column count from 3 to 4.

Re: Least Squares with CULA

Postby ambushed » Mon Oct 24, 2011 4:47 am

Thank you very much for your help. The issue is resolved and was rooted in human error. Both culaDeviceSgels and culaSgels work as expected with 2000 matrix rows and more.

Re: Least Squares with CULA

Postby john » Mon Oct 24, 2011 5:24 am

Glad to hear it! In case other users have the same problem, would you mind explaining what went wrong?

Thanks!

Re: Least Squares with CULA

Postby ambushed » Mon Oct 24, 2011 10:06 am

The device function didn't work because my kernel didn't initialize device memory correctly: I was, very carelessly, trying to create more than 1024 threads per block. Every row was initialized in a separate thread, which explains why it worked for 1000 rows and failed for 2000. As for the host interface, I must have been drunk; it should have worked, and it works fine.
We are working in a shared environment where several people are contending for the GPUs, so it is easy to get confused. Thanks for the help!
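
For other users who hit "Invalid configuration argument": a common way to avoid it is to fix the block size and compute how many blocks are needed to cover all rows. The sketch below shows only that host-side arithmetic in plain C. `blocks_for` is a made-up name, and the 1024 limit is the per-block thread limit of Fermi-class cards such as the Tesla C2070; other GPUs may differ.

```c
/* Per-block thread limit on Fermi-class GPUs (assumption for this sketch). */
#define MAX_THREADS_PER_BLOCK 1024

/* Hypothetical helper: number of blocks needed so that
   blocks * threads_per_block >= n (ceiling division).
   Returns 0 if the inputs would give an invalid launch configuration. */
int blocks_for(int n, int threads_per_block)
{
    if (n <= 0 || threads_per_block <= 0 ||
        threads_per_block > MAX_THREADS_PER_BLOCK)
        return 0;
    return (n + threads_per_block - 1) / threads_per_block;
}
```

With 2000 rows and 256 threads per block this gives 8 blocks (2048 threads in total), so the kernel also needs a bounds check along the lines of `if (row < n)` on the global index `blockIdx.x * blockDim.x + threadIdx.x`, so that the extra threads do not write past the end of the array.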

