## CULA versus MKL issues

### CULA versus MKL issues

I have written code that solves a system of linear equations both on the CPU (using MKL 10.3.7 sgesv) and on the GPU (using CULA Dense R13 culaSgesv). I then calculate the error between the coefficient matrix (LHS matrix) filled on the CPU and the one filled on the GPU; in a perfect world this error should be zero. I also determine the RMS error of the solution from the CPU and from the GPU.

The RMS errors for the CPU and GPU are approximately the same up to 100 unknowns. Beyond 100 unknowns the errors differ from each other significantly, and instead of getting smaller, the error jumps all over the place. I have attached my code. I am using Visual Studio 2008 x64 and CUDA 4.0 with a GeForce GT 525M.

This is the result I am getting, which is also attached. I am not sure what is causing the error, since I have similar code in MATLAB (minus the GPU code) in which the error reduces to about 2.0E-003 and levels off as I increase "Numnodes". The error calculated by the C program is also similar to that of the MATLAB code up to 100 unknowns. I am not sure what is happening beyond 100 unknowns. Could it be a configuration problem (MKL and CULA running at the same time)? Any help in the right direction will be greatly appreciated. If someone wants to see my MATLAB code, which works, I will be glad to post it too.


| Numnodes | Alpha | CPU Error | GPU Error |
|---|---|---|---|
| 16 | 1.170000 | 7.101676E-002 | 7.101548E-002 |
| 36 | 3.250000 | 8.908086E-003 | 8.904453E-003 |
| 64 | 6.369999 | 2.478148E-003 | 2.479677E-003 |
| 100 | 10.530000 | 1.537363E-003 | 1.497382E-003 |
| 144 | 15.729999 | 2.007023E-003 | 6.758709E-003 |
| 196 | 21.969997 | 5.164282E-003 | 2.743981E-003 |
| 256 | 29.249996 | 6.399789E-003 | 3.003306E-003 |
| 324 | 37.570000 | 6.843160E-002 | 5.168200E-003 |
| 400 | 46.929996 | 8.977773E-003 | 1.528272E-002 |
| 484 | 57.329998 | 4.600475E-002 | 5.412453E-003 |
| 576 | 68.769997 | 2.689121E-002 | 9.519931E-003 |
| 676 | 81.250000 | 8.117091E-002 | 1.732562E-002 |
| 784 | 94.769997 | 1.850877E-001 | 4.871070E-002 |
| 900 | 109.330002 | 6.828754E-002 | 4.783689E-002 |
| 1024 | 124.930008 | 2.539281E-002 | 8.547682E-002 |
| 1156 | 141.569992 | 3.487872E-001 | 1.783528E-001 |
| 1296 | 159.250000 | 2.891807E-001 | 8.317184E-002 |
| 1444 | 177.969986 | 1.070141E-001 | 8.043955E-002 |
| 1600 | 197.729996 | 3.038914E-001 | 2.393741E-001 |

- fienefie
**Posts:** 6 **Joined:** Fri Jul 08, 2011 10:34 am

### Re: CULA versus MKL issues

How are you testing for error? If you are doing ||GPU-CPU|| you'll never get a "perfect" answer; this is the nature of floating point math.

If you want to properly test for correctness, you should examine ||b-Ax|| on both the CPU and GPU. Also, if precision is of the utmost importance, double precision routines like 'dgesv' will get you better results!

Hope this helps.
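The residual check suggested above can be sketched as follows. This is a minimal illustration, not code from the attached project: the function name `relative_residual` and the column-major storage are assumptions chosen to match the LAPACK convention. Because LAPACK-style `sgesv` overwrites the matrix with its LU factors and the right-hand side with the solution, the sketch assumes you kept copies of the original A and b before calling the solver.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: relative residual ||b - A*x||_2 / ||b||_2.
// A is the ORIGINAL n-by-n matrix in column-major order (saved before the
// solver overwrote it), x is the computed solution, b the original RHS.
// The products are accumulated in double so that the check itself adds
// as little rounding error as possible to single-precision inputs.
double relative_residual(const std::vector<float>& A,
                         const std::vector<float>& x,
                         const std::vector<float>& b)
{
    const std::size_t n = b.size();
    assert(x.size() == n && A.size() == n * n);

    double res2 = 0.0;  // squared norm of the residual
    double b2 = 0.0;    // squared norm of b
    for (std::size_t i = 0; i < n; ++i) {
        double ri = b[i];
        for (std::size_t j = 0; j < n; ++j)
            ri -= double(A[i + j * n]) * double(x[j]);  // column-major access
        res2 += ri * ri;
        b2 += double(b[i]) * double(b[i]);
    }
    // Fall back to the absolute residual if b is the zero vector.
    return b2 > 0.0 ? std::sqrt(res2 / b2) : std::sqrt(res2);
}
```

Calling this once with the MKL solution and once with the CULA solution, against the same saved A and b, gives two numbers that can be compared directly without reference to an analytic solution.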


- kyle
- Administrator
**Posts:** 301 **Joined:** Fri Jun 12, 2009 7:47 pm

### Re: CULA versus MKL issues

kyle wrote: How are you testing for error? If you are doing ||GPU-CPU|| you'll never get a "perfect" answer; this is the nature of floating point math.

If you want to properly test for correctness, you should examine ||b-Ax|| on both the CPU and GPU. Also, if precision is of the utmost importance, double precision routines like 'dgesv' will get you better results!

Hope this helps.

Thanks. For the error I am using this pseudocode:

for loop

cumerror += fabs(num-exact)*fabs(num-exact);

end

cumerror = sqrt(cumerror/numnode);

You can see the code in laplace_2D.cpp.

I am also going to test with double precision. But I do not understand why the errors beyond 100 unknowns seem to be random rather than approximately the same. I am not sure what is happening when I increase the unknowns past 100.
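For reference, the RMS pseudocode above could be written as a small self-contained function like the one below. This is only a sketch: the name `rms_error` and the use of `std::vector<float>` are assumptions, not taken from laplace_2D.cpp. The one deliberate choice is to accumulate the sum of squares in `double` even when the data are `float`, so the error measurement itself loses less precision as the number of nodes grows.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: root-mean-square deviation between a numerical
// solution and a reference solution of the same length.
double rms_error(const std::vector<float>& num,
                 const std::vector<float>& exact)
{
    assert(num.size() == exact.size() && !num.empty());

    double sum = 0.0;  // double accumulator for the sum of squared differences
    for (std::size_t i = 0; i < num.size(); ++i) {
        const double d = double(num[i]) - double(exact[i]);
        sum += d * d;
    }
    return std::sqrt(sum / double(num.size()));
}
```

Using the same function for both the CPU and the GPU solution keeps the two error columns comparable.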

- fienefie
**Posts:** 6 **Joined:** Fri Jul 08, 2011 10:34 am

### Re: CULA versus MKL issues

I will echo what Kyle says: the true test of accuracy is ||b-A*x||. Your test calculates an x' which you assume is exact and then compares ||x-x'||, but since your x' is calculated with floating-point arithmetic, it also has some degree of inaccuracy. So when you compare the two solutions, the inaccuracies can compound.

So really you should look at the quality of ||b'-A*x'|| versus ||b-A*x||. I think you'll find that they are more similar.


- john
- Administrator
**Posts:** 587 **Joined:** Thu Jul 23, 2009 2:31 pm

### Re: CULA versus MKL issues

john wrote: I will echo what Kyle says, the true test of accuracy is ||b-A*x||. Your test calculates an x' which you assume is exact and then compares ||x-x'||, but since your x' is calculated with floating-point arithmetic it also has some degree of inaccuracy to it. So when you compare two solutions, the inaccuracies can compound.

So really you should look at the quality of ||b'-A*x'|| versus ||b-A*x||. I think you'll find that they are more similar.

The error I am computing is the root mean square deviation, RMSD = sqrt(sum((exact-num)^2)/n). It is also a good measure of accuracy.

- fienefie
**Posts:** 6 **Joined:** Fri Jul 08, 2011 10:34 am

### Re: CULA versus MKL issues

That would be an acceptable answer, except that your "exact" answer also contains sources of numerical error, such as the sine/cosine operations and the division. I do believe that these errors are compounding.

- john
- Administrator
**Posts:** 587 **Joined:** Thu Jul 23, 2009 2:31 pm
