GEEV Question

Postby spicio » Wed Nov 10, 2010 12:56 am

Hi, Kyle!

I'm using CULA 2.3 Premium with CUDA 3.1.
Something seems off when I use the culaSgeev function.
It takes 7000 ms to compute the eigenvalues of a 2048x2048 matrix,
but the eigenvalue code in the CUDA SDK needs only 30 ms.
LAPACK's sgeev is even 5 times faster than culaSgeev on a 128x128 matrix.
(CPU: Intel i5 2.67GHz, GPU: GTX480)

Here is my code. Please let me know how to resolve this problem.

Thank you in advance!

Code:
culaStatus status = culaInitialize();

int size = 2048;

float *dA, *dB, *dC, *dD, *dE;
float *hA = new float[size*size];

cublasAlloc(size*size, sizeof(float), (void**)&dA);
cublasAlloc(size, sizeof(float), (void**)&dB);
cublasAlloc(size, sizeof(float), (void**)&dC);
cublasAlloc(size*size, sizeof(float), (void**)&dD);
cublasAlloc(size, sizeof(float), (void**)&dE);

for (int i=0; i<size; i++)
   for (int j=0; j<size; j++)
      hA[i+j*size] = (float)i/(float)(j+1);

cublasSetMatrix(size, size, sizeof(float), hA, size, dA, size);

// Time only the GEEV call; clock() returns ticks.
clock_t before = clock();

status = culaDeviceSgeev('N','V',size,dA,size,dB,dE,dC,1,dD,size);

// Convert elapsed ticks to seconds.
float time = (float)(clock()-before) / (float)CLOCKS_PER_SEC;

printf("%f s\n", time);

culaShutdown();
spicio
 
Posts: 2
Joined: Tue Aug 10, 2010 6:11 pm

Re: GEEV Question

Postby kyle » Thu Nov 11, 2010 12:34 pm

A few points:

1) The SDK example does not solve the full eigenvalue problem. It implements the bisection method for finding the eigenvalues of a tridiagonal matrix. We have a much more robust implementation in the STEBZ routine if you'd like to compare the two.

2) Almost no CULA routine will show a speedup for a 128x128 matrix. A single-precision matrix of that size is only 64 KB and can easily fit in the CPU's cache. CULA is optimized for large problems that take multiple mega- or gigabytes of memory.

3) Unfortunately, the non-symmetric eigenvalue problem does not exhibit the kind of parallelism that maps well to the GPU. The performance for this routine (GEEV) is lower than that of the symmetric version (SYEV) or any of the linear solvers (GESV, etc) in CULA.
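
As a rough illustration of point 1, bisection on a symmetric tridiagonal matrix can be sketched as follows. This is not the SDK sample's code or CULA's STEBZ; the names count_below and kth_eigenvalue are hypothetical, and the sketch assumes the matrix is symmetric, with diagonal d and off-diagonal e. Each bisection step is a single O(n) pass, and each eigenvalue can be bisected independently of the others, which is part of why this restricted problem is so much cheaper than a general GEEV.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Number of eigenvalues strictly below x for the symmetric tridiagonal
// matrix with diagonal d and off-diagonal e (Sturm sequence count).
int count_below(const std::vector<double>& d,
                const std::vector<double>& e, double x) {
    int count = 0;
    double q = 1.0;
    for (std::size_t i = 0; i < d.size(); ++i) {
        const double off = (i == 0) ? 0.0 : e[i - 1];
        q = d[i] - x - off * off / q;
        if (q == 0.0) q = -1e-300;  // nudge an exact zero pivot off zero
        if (q < 0.0) ++count;       // each negative pivot is one eigenvalue < x
    }
    return count;
}

// k-th smallest eigenvalue (0-based), found by bisection.
double kth_eigenvalue(const std::vector<double>& d,
                      const std::vector<double>& e,
                      int k, double tol = 1e-12) {
    assert(!d.empty() && e.size() + 1 == d.size());
    assert(k >= 0 && static_cast<std::size_t>(k) < d.size());

    // Gershgorin discs give an interval that contains every eigenvalue.
    double lo = d[0], hi = d[0];
    for (std::size_t i = 0; i < d.size(); ++i) {
        const double left  = (i > 0) ? std::fabs(e[i - 1]) : 0.0;
        const double right = (i + 1 < d.size()) ? std::fabs(e[i]) : 0.0;
        lo = std::min(lo, d[i] - left - right);
        hi = std::max(hi, d[i] + left + right);
    }

    // Halve the interval while keeping the k-th eigenvalue inside it.
    for (int it = 0; it < 200 && hi - lo > tol; ++it) {
        const double mid = 0.5 * (lo + hi);
        if (count_below(d, e, mid) > k) hi = mid;
        else lo = mid;
    }
    return 0.5 * (lo + hi);
}
```

For the 2x2 matrix with d = {2, 2} and e = {1}, whose eigenvalues are 1 and 3, kth_eigenvalue(d, e, 0) converges to 1 and kth_eigenvalue(d, e, 1) converges to 3.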

Hope this helps!
kyle
Administrator
 
Posts: 301
Joined: Fri Jun 12, 2009 7:47 pm

Re: GEEV Question

Postby spicio » Thu Nov 11, 2010 5:26 pm

Thank you for the reply!

1) Yeah... I also found that the SDK example is not for general matrices. That was my mistake.
2,3) I'd better use the original LAPACK for relatively small problems. Please update the performance report for the GEEV function on the website.
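
To put "relatively small" in numbers, here is a minimal sketch of the storage arithmetic behind point 2. The helper names are hypothetical, and the cache size is whatever the caller passes in; nothing here is measured on a particular CPU.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed to store a dense n x n matrix whose elements are
// elem_size bytes each.
std::size_t dense_matrix_bytes(std::size_t n, std::size_t elem_size) {
    assert(elem_size > 0);
    return n * n * elem_size;
}

// True if the whole matrix fits in a cache of cache_bytes bytes.
// cache_bytes is an assumption supplied by the caller.
bool fits_in_cache(std::size_t n, std::size_t elem_size,
                   std::size_t cache_bytes) {
    return dense_matrix_bytes(n, elem_size) <= cache_bytes;
}
```

With these helpers, a 128x128 float matrix comes to 65,536 bytes (64 KiB), while the 2048x2048 case from the first post comes to 16,777,216 bytes (16 MiB). Against an assumed 8 MiB cache, the first fits and the second does not.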

