## GEEV Question

### GEEV Question

Hi, Kyle!

I'm using CULA 2.3 premium with CUDA 3.1.

Something strange happens when I use the culaSgeev function.

It takes 7000 ms to compute the eigenvalues of a 2048x2048 matrix.

But the eigenvalue code in the CUDA SDK needs only 30 ms.

LAPACK's SGEEV is even 5 times faster than culaSgeev on a 128x128 matrix.

(CPU: Intel i5 2.67GHz, GPU: GTX480)

Here is my code. Please let me know how to resolve this problem.

Thank you in advance!

Code:

```cpp
culaStatus status = culaInitialize();

int size = 2048;
float *dA, *dB, *dC, *dD, *dE;
float *hA = new float[size*size];

// Device buffers: dA = input matrix, dB/dE = real/imaginary parts of the
// eigenvalues, dC = left eigenvectors (not computed with 'N'),
// dD = right eigenvectors.
cublasAlloc(size*size, sizeof(float), (void**)&dA);
cublasAlloc(size, sizeof(float), (void**)&dB);
cublasAlloc(size, sizeof(float), (void**)&dC);
cublasAlloc(size*size, sizeof(float), (void**)&dD);
cublasAlloc(size, sizeof(float), (void**)&dE);

// Fill the host matrix in column-major order.
for (int i=0; i<size; i++)
    for (int j=0; j<size; j++)
        hA[i+j*size] = (float)i/(float)(j+1);

cublasSetMatrix(size, size, sizeof(float), hA, size, dA, size);

clock_t before = clock();
status = culaDeviceSgeev('N','V',size,dA,size,dB,dE,dC,1,dD,size);
// Convert clock ticks to milliseconds.
float time = 1000.0f*(float)(clock()-before)/(float)CLOCKS_PER_SEC;
printf("%f", time);

culaShutdown();
```
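A side note on the timing itself: `clock()` returns processor time in ticks, so the result has to be divided by `CLOCKS_PER_SEC`, and on some platforms it does not count time the host spends waiting. A minimal sketch of a wall-clock alternative, assuming a C++11 compiler is available; the helper name `mean_wall_ms` is hypothetical and not part of CULA or CUBLAS:

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: calls `work` `repeats` times and returns the mean
// wall-clock time per call in milliseconds.
template <typename F>
double mean_wall_ms(F&& work, int repeats = 1) {
    assert(repeats > 0);
    using wall_clock = std::chrono::steady_clock;
    const auto start = wall_clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed =
        wall_clock::now() - start;
    return elapsed.count() / repeats;
}
```

To use it, pass a lambda that wraps the call being measured (here, the `culaDeviceSgeev` call with the same arguments). This assumes the call blocks until the result is ready; if it did not, a device synchronization would be needed inside the lambda.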

- spicio
**Posts:** 2 **Joined:** Tue Aug 10, 2010 6:11 pm

### Re: GEEV Question

A few points:

1) The SDK example is not a full eigenvalue problem. It's an implementation of the bisection method to find eigenvalues from a tridiagonal matrix. We have a much more robust implementation via the STEBZ routine if you'd like to compare the two.

2) Almost no CULA routine will show a speedup for a matrix of size 128x128. A matrix of that size is only 64 KB in single precision and can easily fit in the CPU's cache. CULA is optimized for large problems that take multiple mega- or gigabytes of memory.

3) Unfortunately, the non-symmetric eigenvalue problem does not exhibit the kind of parallelism that maps well to the GPU. The performance for this routine (GEEV) is lower than that of the symmetric version (SYEV) or any of the linear solvers (GESV, etc.) in CULA.
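To illustrate why point 1 is a much smaller problem than GEEV, here is a rough sketch of bisection on a symmetric tridiagonal matrix using a Sturm-sequence count. It is neither the SDK sample's code nor CULA's STEBZ; the function names `count_below` and `kth_eigenvalue` are made up for this example, and the input is assumed to be given as a diagonal `d` and an off-diagonal `e`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Number of eigenvalues strictly less than x for the symmetric tridiagonal
// matrix with diagonal d (n entries) and off-diagonal e (n-1 entries).
std::size_t count_below(const std::vector<double>& d,
                        const std::vector<double>& e, double x) {
    assert(!d.empty() && e.size() + 1 == d.size());
    const double tiny = std::numeric_limits<double>::min();
    std::size_t count = 0;
    double q = d[0] - x;
    if (q < 0.0) ++count;
    for (std::size_t i = 1; i < d.size(); ++i) {
        if (q == 0.0) q = tiny;  // avoid dividing by exactly zero
        q = d[i] - x - e[i - 1] * e[i - 1] / q;
        if (q < 0.0) ++count;
    }
    return count;
}

// k-th smallest eigenvalue (k = 0 is the smallest), found by bisection.
double kth_eigenvalue(const std::vector<double>& d,
                      const std::vector<double>& e, std::size_t k,
                      double tol = 1e-12) {
    assert(!d.empty() && e.size() + 1 == d.size() && k < d.size());
    // Gershgorin discs give an interval that contains every eigenvalue.
    double lo = d[0], hi = d[0];
    for (std::size_t i = 0; i < d.size(); ++i) {
        const double left = (i > 0) ? std::fabs(e[i - 1]) : 0.0;
        const double right = (i + 1 < d.size()) ? std::fabs(e[i]) : 0.0;
        lo = std::min(lo, d[i] - left - right);
        hi = std::max(hi, d[i] + left + right);
    }
    // Halve the interval, keeping the k-th eigenvalue inside [lo, hi].
    for (int it = 0; it < 200 && hi - lo > tol; ++it) {
        const double mid = 0.5 * (lo + hi);
        if (count_below(d, e, mid) > k) hi = mid; else lo = mid;
    }
    return 0.5 * (lo + hi);
}
```

Each eigenvalue is found independently of the others, which is why this kind of routine parallelizes so easily. A general (non-symmetric, dense) matrix has to go through a very different and far more sequential algorithm.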

Hope this helps!

- kyle
- Administrator
**Posts:** 301 **Joined:** Fri Jun 12, 2009 7:47 pm

### Re: GEEV Question

Thank you for the reply!

1) Yeah... I also found that the SDK example is not for general matrices. It was my mistake.

2,3) I'd better use the original LAPACK for relatively small problems. Please update the performance report for the geev function on the website.

- spicio
**Posts:** 2 **Joined:** Tue Aug 10, 2010 6:11 pm
