Page **1** of **2**

### LU decomposition in CULA

Posted:

**Tue Aug 10, 2010 8:11 am**
by **athlonshi**

Hi,

For a large sparse matrix, there are some tricks to accelerate LU decomposition.

I am wondering if the current LU decomposition solver DGETRF has already incorporate those tricks or it is totally based on the algorithm designed for dense matrix. If not, will it be considered to be the future feature? Thanks

Yu

### Re: LU decomposition in CULA

Posted:

**Wed Aug 11, 2010 7:26 am**
by **athlonshi**

I have noticed that there was a discussion regarding if the CULA package had implemented sparse matrix solvers in last year on this forum. The answer was no.

I am a CULA 2.0 premium user and I did see significant speed-up using CULA LU decomposition solver DGETRF compared to DGEFA in LAPACK. But since my matrix is a sparse one, I also used a sparse matrix solver to compare and found no advantage of using CULA.

If the solvers in CULA 2.0 are designed for general matrix operation, that means there are still a lot of room to further improve their performance, which is fantastic.

So, I just want to confirm that CULA 2.0 has not yet implemented any sparse matrix solver. And I hope this has been considered by CULA developers. Thanks!

Yu

### Re: LU decomposition in CULA

Posted:

**Wed Aug 11, 2010 8:51 am**
by **kyle**

You are correct in all your assumptions of the LU decomposition in CULA; the solver is designed for large dense matrices and does not consider sparseness. However, we are working on specialized sparse solvers (iterative and direct) that will be available in a later version of CULA.

### Re: LU decomposition in CULA

Posted:

**Wed Aug 11, 2010 8:52 am**
by **kyle**

Regarding your problem specifically, what sparse solver are you currently using?

### Re: LU decomposition in CULA

Posted:

**Wed Aug 11, 2010 8:56 am**
by **athlonshi**

Hi, Kyle:

Thanks for your reply!

I am not quite sure about the details of the sparse matrix solver that I am using.

It is integrated in a commercial software specifically for my research problem.

Yu

### Re: LU decomposition in CULA

Posted:

**Wed Aug 11, 2010 11:45 am**
by **john**

If you could PM myself and/or Kyle with the name of the commercial package, it might be helpful.

### Re: LU decomposition in CULA

Posted:

**Tue Oct 19, 2010 1:21 am**
by **luk4szz**

Hi I have some problems with LU decomposition in CULA BASIC, in my opinion counts wrong or my implementation is badly . Could someone help me see the error?

int main()

{

cudaError_t err;

culaStatus status;

// point to host memory

culaFloat* A = NULL;

culaInt* TAU = NULL;

int i;

// point to device memory

culaFloat* Ad = NULL;

culaInt* TAUd = NULL;

printf("Allocating Matrices\n");

A = (culaFloat*)malloc(N*N*sizeof(A[0]));

TAU = (culaInt*)malloc(N*sizeof(TAU[0]));

if(!A || !TAU)

exit(EXIT_FAILURE);

memset(A, 0, N*N*sizeof(A[0]));

memset(TAU, 0, N*sizeof(TAU[0]));

A[0]=3;

A[1]=3;

A[2]=0;

A[3]=0;

A[4]=2;

A[5]=2;

A[6]=1;

A[7]=0;

A[8]=1;

printf("\n\n");

for ( i=0; i<N*N;++i)

{

printf("%f, ",A[i]);

if ( (i+1) % N == 0 )

{

printf("\n");

}

}

/* Allocate device memory for the matrices */

err = cudaMalloc((void**)&Ad, N*N*sizeof(Ad[0]));

checkCudaError(err);

err = cudaMalloc((void**)&TAUd, N*sizeof(TAUd[0]));

checkCudaError(err);

printf("Initializing CULA\n");

status = culaInitialize();

checkStatus(status);

err = cudaMemcpy(Ad, A, N*N*sizeof(Ad[0]), cudaMemcpyHostToDevice);

checkCudaError(err);

err = cudaMemcpy(TAUd, TAU, N*sizeof(TAUd[0]), cudaMemcpyHostToDevice);

checkCudaError(err);

printf("Calling culaDeviceSgetrf\n");

status = culaDeviceSgetrf(N, N, Ad, N, TAUd);

checkStatus(status);

err = cudaMemcpy(A, Ad, N*N*sizeof(Ad[0]), cudaMemcpyDeviceToHost);

checkCudaError(err);

printf("\n\n");

for ( i=0; i<N*N;++i)

{

printf("%f, ",A[i]);

if ( (i+1) % N == 0 )

{

printf("\n");

}

}

err = cudaMemcpy(TAU, TAUd, N*sizeof(TAUd[0]), cudaMemcpyDeviceToHost);

checkCudaError(err);

printf("\n\n");

for ( i=0; i<N;++i)

{

printf("%d, ",TAU[i]);

if ( (i+1) % N == 0 )

{

printf("\n");

}

}

printf("Shutting down CULA\n");

culaShutdown();

free(A);

free(TAU);

cudaFree(Ad);

cudaFree(TAUd);

getchar();

return EXIT_SUCCESS;

}

### Re: LU decomposition in CULA

Posted:

**Tue Oct 19, 2010 1:28 am**
by **luk4szz**

I get

### Re: LU decomposition in CULA

Posted:

**Tue Oct 19, 2010 5:22 am**
by **kyle**

### Re: LU decomposition in CULA

Posted:

**Wed Nov 03, 2010 1:20 pm**
by **jayanta**

Hello to all,

I am a new user. I am using CULA from UBUNTU. I used the following function to do simple inverse of a matrix. Then, I need to use this inverse matrix for other purpose. But I am not sure how get/extract the inverse matrix. Because the function SGETRI does not returning any pointer. It is not taking any extra pointer even to store the output. It seems to me that the input matrix remains same after computation.

Thanks in advance.

-Jayanta

**************************the program I used for your reference ***************************

#include <stdio.h>

#include <stdlib.h>

#include <string.h>

#include <cula.h>

void checkStatus(culaStatus status)

{

if(!status)

return;

if(status == culaArgumentError)

printf("Invalid value for parameter %d\n", culaGetErrorInfo());

else if(status == culaDataError)

printf("Data error (%d)\n", culaGetErrorInfo());

else if(status == culaBlasError)

printf("Blas error (%d)\n", culaGetErrorInfo());

else if(status == culaRuntimeError)

printf("Runtime error (%d)\n", culaGetErrorInfo());

else

printf("%s\n", culaGetStatusString(status));

culaShutdown();

exit(EXIT_FAILURE);

}

int main() //int argc, char** argv

{

culaStatus status;

status = culaInitialize();

checkStatus(status);

/*

if (!culaInitialize())

{ printf("Error in initializing CULA\n");

return 1;

} */

int N=2,i;

culaFloat* A = NULL;

//culaFloat* B = NULL;

culaInt* TAU = NULL;

//culaFloat* TAU = NULL;

//float *B;

//B=(float*)malloc(N*N*sizeof(float));

A = (culaFloat*)malloc(N*N*sizeof(culaFloat));

//B = (culaFloat*)malloc(N*N*sizeof(culaFloat));

TAU = (culaInt*)malloc(N*sizeof(culaInt));

if(!A || !TAU)

exit(EXIT_FAILURE);

// memset(A, 0, N*N*sizeof(culaFloat));

//printf("type the elements:\n");

//for (i=0;i<N*N;i++) scanf("%f",A+i);

A[0] = 3;

A[1] = -4;

A[2] = -4;

A[3] = 6;

culaSgetri(N, A, N, TAU);

culaShutdown();

for (i=0;i<N*N;i++) printf("%f\t", *(A+i));

printf("\n\n");

free(A);

// free(B);

free(TAU);

return 0;

}

### Re: LU decomposition in CULA

Posted:

**Wed Nov 03, 2010 3:03 pm**
by **john**

### Re: LU decomposition in CULA

Posted:

**Wed Nov 03, 2010 3:19 pm**
by **jayanta**

Thanks a lot. It worked like magic.

### Re: LU decomposition in CULA

Posted:

**Tue Dec 14, 2010 4:58 am**
by **avatar**

Hi,

Does new version CULA R10 provide optimized LU decomposition for sparse matrices ?

In general, does CULA provide optimized sparse matrix-vector operations ?

Regards,

Sachin

### Re: LU decomposition in CULA

Posted:

**Wed Dec 15, 2010 2:04 pm**
by **john**

### Re: LU decomposition in CULA

Posted:

**Fri Jul 06, 2012 9:12 am**
by **luiceur**