Page 1 of 2

LU decomposition in CULA

PostPosted: Tue Aug 10, 2010 8:11 am
by athlonshi
Hi,

For a large sparse matrix, there are some tricks to accelerate LU decomposition.
I am wondering if the current LU decomposition solver DGETRF has already incorporate those tricks or it is totally based on the algorithm designed for dense matrix. If not, will it be considered to be the future feature? Thanks

Yu

Re: LU decomposition in CULA

PostPosted: Wed Aug 11, 2010 7:26 am
by athlonshi
I have noticed that there was a discussion regarding if the CULA package had implemented sparse matrix solvers in last year on this forum. The answer was no.

I am a CULA 2.0 premium user and I did see significant speed-up using CULA LU decomposition solver DGETRF compared to DGEFA in LAPACK. But since my matrix is a sparse one, I also used a sparse matrix solver to compare and found no advantage of using CULA.

If the solvers in CULA 2.0 are designed for general matrix operation, that means there are still a lot of room to further improve their performance, which is fantastic.

So, I just want to confirm that CULA 2.0 has not yet implemented any sparse matrix solver. And I hope this has been considered by CULA developers. Thanks!

Yu

Re: LU decomposition in CULA

PostPosted: Wed Aug 11, 2010 8:51 am
by kyle
You are correct in all your assumptions of the LU decomposition in CULA; the solver is designed for large dense matrices and does not consider sparseness. However, we are working on specialized sparse solvers (iterative and direct) that will be available in a later version of CULA.

Re: LU decomposition in CULA

PostPosted: Wed Aug 11, 2010 8:52 am
by kyle
Regarding your problem specifically, what sparse solver are you currently using?

Re: LU decomposition in CULA

PostPosted: Wed Aug 11, 2010 8:56 am
by athlonshi
Hi, Kyle:
Thanks for your reply!
I am not quite sure about the details of the sparse matrix solver that I am using.
It is integrated in a commercial software specifically for my research problem.
Yu

Re: LU decomposition in CULA

PostPosted: Wed Aug 11, 2010 11:45 am
by john
If you could PM myself and/or Kyle with the name of the commercial package, it might be helpful.

Re: LU decomposition in CULA

PostPosted: Tue Oct 19, 2010 1:21 am
by luk4szz
Hi I have some problems with LU decomposition in CULA BASIC, in my opinion counts wrong or my implementation is badly . Could someone help me see the error?

int main()
{

cudaError_t err;
culaStatus status;

// point to host memory
culaFloat* A = NULL;
culaInt* TAU = NULL;
int i;
// point to device memory
culaFloat* Ad = NULL;
culaInt* TAUd = NULL;

printf("Allocating Matrices\n");
A = (culaFloat*)malloc(N*N*sizeof(A[0]));
TAU = (culaInt*)malloc(N*sizeof(TAU[0]));
if(!A || !TAU)
exit(EXIT_FAILURE);

memset(A, 0, N*N*sizeof(A[0]));
memset(TAU, 0, N*sizeof(TAU[0]));
A[0]=3;
A[1]=3;
A[2]=0;
A[3]=0;
A[4]=2;
A[5]=2;
A[6]=1;
A[7]=0;
A[8]=1;

printf("\n\n");
for ( i=0; i<N*N;++i)
{
printf("%f, ",A[i]);
if ( (i+1) % N == 0 )
{
printf("\n");
}
}

/* Allocate device memory for the matrices */
err = cudaMalloc((void**)&Ad, N*N*sizeof(Ad[0]));
checkCudaError(err);

err = cudaMalloc((void**)&TAUd, N*sizeof(TAUd[0]));
checkCudaError(err);

printf("Initializing CULA\n");
status = culaInitialize();
checkStatus(status);

err = cudaMemcpy(Ad, A, N*N*sizeof(Ad[0]), cudaMemcpyHostToDevice);
checkCudaError(err);
err = cudaMemcpy(TAUd, TAU, N*sizeof(TAUd[0]), cudaMemcpyHostToDevice);
checkCudaError(err);

printf("Calling culaDeviceSgetrf\n");
status = culaDeviceSgetrf(N, N, Ad, N, TAUd);
checkStatus(status);

err = cudaMemcpy(A, Ad, N*N*sizeof(Ad[0]), cudaMemcpyDeviceToHost);
checkCudaError(err);
printf("\n\n");
for ( i=0; i<N*N;++i)
{
printf("%f, ",A[i]);
if ( (i+1) % N == 0 )
{
printf("\n");
}
}
err = cudaMemcpy(TAU, TAUd, N*sizeof(TAUd[0]), cudaMemcpyDeviceToHost);
checkCudaError(err);

printf("\n\n");
for ( i=0; i<N;++i)
{
printf("%d, ",TAU[i]);
if ( (i+1) % N == 0 )
{
printf("\n");
}
}
printf("Shutting down CULA\n");
culaShutdown();
free(A);
free(TAU);
cudaFree(Ad);
cudaFree(TAUd);
getchar();
return EXIT_SUCCESS;
}

Re: LU decomposition in CULA

PostPosted: Tue Oct 19, 2010 1:28 am
by luk4szz
I get

Re: LU decomposition in CULA

PostPosted: Tue Oct 19, 2010 5:22 am
by kyle
luk4szz wrote: Could someone help me see the error?


Is there an error? Remember that CULA expects your data to be column major, not row major like you are printing to the screen.

Re: LU decomposition in CULA

PostPosted: Wed Nov 03, 2010 1:20 pm
by jayanta
Hello to all,
I am a new user. I am using CULA from UBUNTU. I used the following function to do simple inverse of a matrix. Then, I need to use this inverse matrix for other purpose. But I am not sure how get/extract the inverse matrix. Because the function SGETRI does not returning any pointer. It is not taking any extra pointer even to store the output. It seems to me that the input matrix remains same after computation.
Thanks in advance.
-Jayanta
**************************the program I used for your reference ***************************
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <cula.h>


void checkStatus(culaStatus status)
{
if(!status)
return;

if(status == culaArgumentError)
printf("Invalid value for parameter %d\n", culaGetErrorInfo());
else if(status == culaDataError)
printf("Data error (%d)\n", culaGetErrorInfo());
else if(status == culaBlasError)
printf("Blas error (%d)\n", culaGetErrorInfo());
else if(status == culaRuntimeError)
printf("Runtime error (%d)\n", culaGetErrorInfo());
else
printf("%s\n", culaGetStatusString(status));

culaShutdown();
exit(EXIT_FAILURE);
}



int main() //int argc, char** argv
{
culaStatus status;
status = culaInitialize();
checkStatus(status);
/*
if (!culaInitialize())
{ printf("Error in initializing CULA\n");
return 1;
} */
int N=2,i;
culaFloat* A = NULL;
//culaFloat* B = NULL;
culaInt* TAU = NULL;
//culaFloat* TAU = NULL;
//float *B;
//B=(float*)malloc(N*N*sizeof(float));
A = (culaFloat*)malloc(N*N*sizeof(culaFloat));
//B = (culaFloat*)malloc(N*N*sizeof(culaFloat));
TAU = (culaInt*)malloc(N*sizeof(culaInt));


if(!A || !TAU)
exit(EXIT_FAILURE);

// memset(A, 0, N*N*sizeof(culaFloat));
//printf("type the elements:\n");
//for (i=0;i<N*N;i++) scanf("%f",A+i);
A[0] = 3;
A[1] = -4;
A[2] = -4;
A[3] = 6;
culaSgetri(N, A, N, TAU);
culaShutdown();
for (i=0;i<N*N;i++) printf("%f\t", *(A+i));
printf("\n\n");
free(A);
// free(B);
free(TAU);

return 0;

}

Re: LU decomposition in CULA

PostPosted: Wed Nov 03, 2010 3:03 pm
by john
getri inverts a matrix inplace, and also requires that you first call getrf on that matrix. Your program should read:

Code: Select all
getrf(N,N,A,N,ipiv);
getri(N,A,N,ipiv);


If you are using the inverse to solve a system A*x=b as inv(A)*b, then you should instead use the gesv function in CULA Basic or getrf (factorize) and getrs (backsolve).

Re: LU decomposition in CULA

PostPosted: Wed Nov 03, 2010 3:19 pm
by jayanta
Thanks a lot. It worked like magic.

Re: LU decomposition in CULA

PostPosted: Tue Dec 14, 2010 4:58 am
by avatar
Hi,
Does new version CULA R10 provide optimized LU decomposition for sparse matrices ?
In general, does CULA provide optimized sparse matrix-vector operations ?

Regards,
Sachin

Re: LU decomposition in CULA

PostPosted: Wed Dec 15, 2010 2:04 pm
by john
avatar wrote:Hi,
Does new version CULA R10 provide optimized LU decomposition for sparse matrices ?
In general, does CULA provide optimized sparse matrix-vector operations ?

Regards,
Sachin

CULA R10 does not provide this, except in the case of banded matrices. Sparse matrix operations will be featured in the coming year. For sparse matrix-vector, please see the CUSPARSE package in your CUDA toolkit.

Re: LU decomposition in CULA

PostPosted: Fri Jul 06, 2012 9:12 am
by luiceur
Do we need to use both calls
Code: Select all
getrf(N,N,A,N,ipiv);
getri(N,A,N,ipiv);


to get invert a matrix? Can we call culaDeviceSegetri with the free version?