COMPLEX GETTRI

Posted: Wed Aug 03, 2011 1:09 pm
by apardo
Hi! In the middle of my program I have to invert a complex matrix, so I do the following:

culaInt *ipiv1;
culaGetrf(DIMENSION2, DIMENSION2, mtx_w11cula, DIMENSION2,ipiv1);
culaGetri(DIMENSION2, mtx_w11cula, DIMENSION2, ipiv1);

But this crashes the program in the MS-DOS console (the typical Windows crash dialog).

mtx_w11cula is this:

-0.0105 -0.0050 0.0016 - 0.0083i 0.0016 + 0.0083i
0.0024 0.0132 0.0029 - 0.0044i 0.0029 + 0.0044i
0.0260 -0.0087 -0.0001 + 0.0082i -0.0001 - 0.0082i
0.0253 0.0019 0.0052 + 0.0046i 0.0052 - 0.0046i

If I put this into MATLAB and compute the inverse, I get:

1.0e+002 *

0.2014 0.6879 0.8301 -0.4402
-0.4653 0.5618 -0.0664 -0.1786
0.1017 + 0.5684i -1.0620 + 0.8011i -1.3393 + 0.7536i 1.5199 - 0.6162i
0.1017 - 0.5684i -1.0620 - 0.8011i -1.3393 - 0.7536i 1.5199 + 0.6162i

So clearly it should be possible to perform this on the GPU... I have CUDA 4.0 and CULA R12.
Where is the problem? Thanks for your help!

Re: COMPLEX GETTRI

Posted: Wed Aug 03, 2011 1:38 pm
by kyle
A number of things could be going wrong...

Is your data column major? Are you checking culaStatus for errors?
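
For anyone unsure what "column major" means here: LAPACK-style routines expect element (row, col) at offset row + col*lda, where lda is the leading dimension. Below is a minimal host-side sketch in plain C. The helper names colmajor_index and rowmajor_to_colmajor are made up for illustration and are not part of CULA.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a CULA function): offset of element (row, col)
   in a column-major array whose leading dimension is lda. */
size_t colmajor_index(size_t row, size_t col, size_t lda)
{
    return row + col * lda;
}

/* Hypothetical helper: copy a rows x cols matrix stored row by row (the
   usual C layout) into column-major storage with leading dimension rows. */
void rowmajor_to_colmajor(const float *src, float *dst,
                          size_t rows, size_t cols)
{
    assert(src != NULL && dst != NULL && src != dst);
    for (size_t r = 0; r < rows; ++r) {
        for (size_t c = 0; c < cols; ++c) {
            dst[colmajor_index(r, c, rows)] = src[r * cols + c];
        }
    }
}
```

If the matrix was filled row by row in C, a conversion like this (or filling it column by column in the first place) would be needed before handing it to a column-major routine.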

Re: COMPLEX GETTRI

Posted: Thu Aug 04, 2011 1:17 pm
by apardo
Yes, it is in column-major order. I have now tried to allocate ipiv, because I was not doing that before. It looks like this:

culaDeviceInt* ipiv0;
culaInt *ipiv1;
culaInt *ipiv2;

ipiv0 = (culaDeviceInt*)malloc(DIMENSION2*sizeof(culaDeviceInt));
ipiv1 = (culaInt*)malloc(DIMENSION2*sizeof(culaInt));
ipiv2 = (culaInt*)malloc(DIMENSION1*sizeof(culaInt));


status = culaDeviceSgetrf(DIMENSION2, DIMENSION2, mtx_ainv, DIMENSION2,ipiv0);
status = culaDeviceSgetri(DIMENSION2, mtx_ainv, DIMENSION2, ipiv0);

That is for the first inversion... But now ipiv0 comes back as all zeros.

I think I am allocating it correctly, but I am not sure why this happens...

Re: COMPLEX GETTRI

Posted: Thu Aug 04, 2011 3:51 pm
by kyle
Device data has to be allocated with cudaMalloc(), not malloc().

Please read the Programmer's Guide documentation for more information; this is covered in detail there.

Re: COMPLEX GETTRI

Posted: Fri Aug 05, 2011 11:32 am
by apardo
Ahhh, thanks. That was my problem... Another question: how do I copy a real matrix into, for example, the real part of a complex matrix, with everything in device memory? Just with cudaMemcpy?

Re: COMPLEX GETTRI

Posted: Fri Aug 05, 2011 11:45 am
by kyle
The complex data is stored as an array of structures (AoS) - the data is interleaved.

Re: COMPLEX GETTRI

Posted: Fri Aug 05, 2011 11:53 am
by apardo
So... Do I have to copy it manually?

Re: COMPLEX GETTRI

Posted: Fri Aug 05, 2011 12:23 pm
by kyle
If it's on the GPU, yes.
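
To illustrate what "manually" means with interleaved storage, here is a minimal host-side sketch in plain C. The type complex_f and the function copy_into_real_part are made up for illustration (they are stand-ins, not CULA or CUDA names). On the device, the body of the loop would presumably become the work of one thread per element in a small kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an interleaved single-precision complex type:
   each element holds its real part followed by its imaginary part. */
typedef struct {
    float re;
    float im;
} complex_f;

/* Hypothetical helper: write n real values into the real parts of dst,
   leaving the imaginary parts untouched. A single contiguous memcpy cannot
   do this, because the real parts sit at every second float. */
void copy_into_real_part(complex_f *dst, const float *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i) {
        dst[i].re = src[i];
    }
}
```

Viewed as a flat float array, the real part of element i sits at index 2*i and the imaginary part at 2*i + 1, which is why a strided copy is needed.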

Re: COMPLEX GETTRI

Posted: Fri Aug 05, 2011 12:30 pm
by apardo
Ohhh, I was hoping for a function that copies device-to-device into complex data... what a pity... Thank you!!!

Re: COMPLEX GETTRI

Posted: Fri Aug 05, 2011 1:02 pm
by kyle
I'd recommend reading the CUDA Programmer's Guide (from NVIDIA) if you are going to use the device interface. We assume that users of this interface (as opposed to the normal host interface) are familiar with the CUDA programming model.

Re: COMPLEX GETTRI

Posted: Fri Aug 05, 2011 1:25 pm
by apardo
Ok! I am familiar with CUDA, but I had doubts about CULA because there is almost nowhere to look things up about it!

Re: COMPLEX GETTRI

Posted: Fri Aug 05, 2011 2:44 pm
by john
I would recommend reading through our examples folder very carefully. It shows many different usage patterns you can use.

Re: COMPLEX GETTRI

Posted: Fri Aug 05, 2011 3:12 pm
by apardo
Ok, I'll take your advice. Thanks for your help!!!