memory leak in culaDeviceSsyev


Postby huangchbii » Sat Jul 16, 2011 5:50 am

Hi.

I've found that with CULA R11 on Windows 7 32-bit and VS2008, culaDeviceSsyev leaks host memory (NOT video memory). My program has to compute eigenvalues/eigenvectors frequently, so it runs out of host memory right away. The relevant part of the code is as follows:

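float *dw; // device array that receives the N eigenvalues
float *dE; // device copy of the N x N matrix dA; overwritten with the eigenvectors when jobz is 'V'
CUDA_CALL(cudaMalloc((void**)&dw, N*sizeof(float)));
CUDA_CALL(cudaMalloc((void**)&dE, N*N*sizeof(float)));
CUDA_CALL(cudaMemcpy(dE, dA, N*N*sizeof(float), cudaMemcpyDeviceToDevice));
CULA_CALL(culaDeviceSsyev('V', 'L', N, dE, N, dw));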

Please advise.
huangchbii
 
Posts: 15
Joined: Wed Jul 07, 2010 1:27 am

Re: memory leak in culaDeviceSsyev

Postby john » Mon Jul 18, 2011 12:22 pm

Hello, we would like to investigate, but will need a complete program that demonstrates this behavior. SYEV's operations can vary dramatically based on the data and the size of N.
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: memory leak in culaDeviceSsyev

Postby huangchbii » Tue Jul 19, 2011 3:28 am

Hi. It is great that someone is looking at this problem.
I cannot give you the package we are working on now because of confidentiality issues. However, I will probably be able to give you an example that shows how to reproduce it.

By the way, this problem happened with cuda 3.2 + cula-r11. I think it can also happen with cuda 4.0 + cula-r12. However, my configuration for cuda-4.0 and cula-r12 differs from the official recommendation from NVIDIA and CULA, so I cannot confirm it.

Also, I believe this problem can happen on ubuntu 10.04 64bit + cuda 4.0 as well. I am still looking into it.

Please let me know how I can send you the test code.

Sincerely yours,
huangchbii
 
Posts: 15
Joined: Wed Jul 07, 2010 1:27 am

Re: memory leak in culaDeviceSsyev

Postby huangchbii » Wed Jul 20, 2011 2:32 am

Hi, I would like to give a status update.

I found the memory leak on a win7 32-bit, cuda-3.2, cula-r11 laptop whose video card shares memory with the host.

The problem persists on a win7 32-bit, cuda-4.0.13, cula-r12 setup (where I made copies of DLLs to satisfy the DLL version requirement of cula-r12).

Today I recompiled my code for an ubuntu 10.04 64bit, cuda-4.0.17, cula-r12 machine. The host memory leak has disappeared. However, in each iteration I lose about 1 MB of GPU memory. Sometimes the lost memory is released, but overall the leak grows steadily (about 1 MB per iteration).

On the ubuntu machine, I have two GPU cards with 1 GB of memory each. On my laptop, the video card has only about 256 MB. This makes a huge difference.

I guess it is a sort of cache mechanism inside of cula?
huangchbii
 
Posts: 15
Joined: Wed Jul 07, 2010 1:27 am

Re: memory leak in culaDeviceSsyev

Postby john » Wed Jul 20, 2011 5:42 am

Yes, there is some memory caching present in CULA. The cache will reduce itself periodically, or you can request it manually with culaFreeBuffers()
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm
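
One way to tell a cache that shrinks periodically from a steady leak is to sample used memory once per iteration and fit a straight line through the samples. The following is only a minimal sketch in plain C and is not taken from CULA: mem_query_fn, sample_used_bytes and leak_slope_bytes_per_iter are hypothetical names, and the callback is assumed to wrap a query such as cudaMemGetInfo(&free, &total) from the CUDA runtime.

```c
#include <stddef.h>

/* Hypothetical callback type: fills in free and total bytes and returns 0 on
   success. In a CUDA program it could wrap cudaMemGetInfo(&free, &total). */
typedef int (*mem_query_fn)(size_t *free_bytes, size_t *total_bytes);

/* Take one sample of used memory (total - free) through the callback.
   Returns -1.0 if the callback is missing or reports a failure. */
double sample_used_bytes(mem_query_fn query)
{
    size_t free_b = 0, total_b = 0;

    if (query == NULL || query(&free_b, &total_b) != 0)
        return -1.0;
    return (double)(total_b - free_b);
}

/* Least-squares slope of used bytes against iteration index 0..n-1,
   i.e. the average growth in bytes per iteration. Returns 0 for n < 2. */
double leak_slope_bytes_per_iter(const double *used_bytes, size_t n)
{
    double sum_x = 0.0, sum_y = 0.0, sum_xy = 0.0, sum_xx = 0.0;
    double dn, denom;
    size_t i;

    if (used_bytes == NULL || n < 2)
        return 0.0;

    for (i = 0; i < n; i++) {
        double x = (double)i;
        sum_x  += x;
        sum_y  += used_bytes[i];
        sum_xy += x * used_bytes[i];
        sum_xx += x * x;
    }

    dn = (double)n;
    denom = dn * sum_xx - sum_x * sum_x; /* positive for n >= 2 */
    return (dn * sum_xy - sum_x * sum_y) / denom;
}
```

Calling sample_used_bytes once per SYEV iteration and passing the collected values to leak_slope_bytes_per_iter should give a slope close to zero over a long run if the cache is only fluctuating, and roughly 1 MB per iteration for the steady growth reported earlier in the thread.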

Re: memory leak in culaDeviceSsyev

Postby huangchbii » Wed Jul 20, 2011 10:19 am

Hi,

I did test culaFreeBuffers(). In fact, I called initialize and shutdown in each iteration, as follows:

int initialize() {
    cublasInit();
    culaInitialize();
    return EXIT_SUCCESS;
}

int shutdown() {
    culaFreeBuffers();
    culaShutdown();
    cublasShutdown();
    return EXIT_SUCCESS;
}

However, the memory leak is still there... I think either culaFreeBuffers() or culaShutdown() doesn't release the memory right away.
huangchbii
 
Posts: 15
Joined: Wed Jul 07, 2010 1:27 am

Re: memory leak in culaDeviceSsyev

Postby john » Thu Jul 21, 2011 5:56 am

OK. With that tested, we will need a test example from you that exhibits this behavior. Please post it here; you can send it to me via forum PM if you would like to keep it private.
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: memory leak in culaDeviceSsyev

Postby huangchbii » Fri Jul 22, 2011 8:02 am

Hi, this piece of code will eat all of the memory (on my Macbook it is host memory; I think on a standalone GPU card it will be GPU memory).

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CULA_USE_CUDA_COMPLEX
#include <culapackdevice.h>

#include <cuda_runtime.h>
#ifdef _MSC_VER
# pragma comment(lib, "cudart.lib")
#endif


// Print the CULA error text and exit on any non-zero status.
void checkStatus(culaStatus status)
{
    char buf[80];

    if(!status)
        return;

    culaGetErrorInfoString(status, culaGetErrorInfo(), buf, sizeof(buf));
    printf("%s\n", buf);

    culaShutdown();
    exit(EXIT_FAILURE);
}


// Print the CUDA error text and exit on any non-zero error code.
void checkCudaError(cudaError_t err)
{
    if(!err)
        return;

    printf("%s\n", cudaGetErrorString(err));

    culaShutdown();
    exit(EXIT_FAILURE);
}


int main(int argc, char** argv)
{
    // Use a larger matrix when NDEBUG is defined (release builds).
#ifdef NDEBUG
    int M = 8192;
#else
    int M = 1024;
#endif
    int N = M;

    int i;

    cudaError_t err;
    culaStatus status;

    // point to host memory
    float* A = NULL;
    float* TAU = NULL;

    // point to device memory
    float* Ad = NULL;
    float* TAUd = NULL;

    printf("Allocating Matrices\n");
    A = (float*)malloc(M*N*sizeof(float));
    TAU = (float*)malloc(N*sizeof(float));
    if(!A || !TAU)
        exit(EXIT_FAILURE);

    err = cudaMalloc((void**)&Ad, M*N*sizeof(float));
    checkCudaError(err);

    err = cudaMalloc((void**)&TAUd, N*sizeof(float));
    checkCudaError(err);

    printf("Initializing CULA\n");
    status = culaInitialize();
    checkStatus(status);

    // The input is an all-zero matrix.
    memset(A, 0, M*N*sizeof(float));
    err = cudaMemcpy(Ad, A, M*N*sizeof(float), cudaMemcpyHostToDevice);
    checkCudaError(err);

    // printf("Calling culaDeviceSgeqrf\n");
    // status = culaDeviceSgeqrf(M, N, Ad, M, TAUd);

    // Repeated SYEV calls on the same device buffers; TAUd is passed as the
    // eigenvalue output array, and the CULA buffers are freed after each call.
    for(i = 0; i < 10000; i ++) {
        printf("Calling culaDeviceSsyev: %d\n", i);
        status = culaDeviceSsyev('V', 'L', N, Ad, N, TAUd);
        culaFreeBuffers();
        checkStatus(status);
    }

    err = cudaMemcpy(A, Ad, M*N*sizeof(float), cudaMemcpyDeviceToHost);
    checkCudaError(err);
    err = cudaMemcpy(TAU, TAUd, N*sizeof(float), cudaMemcpyDeviceToHost);
    checkCudaError(err);

    printf("Shutting down CULA\n");
    culaShutdown();

    cudaFree(Ad);
    cudaFree(TAUd);
    free(A);
    free(TAU);

    return EXIT_SUCCESS;
}
huangchbii
 
Posts: 15
Joined: Wed Jul 07, 2010 1:27 am

Re: memory leak in culaDeviceSsyev

Postby john » Mon Jul 25, 2011 2:54 pm

I have run this program to completion (10000 SYEV calls) now on two platforms:
* Linux 64, C2050
* Mac OSX 10.6 (64-bit) on a Macbook Pro (9600 GPU)

Both machines are CUDA 4 and CULA R12.

In neither case do I observe a leak; on the Mac the usage sits at 56.7 MB of allocated RAM for the full duration. Hopefully we can find a configuration that exhibits the error.
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm
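
For anyone trying to reproduce the host-side numbers on Linux, the resident memory of the process can be logged from inside the loop instead of being read from a system monitor. This is a minimal sketch that assumes a Linux /proc filesystem; parse_vmrss_kb and read_host_rss_kb are hypothetical helper names and are not part of CUDA or CULA. Windows and Mac OS X would need a different mechanism.

```c
#include <stdio.h>
#include <string.h>

/* Parse a line of the form "VmRSS:    58060 kB".
   Returns the value in kB, or -1 if this is not a VmRSS line. */
long parse_vmrss_kb(const char *line)
{
    long kb = -1;

    if (line == NULL || strncmp(line, "VmRSS:", 6) != 0)
        return -1;
    if (sscanf(line + 6, "%ld", &kb) != 1)
        return -1;
    return kb;
}

/* Read the resident set size of the current process from /proc/self/status.
   Linux only (assumption); returns -1 if the file or the field is missing. */
long read_host_rss_kb(void)
{
    char line[256];
    long kb = -1;
    FILE *f = fopen("/proc/self/status", "r");

    if (f == NULL)
        return -1;

    while (fgets(line, sizeof(line), f) != NULL) {
        kb = parse_vmrss_kb(line);
        if (kb >= 0)
            break; /* found the VmRSS line */
    }

    fclose(f);
    return kb;
}
```

Printing read_host_rss_kb() every few hundred iterations of the SYEV loop would show whether the figure stays flat, as in the 56.7 MB observation, or keeps climbing.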

Re: memory leak in culaDeviceSsyev

Postby huangchbii » Tue Jul 26, 2011 6:12 am

Oh, this example is not good enough. I will give you another one later.
huangchbii
 
Posts: 15
Joined: Wed Jul 07, 2010 1:27 am

Re: memory leak in culaDeviceSsyev

Postby huangchbii » Tue Jul 26, 2011 6:42 am

Hi, I will PM you an example.
huangchbii
 
Posts: 15
Joined: Wed Jul 07, 2010 1:27 am

Re: memory leak in culaDeviceSsyev

Postby huangchbii » Tue Jul 26, 2011 7:03 am

OK, one new thing I just discovered: with cuda 4 and cula-r12 (win7 32-bit), the machine doesn't crash, but it still runs out of memory.

With cuda 3 and cula r11, it just crashes my computer. I think the cuda driver has something to do with this.
huangchbii
 
Posts: 15
Joined: Wed Jul 07, 2010 1:27 am

Re: memory leak in culaDeviceSsyev

Postby huangchbii » Tue Jul 26, 2011 7:06 am

Also, I've found that it is somehow related to the loop. For example, if you set M to 16, each call runs faster and the memory is also consumed faster.
huangchbii
 
Posts: 15
Joined: Wed Jul 07, 2010 1:27 am

