CULA compile problem with gpu function

Support for issues specific to the Mac OS operating systems.

Postby steve90370 » Sat Oct 02, 2010 10:12 am

Hello all:

I find that CULA is a very powerful tool for solving matrix problems. I am trying to implement a GPU-based LU solver on the Mac platform by combining a GPU matrix transpose function (a GPU kernel function) with CULA's culaDeviceSgetrs. However, the build fails with an error at compile time.

Below is my makefile:
--------------------------------------
CC= nvcc
CFLAGS=-DNDEBUG -O3
INCLUDES=-I${CULA_INC_PATH} -I${CUDA_BIN_PATH}
LIBPATH32=-L${CULA_LIB_PATH_32}
LIBPATH64=-L${CULA_LIB_PATH_64}

LIBS= -lcula -lcublas -lcudart

usage:
	@echo "To build this example, type one of:"
	@echo ""
	@echo " make build32"
	@echo " make build64"
	@echo ""
	@echo "where '32' and '64' represent the platform you wish to build for"
	@echo ""
	@echo "Note: this example requires the CUDA toolkit to compile"
build32:
	${CC} -m32 -v -o getrf getrf.cu $(CFLAGS) $(INCLUDES) $(LIBPATH32) $(LIBS)
build64:
	sh ../checkenvironment.sh
	${CC} -m64 -o geqrf_device geqrf_device.c $(CFLAGS) $(INCLUDES) $(LIBPATH64) $(LIBS)

clean:
	rm -f getrf
-----------------------------------

Here are the error message and compile log:

nvcc -m32 -v -o getrf getrf.cu -DNDEBUG -O3 -I/usr/local/cula/include -I/usr/local/cuda/bin: -L/usr/local/cula/lib -lcula -lcublas -lcudart
#$ _SPACE_=
#$ _CUDART_=cudart
#$ _HERE_=/usr/local/cuda/bin
#$ _THERE_=/usr/local/cuda/bin
#$ _TARGET_SIZE_=
#$ TOP=/usr/local/cuda/bin/..
#$ PATH=/usr/local/cuda/bin/../open64/bin:/usr/local/cuda/bin:/usr/local/cuda/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin
#$ INCLUDES="-I/usr/local/cuda/bin/../include"
#$ LIBRARIES= "-L/usr/local/cuda/bin/../lib" -lcudart
#$ CUDAFE_FLAGS=
#$ OPENCC_FLAGS=
#$ PTXAS_FLAGS=
#$ gcc -D__CUDA_ARCH__=100 -E -x c++ -DCUDA_FLOAT_MATH_FUNCTIONS -DCUDA_NO_SM_12_ATOMIC_INTRINSICS -DCUDA_NO_SM_13_DOUBLE_INTRINSICS -DCUDA_NO_SM_11_ATOMIC_INTRINSICS "-I/usr/local/cuda/bin/../include" -I. -D__CUDACC__ -C -O3 -I"/usr/local/cula/include" -I"/usr/local/cuda/bin:" -D"NDEBUG" -include "cuda_runtime.h" -m32 -malign-double -o "/tmp/tmpxft_00002f97_00000000-4_getrf.cpp1.ii" "getrf.cu"
#$ cudafe --m32 --gnu_version=40201 -tused --no_remove_unneeded_entities --gen_c_file_name "/tmp/tmpxft_00002f97_00000000-1_getrf.cudafe1.c" --stub_file_name "/tmp/tmpxft_00002f97_00000000-1_getrf.cudafe1.stub.c" --gen_device_file_name "/tmp/tmpxft_00002f97_00000000-1_getrf.cudafe1.gpu" --include_file_name "/tmp/tmpxft_00002f97_00000000-3_getrf.fatbin.c" "/tmp/tmpxft_00002f97_00000000-4_getrf.cpp1.ii"
getrf.cu(248): error: identifier "culaDeviceSgetrf" is undefined

getrf.cu(198): warning: variable "t_A" was declared but never referenced

1 error detected in the compilation of "/tmp/tmpxft_00002f97_00000000-4_getrf.cpp1.ii".
# --error 0x2 --
make: *** [build32] Error 2

Please help! Thank you very much!
steve90370
 
Posts: 2
Joined: Thu Sep 30, 2010 11:04 pm

Re: CULA compile problem with gpu function

Postby john » Tue Oct 05, 2010 6:49 am

My guess is that you haven't added #include "cula.h". Your compiler reported getrf.cu(248): error: identifier "culaDeviceSgetrf" is undefined, which means it can't find the declaration, and that usually indicates a missing header.
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: CULA compile problem with gpu function

Postby steve90370 » Wed Oct 06, 2010 11:40 pm

Thank you so much! My problem is solved!
By the way, I have some questions about CULA:
Do all of the standard CULA functions automatically allocate space in device memory and run on the GPU?

And are all input matrices stored in column-major order?

If so, do I have to transpose my matrix before calling a CULA function?

I also have a question about precision:

I use culaDeviceSgesv to compute the solution of Ax = b. After the solve, I compute the difference between b and A*x, and the precision does not look very good. Some differences are close to 1.0f! Is this because of the limited precision of CULA Basic?

Thanks!
steve90370
 
Posts: 2
Joined: Thu Sep 30, 2010 11:04 pm

Re: CULA compile problem with gpu function

Postby john » Fri Oct 08, 2010 5:53 am

CULA matrices are in column-major storage, so you might need to transpose if your data is row-major. Fortunately, transpose is very simple (see the CUDA SDK).
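
To make the layout difference concrete, here is a minimal host-side sketch in plain C. It is not the CUDA SDK transpose kernel and not part of CULA; the function name row_to_col_major is hypothetical and only illustrates the index mapping, assuming a dense matrix with no padding (leading dimension equal to the number of rows).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy a rows x cols matrix from row-major (C layout)
 * into column-major (Fortran/LAPACK layout).
 * src and dst must be distinct buffers of rows * cols floats. */
void row_to_col_major(const float *src, float *dst, size_t rows, size_t cols)
{
    assert(src != NULL && dst != NULL && src != dst);
    for (size_t r = 0; r < rows; ++r) {
        for (size_t c = 0; c < cols; ++c) {
            /* element (r, c): row-major index is r * cols + c,
             * column-major index is c * rows + r */
            dst[c * rows + r] = src[r * cols + c];
        }
    }
}
```

For a 2 x 3 row-major input {1, 2, 3, 4, 5, 6}, the column-major result is {1, 4, 2, 5, 3, 6}. A GPU version would apply the same index mapping inside a kernel.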

Did you compare your results to a software LAPACK? Our results will match theirs very closely, even in single precision. You didn't specify how you measured your error, so it is hard to describe further. If you are seeing large errors, then you might be calling the routine incorrectly as well.
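
As one possible way to measure the error (an assumption on my part, since the original measurement was not described), the sketch below computes a relative residual max|b - A*x| / max|b| on the host in plain C. The function name relative_residual is hypothetical. It assumes A is n x n, stored column-major with leading dimension n, and it accumulates in double so that the check itself adds little rounding error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: relative residual max|b - A*x| / max|b|.
 * A is n x n in column-major order (element (i, j) at A[j * n + i]). */
double relative_residual(const float *A, const float *x, const float *b, size_t n)
{
    assert(A != NULL && x != NULL && b != NULL && n > 0);
    double max_r = 0.0, max_b = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double ax = 0.0;
        for (size_t j = 0; j < n; ++j)
            ax += (double)A[j * n + i] * (double)x[j];   /* row i of A*x */
        double r = (double)b[i] - ax;
        if (r < 0.0) r = -r;                             /* |b_i - (A*x)_i| */
        double bi = (double)b[i];
        if (bi < 0.0) bi = -bi;
        if (r > max_r) max_r = r;
        if (bi > max_b) max_b = bi;
    }
    /* fall back to the absolute residual when b is all zeros */
    return max_b > 0.0 ? max_r / max_b : max_r;
}
```

Single precision carries roughly 7 significant decimal digits, so an absolute difference near 1.0 can still be a small relative error when the entries of b are very large. If the matrix was passed in row-major order by mistake, this check would typically report a large residual as well.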
john
Administrator
 
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

