Posted: Wed Jul 25, 2012 8:43 pm
by gusgw

I'm trying to use the zheev function of CULA R12 inside MPI processes. However, calling the function aborts the program with SIGABRT:

forrtl: error (76): Abort trap signal
Image              PC                Routine            Line        Source
                   00000036AAA30265  Unknown               Unknown  Unknown
                   00000036AAA31D10  Unknown               Unknown  Unknown
                   00000036AB60639E  Unknown               Unknown  Unknown
                   00000036AB60655D  Unknown               Unknown  Unknown
                   00000036AAAD3C2D  Unknown               Unknown  Unknown

Running with idb gives a little extra info:

Program received signal SIGABRT
raise () in /lib64/

And a backtrace:

#0 0x00000036aaa30265 in raise () in /lib64/
#1 0x00000036aaa31d10 in abort () in /lib64/
#2 0x00000036ab60639e in __deallocate_stack () in /lib64/
#3 0x0000000045e409c0 in ?? ()
#4 0x0000428e00000000 in ?? ()
#5 0x000000004543f9e0 in ?? ()
#6 0x000000004543f9e0 in ?? ()
#7 0xffffffffffffffe0 in ?? ()

This error doesn't occur in versions of the code without MPI. However, running the MPI code on its own (i.e. not using mpirun) still fails in the same way. Is there some conflict between the OpenMPI and CULA libs?

CUDA 4.00
OpenMPI 1.4.5
Intel 2011.9
Red Hat Linux

Any ideas?