something wrong with cula_crossover using zgemm
I have run into a problem when using cula_crossovers.lst.
I added the following two lines to my .bashrc file:
export CULA_CROSSOVER_LIST=cula_crossovers.lst
export CULA_VERBOSE=1
Code:
call ZGEMM('N','N',ndim,ndim,mdim,ALPHA, A,ndim,B,mdim,BETA,F,ndim)
My cula_crossovers.lst file contains only one line: "zgemm=40000000"
Here is my test output:
"M= 4000 K= 4000 N= 4000
cula info: parsing crossover list 'cula_crossovers.lst'
cula info: setting crossover for zgemm to '40000000'
cula info: zgemm (N, N, 4000, 4000, 4000, (1.000000,0.000000), 0x7fe207177010, 4000, 0x7fe1f7d52010, 4000, (0.000000,0.000000), 0x7fe1c26d0010, 4000)
cula info: issuing to CPU (under threshold)
cula info: CPU library is lapackcpu.so
cula info: done
the difference between zgemm cu_zgemm : 0.2046607344315598E-10
zgemm time : 5.87 cu_zgemm time : 1.81
M= 5000 K= 4000 N= 5000
cula info: zgemm (N, N, 5000, 5000, 4000, (1.000000,0.000000), 0x7fe20346e010, 5000, 0x7fe1f0340010, 4000, (0.000000,0.000000), 0x7fe174287010, 5000)
cula info: issuing to GPU (over threshold)
cula info: done
the difference between zgemm cu_zgemm : 0.0000000000000000E+00
zgemm time : 2.76 cu_zgemm time : 2.76
M= 6000 K= 4000 N= 6000
cula info: zgemm (N, N, 6000, 6000, 4000, (1.000000,0.000000), 0x7fe1ff765010, 6000, 0x7fe1e892e010, 4000, (0.000000,0.000000), 0x7fe158805010, 6000)
cula info: issuing to CPU (under threshold)
cula info: done
the difference between zgemm cu_zgemm : 0.2228434323826737E-10
zgemm time : 9.69 cu_zgemm time : 3.95
"
It seems that my setting in cula_crossovers.lst does not work: 4000*4000*4000 = 64000000000 is greater than 40000000, yet the 4000x4000x4000 case is issued to the CPU as "under threshold".
Why? Is there something wrong with my settings, or is there something else that I am not aware of?
- wuquansheng
- Posts: 3
- Joined: Thu Mar 17, 2011 10:16 am
Re: something wrong with cula_crossover using zgemm
It appears we overflowed the limit of a 32-bit integer internally on that one. Thank you for your report, we'll need to get this one fixed. I'm at a loss right now to suggest a workaround besides setting GEMM to always run on the GPU.
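The 32-bit overflow explanation can be checked against the verbose log with a few lines of arithmetic. The sketch below is only an illustration under two assumptions that the thread does not confirm: that the crossover metric is the product M*N*K, and that it is held in a signed 32-bit integer that wraps around. The helper names wrap_int32 and route are hypothetical and are not part of CULA.

```python
def wrap_int32(value: int) -> int:
    """Reduce an exact integer to what a wrapping signed 32-bit integer would hold."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    return value - (1 << 32) if value >= (1 << 31) else value


def route(m: int, n: int, k: int, threshold: int) -> str:
    """Hypothetical crossover test: GPU only if the wrapped M*N*K exceeds the threshold."""
    metric = wrap_int32(m * n * k)  # assumption: the metric is M*N*K
    return "GPU" if metric > threshold else "CPU"


THRESHOLD = 40_000_000  # from the line zgemm=40000000 in cula_crossovers.lst

# The three problem sizes that appear in the test output.
for m, n, k in [(4000, 4000, 4000), (5000, 5000, 4000), (6000, 6000, 4000)]:
    exact = m * n * k
    wrapped = wrap_int32(exact)
    target = route(m, n, k, THRESHOLD)
    print(f"M={m} N={n} K={k}: exact={exact} wrapped={wrapped} -> {target}")
```

Under these assumptions the three sizes route to CPU, GPU, and CPU, which is the same pattern as the "under threshold" / "over threshold" messages in the log: the first and third products wrap to negative values, while the second wraps to a positive value that is still above 40000000. The agreement is consistent with the overflow explanation but does not show which quantity CULA actually computes internally.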
- john
- Administrator
- Posts: 568
- Joined: Thu Jul 23, 2009 2:31 pm
Re: something wrong with cula_crossover using zgemm
Thank you for your reply, I hope this bug can be fixed in the next version.
If I set CULA_GPU_ALWAYS=1, my program will be very slow, so I wrote another version of zgemm using cublas.
I hope the next version comes out as soon as possible
- wuquansheng
- Posts: 3
- Joined: Thu Mar 17, 2011 10:16 am
