Cula Device interface in FORTRAN- device arrays

Post by suzannepk » Tue Feb 07, 2012 4:28 pm


I am running CULA R13a on Yona, a cluster with modules that allow CULA's full functionality. Examples are included for the device interface in C, but not in Fortran, so I am trying to modify the given Fortran example to use the device interface.

My attempt is attached.

I am using pgi/10.11, cula/R13a, cuda/4.1. These are in modules on the cluster.

To be safe I have set the following environment variables:
setenv CULA_ROOT /sw/yona/cula/R13a/centos5.5_binary
setenv CULA_INC_PATH /sw/yona/cula/R13a/centos5.5_binary/include
setenv CULA_BIN_PATH_64 /sw/yona/cula/R13a/centos5.5_binary/bin64
setenv CULA_BIN_PATH_32 /sw/yona/cula/R13a/centos5.5_binary/bin
setenv CULA_LIB_PATH_64 /sw/yona/cula/R13a/centos5.5_binary/lib64
setenv CULA_LIB_PATH_32 /sw/yona/cula/R13a/centos5.5_binary/lib

I got the simple interface example to work fine with this configuration, but when I try to set up the device interface, the compiler does not recognize the device array declarations. I get messages such as:
real,device,dimension(:,:),allocatable :: A_dev
Error: Syntax error in data declaration at (1)
In file fortranInterface.f:40

real,device,dimension(:),allocatable :: TAU_dev
Error: Syntax error in data declaration at (1)
In file fortranInterface.f:42

I based my work on this example in PGI Insider, which says:
! allocate device memory
real, device, dimension(:,:), allocatable :: a_dev, b_dev
integer, device, dimension(:), allocatable :: ipiv_dev

allocate( a_dev(n,n), b_dev(n,nrhs), ipiv_dev(n) )

! copy input to device memory
a_dev = a

! call culapack solver (device memory)
status = cula_sgesv(n,nrhs,a_dev,n,ipiv_dev,b_dev,n)

! do more work here if desired

! copy output to host memory
b = b_dev

This forum has addressed this problem in another thread, where the user was given the same advice as in the PGI Insider example, but the post was never resolved.

Then I found this piece of advice in the CULA forum, in another thread addressing this problem:
john wrote: Given that you said you allocated in Fortran, but you're using the Device interface, I have to guess that the problem is that the Device interface requires GPU memory pointers. These must be allocated with cudaMalloc rather than Fortran's allocate.

So which is the correct way to handle the declaration of the device arrays? If it is with cudaMalloc rather than Fortran's allocate, is someone willing to show me an example of how this is done inside Fortran? Also is there anything special I need to add to the Makefile?
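For reference, a minimal sketch of the PGI Insider style of declaration, using only CUDA Fortran itself and no CULA calls. It assumes a CUDA Fortran compiler (for example pgfortran with -Mcuda) and a working GPU; the program and variable names are illustrative and not taken from the CULA examples. In CUDA Fortran, an allocate on an array carrying the device attribute allocates GPU memory, so this form and the cudaMalloc advice are not necessarily in conflict. Whether a given CULA release accepts such arrays through its Fortran device interface is a separate question that this sketch does not answer.

```fortran
! Sketch only: needs a CUDA Fortran compiler (e.g. pgfortran -Mcuda) and a GPU.
program device_array_roundtrip
  use cudafor
  implicit none
  integer, parameter :: n = 4
  real :: a(n,n), b(n,n)                   ! host arrays
  real, device, allocatable :: a_dev(:,:)  ! device (GPU) array
  integer :: istat

  call random_number(a)
  b = 0.0

  ! allocate on a device array reserves GPU memory
  allocate(a_dev(n,n), stat=istat)
  if (istat /= 0) stop 'device allocation failed'

  a_dev = a   ! host -> device copy
  b = a_dev   ! device -> host copy

  ! the round trip should leave the data unchanged
  print *, 'max round-trip difference:', maxval(abs(b - a))

  deallocate(a_dev)
end program device_array_roundtrip
```

If this compiles and the reported difference is zero, the compiler and GPU side are working, and any remaining problem is more likely in how the arrays are handed to the library.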

To the CULA developers: would you start including a device interface example for Fortran that is exactly analogous to the C device interface example which comes with CULA? Pretty please :D !

Thanks for any help that anyone can offer,

Post-doc ORNL NCCS
Posts: 9
Joined: Tue Feb 07, 2012 2:16 pm

Re: Cula Device interface in FORTRAN- device arrays

Post by suzannepk » Wed Feb 08, 2012 9:38 am

So I got a little further using this PGI Insider article.

My chief error was that the Makefile which comes with the CULA examples was using gfortran, not pgfortran. Yona (the cluster) has a gfortran module loaded by default, so my failure to notice the missing "p" was masked. Anyway, gfortran apparently cannot recognize CUDA Fortran extensions. True? Is pgfortran (meaning the PGI compiler version 9.2 and later) the only flavor of Fortran that will?

I also needed -Mcuda in my Makefile, and I changed my source file extensions from *.f to *.cuf, but this may be overkill since I am using the -Mcuda flag.
Here is the modified makefile:
# Makefile for fortranInterface example

include ../common/


LIBS=-lcula_core -lcula_lapack -lcula_lapack_fortran -lcublas -lcudart -Mcuda

@echo "To build this example, type one of:"
@echo ""
@echo " make build32"
@echo " make build64"
@echo ""
@echo "where '32' and '64' represent the platform you wish to build for"

sh ../
pgfortran -m32 -o fortranInterface $(MODULES) fortranInterface.cuf $(CFLAGS) $(INCLUDES) $(LIBPATH32) $(LIBS)

sh ../
pgfortran -m64 -o fortranInterface $(MODULES) fortranInterface.cuf $(CFLAGS) $(INCLUDES) $(LIBPATH64) $(LIBS)

rm -f fortranInterface
rm -f *.mod

So hurray the code compiles! However . . .

Initializing CULA
M= 2 N= 2 A= 0.000000 0.000000
0.000000 0.000000 TAU= 0.000000 0.000000
mpirun noticed that process rank 0 with PID 13088 on node yona16 exited on signal 11 (Segmentation fault).

Note I am still calling CULA_DEVICE_SGEQRF, I just did not change the print statement above to reflect this.

I'll post again if I figure this out in case my struggle will be of use to other CULA users.


Re: Cula Device interface in FORTRAN- device arrays

Post by suzannepk » Wed Feb 08, 2012 1:37 pm

So it seems that my attempt at converting the R13a Fortran interface example to the Fortran device interface will not run without a segmentation fault in the CULA call. However, I can get the example from the PGI Insider to work for both the host and device interfaces if I drop back to CULA R10. This is probably because that source code was written in 2010, assuming the versions of the functions in R10.

I am still looking for any advice toward achieving a working example of the Fortran device interface for CULA/R13a or R14.

Here was my last attempt:
!* CULA Example: Fortran Interface (now with device)
!* Note: this example performs the QR factorization on a matrix of zeros, the
!* result of which is a matrix of zeros, due to the omission of ones across the
!* diagonal of the upper-diagonal unitary Q.

        use CULA
        use cudafor



        INTRINSIC          MAX, MIN

        !PARAMETER ( M = 8192, N = 8192, K = 8192 )
        PARAMETER ( M = 2, N = 2, K = 2 )

        REAL A(M, N)
        REAL TAU(K)
! gpu memory
        !real,device,dimension(:,:),allocatable :: A_dev
        !real,device,dimension(:),allocatable :: TAU_dev
        real,device :: Adev(M,N)
        real,device :: TAU_dev(N)
!allocate Device memory
       ! Allocate(A_dev(M,N),TAU_dev(K))

        WRITE(*,*) 'Initializing CULA'
        write(*,*) "M=",M,"N=",N,"A=",A,"TAU=",TAU
        write(*,*) "M=",M,"N=",N,"A_dev=",A,"TAU_dev=",TAU
        WRITE(*,*) 'Calling CULA_SGEQRF'
        !STATUS = CULA_SGEQRF(M, N, A, M, TAU)
        !STATUS = CULA_SGEQRF(M, N, A_dev, M, TAU_dev)
        STATUS=CULA_DEVICE_SGEQRF(M, N, A_dev, M, TAU_dev)
        !STATUS==CULA_DEVICE_SGEQRFP(int* m, int* n, devptr_t* a, int* lda, devptr_t* tau)
        Write(*,*) "I got here"

        write(*,*) "M=",M,"N=",N,"A=",A,"TAU=",TAU
        WRITE(*,*) 'Shutting down CULA'

The makefile is the same as in my last post.
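One detail in the listing above may be worth checking, offered as a possibility and not as a confirmed diagnosis: the device array is declared as Adev, but the call passes A_dev. Without implicit none, Fortran silently treats an undeclared name as a new variable, so A_dev would be an ordinary host REAL scalar and not the device array. The sketch below uses plain Fortran (no CUDA or CULA required) and illustrative names to show that behavior.

```fortran
program implicit_typing_pitfall
  ! "implicit none" is left out on purpose to show the default behavior:
  ! any undeclared name starting with a-h or o-z becomes a REAL scalar.
  real :: adev(2,2)     ! the array that was actually declared

  adev  = 1.0
  a_dev = 5.0           ! different name: a new, implicitly typed scalar

  print *, 'size(adev) =', size(adev)   ! four elements
  print *, 'a_dev      =', a_dev        ! a single value, unrelated to adev
end program implicit_typing_pitfall
```

Adding implicit none at the top of a program unit turns this kind of name mismatch into a compile-time error, which is usually easier to track down than a crash inside a library call.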

So I will work in R10 and hope for some help in the meantime.


Re: Cula Device interface in FORTRAN- device arrays

Post by john » Mon Feb 13, 2012 3:31 pm

We are looking into this presently. We expect to respond in one or two days.
Posts: 587
Joined: Thu Jul 23, 2009 2:31 pm

Re: Cula Device interface in FORTRAN- device arrays

Post by john » Thu Mar 01, 2012 11:42 am

Dear Suzanne,
I wanted to inform you that we have performed a full refit of our Fortran capabilities, which will be released in R15. We will also be doing a few new writeups on our blog and possibly in the Portland Group newsletter in the coming days. I have sent along an update package to the Yona system administrator, which I am hoping he will distribute to all the users on that system. This update includes new module files and several new examples that I have tested against a number of compilers, including both gfortran and pgfortran.
