Magma-2.5.0-rc1

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)
cyberwillis
Posts: 4
Joined: Fri Nov 23, 2018 8:59 pm

Post by cyberwillis » Fri Nov 23, 2018 9:19 pm

Hi guys,

I have a newbie question: will support for compute capability 3.0 be dropped in the next MAGMA release?

I saw that magma-2.5.0-rc1 was released and tried it out, but I got the error shown below. A few weeks ago I built magma-2.4.0 successfully, but the same process now gives me this message:
calling a __global__ function("xxxxxxx") from a __device__ function("yyyyyyyy") is only allowed on the compute_35 architecture or above

Here are the build commands that worked with magma-2.4.0:

Code:

export CUDADIR=/usr/local/cuda
export OPENBLASDIR=/opt/openblas
export GPU_TARGET="sm_30"
export MAGMA_NO_V1=ON
make -j12 all

Here is where the error occurs. While investigating, I also noticed that this file was not present in the previous version:

Code:

gcc -O3 -fPIC -DNDEBUG -DADD_ -Wall -fopenmp -std=c99 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -fPIC                       -fopenmp -o control/sizeptr control/sizeptr.o
touch control/sizeptr
===== static library testing/libtest.a
ar cr testing/libtest.a testing/magma_util.o testing/magma_zutil.o testing/magma_zgesvd_check.o testing/magma_generate.o testing/magma_cutil.o testing/magma_dutil.o testing/magma_sutil.o testing/magma_sgesvd_check.o testing/magma_dgesvd_check.o testing/magma_cgesvd_check.o
===== static library testing/lin/liblapacktest.a
ranlib testing/libtest.a
ar cr testing/lin/liblapacktest.a testing/lin/zbdt01.o testing/lin/zget22.o testing/lin/zhet21.o testing/lin/zhst01.o testing/lin/zunt01.o testing/lin/zqpt01.o testing/lin/zqrt02.o testing/lin/zlarfy.o testing/lin/zstt21.o testing/lin/cbdt01.o testing/lin/cget22.o testing/lin/chet21.o testing/lin/chst01.o testing/lin/cunt01.o testing/lin/cqpt01.o testing/lin/cqrt02.o testing/lin/clarfy.o testing/lin/cstt21.o testing/lin/dbdt01.o testing/lin/dget22.o testing/lin/dsyt21.o testing/lin/dhst01.o testing/lin/dort01.o testing/lin/dqpt01.o testing/lin/dqrt02.o testing/lin/dlarfy.o testing/lin/dstt21.o testing/lin/sbdt01.o testing/lin/sget22.o testing/lin/ssyt21.o testing/lin/shst01.o testing/lin/sort01.o testing/lin/sqpt01.o testing/lin/sqrt02.o testing/lin/slarfy.o testing/lin/sstt21.o
nvcc -O3         -DNDEBUG -DADD_       -Xcompiler "-fPIC" -std=c++11  -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -I/usr/local/cuda/include -I./include -I./control -I./magmablas -I./sparse/include -I./sparse/control -I./sparse/include -dc -o sparse/blas/magma_dsampleselect_core.o sparse/blas/magma_dsampleselect_core.cu
nvcc -O3         -DNDEBUG -DADD_       -Xcompiler "-fPIC" -std=c++11  -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -I/usr/local/cuda/include -I./include -I./control -I./magmablas -I./sparse/include -I./sparse/control -I./sparse/include -dc -o sparse/blas/magma_ssampleselect_core.o sparse/blas/magma_ssampleselect_core.cu
g++ -O3 -fPIC -DNDEBUG -DADD_ -Wall -fopenmp -std=c++11 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I./include -I./sparse/include -I./sparse/control -I./testing -c -o sparse/testing/testing_zmcompressor.o sparse/testing/testing_zmcompressor.cpp
g++ -O3 -fPIC -DNDEBUG -DADD_ -Wall -fopenmp -std=c++11 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I./include -I./sparse/include -I./sparse/control -I./testing -c -o sparse/testing/testing_zmconverter.o sparse/testing/testing_zmconverter.cpp
g++ -O3 -fPIC -DNDEBUG -DADD_ -Wall -fopenmp -std=c++11 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I./include -I./sparse/include -I./sparse/control -I./testing -c -o sparse/testing/testing_zsort.o sparse/testing/testing_zsort.cpp
g++ -O3 -fPIC -DNDEBUG -DADD_ -Wall -fopenmp -std=c++11 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I./include -I./sparse/include -I./sparse/control -I./testing -c -o sparse/testing/testing_zmatrixinfo.o sparse/testing/testing_zmatrixinfo.cpp
g++ -O3 -fPIC -DNDEBUG -DADD_ -Wall -fopenmp -std=c++11 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I./include -I./sparse/include -I./sparse/control -I./testing -c -o sparse/testing/testing_zgetrowptr.o sparse/testing/testing_zgetrowptr.cpp
g++ -O3 -fPIC -DNDEBUG -DADD_ -Wall -fopenmp -std=c++11 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I./include -I./sparse/include -I./sparse/control -I./testing -c -o sparse/testing/testing_zdot.o sparse/testing/testing_zdot.cpp
g++ -O3 -fPIC -DNDEBUG -DADD_ -Wall -fopenmp -std=c++11 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I./include -I./sparse/include -I./sparse/control -I./testing -c -o sparse/testing/testing_zmdotc.o sparse/testing/testing_zmdotc.cpp
g++ -O3 -fPIC -DNDEBUG -DADD_ -Wall -fopenmp -std=c++11 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I./include -I./sparse/include -I./sparse/control -I./testing -c -o sparse/testing/testing_zspmv.o sparse/testing/testing_zspmv.cpp

g++ -O3 -fPIC -DNDEBUG -DADD_ -Wall -fopenmp -std=c++11 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I./include -I./sparse/include -I./sparse/control -I./testing -c -o sparse/testing/testing_zspmv_check.o sparse/testing/testing_zspmv_check.cpp
ranlib testing/lin/liblapacktest.a

g++ -O3 -fPIC -DNDEBUG -DADD_ -Wall -fopenmp -std=c++11 -DHAVE_CUBLAS -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I./include -I./sparse/include -I./sparse/control -I./testing -c -o sparse/testing/testing_zspmm.o sparse/testing/testing_zspmm.cpp
sparse/blas/magma_ssampleselect_core.cu(203): error: calling a __global__ function("select_bitonic_basecase") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_ssampleselect_core.cu(209): error: calling a __global__ function("build_searchtree") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_ssampleselect_core.cu(220): error: calling a __global__ function("count_buckets_write") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_ssampleselect_core.cu(221): error: calling a __global__ function("prefix_sum_counts") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_ssampleselect_core.cu(222): error: calling a __global__ function("sampleselect_findbucket") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_ssampleselect_core.cu(223): error: calling a __global__ function("collect_bucket_indirect") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_ssampleselect_core.cu(224): error: calling a __global__ function("sampleselect_tailcall") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above

7 errors detected in the compilation of "/tmp/tmpxft_00000275_00000000-6_magma_ssampleselect_core.cpp1.ii".
Makefile:619: recipe for target 'sparse/blas/magma_ssampleselect_core.o' failed
make: *** [sparse/blas/magma_ssampleselect_core.o] Error 1
make: *** Waiting for unfinished jobs....
sparse/blas/magma_dsampleselect_core.cu(203): error: calling a __global__ function("select_bitonic_basecase") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_dsampleselect_core.cu(209): error: calling a __global__ function("build_searchtree") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_dsampleselect_core.cu(220): error: calling a __global__ function("count_buckets_write") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_dsampleselect_core.cu(221): error: calling a __global__ function("prefix_sum_counts") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_dsampleselect_core.cu(222): error: calling a __global__ function("sampleselect_findbucket") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_dsampleselect_core.cu(223): error: calling a __global__ function("collect_bucket_indirect") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above
sparse/blas/magma_dsampleselect_core.cu(224): error: calling a __global__ function("sampleselect_tailcall") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above

7 errors detected in the compilation of "/tmp/tmpxft_00000272_00000000-6_magma_dsampleselect_core.cpp1.ii".
Makefile:619: recipe for target 'sparse/blas/magma_dsampleselect_core.o' failed
make: *** [sparse/blas/magma_dsampleselect_core.o] Error 1
I am just starting to learn the library.
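
The nvcc errors above all share one cause: the new sampleselect files launch `__global__` kernels from device code, which nvcc only allows for compute_35 and above, while the build targets sm_30. As a rough sketch, a pre-build check along these lines could flag the mismatch before compiling. The function name `supports_device_launch` is hypothetical and not part of MAGMA, and it assumes targets are written in the `sm_NN` form used in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of MAGMA): succeeds when an sm_NN target
# is at or above compute_35, the minimum named in the nvcc error for
# calling a __global__ function from a __device__ function.
supports_device_launch() {
    case "$1" in
        sm_[0-9][0-9]) ;;                       # accept only two-digit sm_NN targets
        *) echo "unrecognized target: $1" >&2; return 2 ;;
    esac
    arch=${1#sm_}                               # strip the "sm_" prefix, e.g. sm_30 -> 30
    [ "$arch" -ge 35 ]
}

for target in sm_30 sm_35 sm_75; do
    if supports_device_launch "$target"; then
        echo "$target: device-side kernel launch allowed"
    else
        echo "$target: device-side kernel launch not allowed"
    fi
done
```

With the `GPU_TARGET="sm_30"` setting from the build commands above, the check reports that device-side launches are not allowed, which matches the compiler errors.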

mgates3
Posts: 842
Joined: Fri Jan 06, 2012 2:13 pm

Re: Magma-2.5.0-rc1

Post by mgates3 » Thu Nov 29, 2018 1:15 am

We're looking into it. I don't think removing 3.0 capability was the intention.
-mark

o438284
Posts: 1
Joined: Thu Nov 29, 2018 4:35 pm

Re: Magma-2.5.0-rc1

Post by o438284 » Thu Nov 29, 2018 4:39 pm

Hi,
could you also add 7.5 capability for Turing GPUs, please?

mgates3
Posts: 842
Joined: Fri Jan 06, 2012 2:13 pm

Re: Magma-2.5.0-rc1

Post by mgates3 » Thu Nov 29, 2018 4:58 pm

Done. It's just adding it as a flag in the Makefile.

MAGMA should always run on any CUDA arch >= what it was compiled for. So if you compile MAGMA for sm_30, it should also run on sm_35 (Kepler K40), sm_50 (Maxwell), ..., sm_70 (Volta), sm_75 (Turing), etc. Compiling for a specific architecture might, however, make it more efficient there.

-mark
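
The forward compatibility described above shows up in the `-gencode` pairs of the nvcc lines earlier in the thread: `code=sm_30` embeds machine code for that architecture, and `code=compute_30` embeds PTX that the driver can JIT-compile on newer GPUs. Below is a minimal sketch of how such flags could be assembled from a list of targets. The function name `gencode_flags` is made up for illustration, it assumes targets are given in ascending order, and it is not MAGMA's actual Makefile logic.

```shell
#!/bin/sh
# Hypothetical helper (not MAGMA's Makefile logic): print nvcc -gencode
# flags for a list of sm_NN targets, assumed to be in ascending order.
gencode_flags() {
    flags=""
    last=""
    for target in "$@"; do
        arch=${target#sm_}                      # sm_30 -> 30
        # Machine code (SASS) for this specific architecture.
        flags="$flags -gencode arch=compute_$arch,code=sm_$arch"
        last=$arch
    done
    # PTX for the last listed target, so newer GPUs can JIT-compile it.
    if [ -n "$last" ]; then
        flags="$flags -gencode arch=compute_$last,code=compute_$last"
    fi
    echo "${flags# }"                           # drop the leading space
}

gencode_flags sm_30
gencode_flags sm_30 sm_75
```

For `sm_30` alone this prints the same two `-gencode` options seen in the build log. Listing `sm_75` as well adds native Turing code, which is where the efficiency gain from compiling for a specific architecture would come from.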

cyberwillis
Posts: 4
Joined: Fri Nov 23, 2018 8:59 pm

Re: Magma-2.5.0-rc1

Post by cyberwillis » Wed Jan 09, 2019 3:55 pm

mgates3 wrote:
Thu Nov 29, 2018 1:15 am
We're looking into it. I don't think removing 3.0 capability was the intention.
-mark

The problem still exists in the 2.5.0 release:

Code:

nvcc -O3         -DNDEBUG -DADD_       -Xcompiler "-fPIC" -std=c++11  -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -I/usr/local/cuda/include -I./include -I./control -I./magmablas -I./sparse/include -I./sparse/control -I./sparse/include -dc -o sparse/blas/magma_dsampleselect_core.o sparse/blas/magma_dsampleselect_core.cu
sparse/blas/magma_dsampleselect_core.cu(203): error: calling a __global__ function("select_bitonic_basecase") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above

sparse/blas/magma_dsampleselect_core.cu(209): error: calling a __global__ function("build_searchtree") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above

sparse/blas/magma_dsampleselect_core.cu(220): error: calling a __global__ function("count_buckets_write") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above

sparse/blas/magma_dsampleselect_core.cu(221): error: calling a __global__ function("prefix_sum_counts") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above

sparse/blas/magma_dsampleselect_core.cu(222): error: calling a __global__ function("sampleselect_findbucket") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above

sparse/blas/magma_dsampleselect_core.cu(223): error: calling a __global__ function("collect_bucket_indirect") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above

sparse/blas/magma_dsampleselect_core.cu(224): error: calling a __global__ function("sampleselect_tailcall") from a __device__ function("launch_sampleselect") is only allowed on the compute_35 architecture or above

7 errors detected in the compilation of "/tmp/tmpxft_00004ed5_00000000-6_magma_dsampleselect_core.cpp1.ii".
Makefile:633: recipe for target 'sparse/blas/magma_dsampleselect_core.o' failed
make: *** [sparse/blas/magma_dsampleselect_core.o] Error 1

