Memory leak in magma_sgetri_outofplace_batched

Open discussion for MAGMA library (Matrix Algebra on GPU and Multicore Architectures)

Post by linuxfreak » Thu Nov 03, 2016 2:36 pm

Dear magma developers,

I found a memory leak in magma_sgetri_outofplace_batched in MAGMA 2.1.0.
To make the leak check work, I modified testing_sgetri_batched to call cudaDeviceReset(); right after TESTING_CHECK( magma_finalize() );

Then I got the following:
$ cuda-memcheck --leak-check full ./testing_sgetri_batched -N 32
========= CUDA-MEMCHECK
% MAGMA 2.1.0 compiled for CUDA capability >= 2.0, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7050. OpenMP threads 4.
% device 0: GeForce GTX 680, 1137.0 MHz clock, 4092.0 MB memory, capability 3.0
% Fri Nov 4 02:44:30 2016
% Usage: ./testing_sgetri_batched [options] [-h|--help]

% batchCount N CPU Gflop/s (ms) GPU Gflop/s (ms) ||I - A*A^{-1}||_1 / (N*cond(A))
%===============================================================================
300 32 --- ( --- ) 0.25 ( 102.33)
========= Leaked 2400 bytes at 0x502c03400
========= Saved host backtrace up to driver entry point at cudaMalloc time
========= Host Frame:/usr/lib64/libcuda.so.1 (cuMemAlloc_v2 + 0x17f) [0x13dc4f]
========= Host Frame:/usr/local/cuda/lib64/libcudart.so.7.0 [0x2b423]
========= Host Frame:/usr/local/cuda/lib64/libcudart.so.7.0 [0xe78b]
========= Host Frame:/usr/local/cuda/lib64/libcudart.so.7.0 (cudaMalloc + 0x16f) [0x3b71f]
========= Host Frame:./testing_sgetri_batched (magma_malloc + 0x15) [0x107f5]
========= Host Frame:./testing_sgetri_batched (magma_sgetri_outofplace_batched + 0x133) [0x14363]
========= Host Frame:./testing_sgetri_batched (main + 0x64a) [0xa87a]
========= Host Frame:/lib64/libc.so.6 (__libc_start_main + 0xed) [0x2135d]
========= Host Frame:./testing_sgetri_batched [0xcf69]
=========
========= LEAK SUMMARY: 2400 bytes leaked in 1 allocations
========= ERROR SUMMARY: 0 errors
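
As a quick sanity check on the report above: the leaked size matches one 64-bit pointer per matrix in the batch, since the tester ran with batchCount 300 and 300 x 8 = 2400 bytes. The sketch below only reproduces that arithmetic. That the leaked buffer is a per-batch pointer array is an assumption inferred from the numbers, not something cuda-memcheck states, and pointer_array_bytes is a hypothetical helper name.

```cpp
#include <cstddef>

// Size in bytes of a workspace holding one pointer per matrix in the batch.
// Assumption: the leaked buffer has this layout (inferred, not confirmed).
constexpr std::size_t pointer_array_bytes(std::size_t batch_count,
                                          std::size_t pointer_size) {
    return batch_count * pointer_size;
}

// batchCount = 300 (tester output), pointer size = 8 ("64-bit pointer" banner).
static_assert(pointer_array_bytes(300, 8) == 2400,
              "matches the 2400 bytes reported by cuda-memcheck");
```

If the inference holds, the leak per call scales with the batch size, which would explain steady growth when the routine is called once per frame.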

In my own software, where I am doing batched matrix inversion with MAGMA, I get the following:
Processing frame 1000 GPU mem: 92.78 MB of 4092.03 MB Time for 1000 frames: 25.19 s
Processing frame 2000 GPU mem: 97.78 MB of 4092.03 MB Time for 1000 frames: 25.12 s
Processing frame 3000 GPU mem: 101.78 MB of 4092.03 MB Time for 1000 frames: 25.98 s
Processing frame 4000 GPU mem: 106.78 MB of 4092.03 MB Time for 1000 frames: 26.30 s
Processing frame 5000 GPU mem: 111.78 MB of 4092.03 MB Time for 1000 frames: 26.44 s
Processing frame 6000 GPU mem: 116.78 MB of 4092.03 MB Time for 1000 frames: 26.53 s
Processing frame 7000 GPU mem: 120.78 MB of 4092.03 MB Time for 1000 frames: 26.82 s
Processing frame 8000 GPU mem: 125.78 MB of 4092.03 MB Time for 1000 frames: 27.16 s
Processing frame 9000 GPU mem: 131.78 MB of 4092.03 MB Time for 1000 frames: 27.39 s
Processing frame 10000 GPU mem: 136.78 MB of 4092.03 MB Time for 1000 frames: 27.75 s
Processing frame 11000 GPU mem: 141.78 MB of 4092.03 MB Time for 1000 frames: 27.82 s
Processing frame 12000 GPU mem: 146.78 MB of 4092.03 MB Time for 1000 frames: 28.01 s
Processing frame 13000 GPU mem: 151.78 MB of 4092.03 MB Time for 1000 frames: 28.41 s
Processing frame 14000 GPU mem: 155.78 MB of 4092.03 MB Time for 1000 frames: 28.84 s
Processing frame 15000 GPU mem: 160.78 MB of 4092.03 MB Time for 1000 frames: 29.29 s
Processing frame 16000 GPU mem: 165.78 MB of 4092.03 MB Time for 1000 frames: 29.69 s
Processing frame 17000 GPU mem: 170.79 MB of 4092.03 MB Time for 1000 frames: 29.89 s
Processing frame 18000 GPU mem: 175.79 MB of 4092.03 MB Time for 1000 frames: 30.10 s
Processing frame 19000 GPU mem: 180.79 MB of 4092.03 MB Time for 1000 frames: 30.50 s
Processing frame 20000 GPU mem: 185.79 MB of 4092.03 MB Time for 1000 frames: 30.88 s

So not only does GPU memory leak, but the frame processing time also keeps increasing. When I comment out the sgetri call, I get the following:
Processing frame 1000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.23 s
Processing frame 2000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 22.70 s
Processing frame 3000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 23.30 s
Processing frame 4000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 23.32 s
Processing frame 5000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 23.35 s
Processing frame 6000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 23.39 s
Processing frame 7000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 23.38 s
Processing frame 8000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 23.38 s
Processing frame 9000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.27 s
Processing frame 10000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.25 s
Processing frame 11000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.25 s
Processing frame 12000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.25 s
Processing frame 13000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.25 s
Processing frame 14000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.25 s
Processing frame 15000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.25 s
Processing frame 16000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.26 s
Processing frame 17000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.34 s
Processing frame 18000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.40 s
Processing frame 19000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.40 s
Processing frame 20000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 23.40 s
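
For a rough sense of the leak rate in the first log above: usage grows from 92.78 MB at frame 1000 to 185.79 MB at frame 20000, which is about 0.0049 MB (roughly 5 KB) per frame. The helper below is a hypothetical illustration of that calculation, not part of the poster's software.

```cpp
// Average growth in MB per frame between two log samples.
// Inputs are taken from the "GPU mem" column of the log.
constexpr double mb_per_frame(double mb_start, double mb_end,
                              long frame_start, long frame_end) {
    return (mb_end - mb_start) / static_cast<double>(frame_end - frame_start);
}
```

Calling mb_per_frame(92.78, 185.79, 1000, 20000) gives the figure quoted above; the same call on the second log (88.78 MB at both ends) gives zero growth.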

Re: Memory leak in magma_sgetri_outofplace_batched

Post by linuxfreak » Thu Nov 03, 2016 4:05 pm

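Well, the memory leak is in this file:
magma-2.1.0/src/sgetri_outofplace_batched.cpp
I added two lines (the diff below compares the patched file against the original, so the added lines are marked with "<"):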
140d139
< magma_free(dW0_displ);
210d208
< magma_free(dW0_displ);
And it appears to work:
Processing frame 1000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 24.99 s
Processing frame 2000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 24.39 s
Processing frame 3000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 25.03 s
Processing frame 4000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 25.06 s
Processing frame 5000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 25.07 s
Processing frame 6000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 25.09 s
Processing frame 7000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 25.08 s
Processing frame 8000 GPU mem: 87.78 MB of 4092.03 MB Time for 1000 frames: 25.08 s
Processing frame 9000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 24.98 s
Processing frame 10000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 24.96 s
Processing frame 11000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 24.95 s
Processing frame 12000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 24.95 s
Processing frame 13000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 24.95 s
Processing frame 14000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 24.95 s
Processing frame 15000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 24.95 s
Processing frame 16000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 24.95 s
Processing frame 17000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 25.04 s
Processing frame 18000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 25.09 s
Processing frame 19000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 25.10 s
Processing frame 20000 GPU mem: 88.78 MB of 4092.03 MB Time for 1000 frames: 25.09 s
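
The patch adds the same magma_free(dW0_displ) call in two places, which suggests the workspace was allocated once but not released on either of two exit paths. The following is a minimal sketch of that pattern, not the MAGMA source. stub_malloc and stub_free are hypothetical host-side stand-ins for magma_malloc and magma_free so it runs without a GPU, and batched_routine, other_ws and the two exit paths are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdlib>

// Count of allocations that have not been freed yet.
int g_live_allocations = 0;

// Hypothetical stand-in for magma_malloc: host allocation plus bookkeeping.
int stub_malloc(void** ptr, std::size_t bytes) {
    *ptr = std::malloc(bytes);
    if (*ptr == nullptr) return -1;
    ++g_live_allocations;
    return 0;
}

// Hypothetical stand-in for magma_free.
void stub_free(void* ptr) {
    if (ptr != nullptr) {
        std::free(ptr);
        --g_live_allocations;
    }
}

// Illustrative routine with two per-batch pointer workspaces and two exits.
// With free_all == false, dW0_displ is never released on either path,
// mirroring the reported leak; free_all == true mirrors the patch.
int batched_routine(int batch_count, bool fail_early, bool free_all) {
    void* other_ws  = nullptr;   // hypothetical second workspace
    void* dW0_displ = nullptr;   // name taken from the patch above
    stub_malloc(&other_ws,  batch_count * sizeof(float*));
    stub_malloc(&dW0_displ, batch_count * sizeof(float*));

    if (fail_early) {            // early exit path
        stub_free(other_ws);
        if (free_all) stub_free(dW0_displ);
        return -1;
    }

    // ... batched work would go here ...

    stub_free(other_ws);         // normal exit path
    if (free_all) stub_free(dW0_displ);
    return 0;
}

int live_allocations() { return g_live_allocations; }
```

With free_all set to false, every call leaves one allocation outstanding regardless of which path it takes; with free_all set to true, the count returns to where it started.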

Re: Memory leak in magma_sgetri_outofplace_batched

Post by mgates3 » Thu Nov 10, 2016 12:03 pm

Yes, thank you. We've included your fix (in all 4 precisions).
-mark

