testing examples: why is ldda a multiple of 32?

Post by ffox80 » Thu Jul 21, 2011 2:00 pm

I noticed that for some routines like zgemm and zgesv the LDA dimensions are set to be multiples of 32. Is this preferred for some reason?

Re: testing examples: why is ldda a multiple of 32?

Post by Stan Tomov » Mon Jul 25, 2011 12:02 pm

cudaMalloc aligns the start of the allocated floating-point data so that accesses at the beginning of the data are fully coalesced. On cards before Fermi, the starting address had to be aligned to 16*sizeof(type). The matrix is stored column by column, so column j starts j*lda elements after the first one; for the alignment to hold for every column after the first, lda has to be divisible by 16. We make this requirement a little stronger (divisibility by 32) in anticipation of hardware changes that may require it.
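
To illustrate the padding, here is a minimal sketch in plain C. The helper names round_up, column_offset and columns_aligned are hypothetical and made up for this example; they are not MAGMA functions. The sketch only reasons about element offsets relative to an aligned base pointer and makes no CUDA calls.

```c
#include <stddef.h>

/* Hypothetical helper (not a MAGMA API): round n up to the next multiple
   of `multiple`. Assumes n >= 0 and multiple > 0. */
static int round_up(int n, int multiple)
{
    return ((n + multiple - 1) / multiple) * multiple;
}

/* Element offset of the first entry of column j in a column-major matrix
   with leading dimension ld. */
static size_t column_offset(int j, int ld)
{
    return (size_t)j * (size_t)ld;
}

/* Returns 1 if each of the first n columns starts at an element offset
   that is a multiple of `align` elements, and 0 otherwise. The base
   pointer itself is assumed to be aligned already. */
static int columns_aligned(int n, int ld, int align)
{
    for (int j = 0; j < n; ++j) {
        if (column_offset(j, ld) % (size_t)align != 0)
            return 0;
    }
    return 1;
}
```

For a matrix with 1000 rows, round_up(1000, 32) gives 1024. With a leading dimension of 1000, column 1 starts at element 1000, which is not a multiple of 16, so columns_aligned(4, 1000, 16) returns 0; with the padded leading dimension 1024 it returns 1. The padding costs 24 unused elements per column in this case.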
Stan