I want to measure the volume of data transferred between the SMs and global memory, so that I can compare different algorithms and see which one moves less data. I think the "CUPTI Metric API" is the way to get this. I found it in the CUPTI_Users_Guide, but I can't grasp the main idea of the metrics in Table 11 on page 30. They are listed below:
(1) gld_efficiency: Ratio of requested global memory load transactions to actual global memory load transactions. For CC 1.2 & 1.3: gld_request / ((gld_32 + gld_64 + gld_128) / (2 * #SM))
(2) gst_efficiency: Ratio of requested global memory store transactions to actual global memory store transactions. For CC 1.2 & 1.3: gst_request / ((gst_32 + gst_64 + gst_128) / (2 * #SM))
(3) gld_requested_throughput: Requested global memory load throughput. (gld_32 * 32 + gld_64 * 64 + gld_128 * 128) / gputime
(4) gst_requested_throughput: Requested global memory store throughput. (gst_32 * 32 + gst_64 * 64 + gst_128 * 128) / gputime
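To make the arithmetic in the four formulas concrete, here is a minimal Python sketch that evaluates them from raw counter values. It only transcribes the formulas as quoted above and is not CUPTI code. The function names and all counter values are hypothetical placeholders, `num_sm = 30` assumes a GTX 285, and the unit of `gputime` is left as whatever the profiler reports.

```python
def efficiency(requests, t32, t64, t128, num_sm):
    """Formulas (1)/(2): requests / ((t32 + t64 + t128) / (2 * num_sm))."""
    transactions = t32 + t64 + t128
    return requests / (transactions / (2 * num_sm))


def requested_throughput(t32, t64, t128, gputime):
    """Formulas (3)/(4): weighted transaction counts divided by gputime.

    Assumes t32, t64 and t128 count 32-, 64- and 128-byte transactions,
    as the weights in the quoted formulas suggest.
    """
    return (t32 * 32 + t64 * 64 + t128 * 128) / gputime


# Hypothetical values for illustration only, not measurements.
num_sm = 30          # assumption: GTX 285
gld_request = 10     # made-up load request count
gld_32, gld_64, gld_128 = 0, 1200, 0
gputime = 100.0      # made-up kernel time, unit as reported by the profiler

print("gld_efficiency:", efficiency(gld_request, gld_32, gld_64, gld_128, num_sm))
print("gld_requested_throughput:", requested_throughput(gld_32, gld_64, gld_128, gputime))
```

The store metrics (2) and (4) use the same two functions with the gst_* counters in place of the gld_* ones.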
My GPU is a GTX 285.
Could someone explain what these four metrics mean?
My other question: are these four metrics enough to give me the data volume transferred between the SMs and global memory that I am looking for?
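For reference, a volume (rather than a rate) could perhaps be taken from the numerators of formulas (3) and (4), i.e. the weighted transaction counts without the division by gputime. The sketch below shows that idea in Python under this assumption. The helper names and the counter values for the two algorithms are hypothetical, and whether the raw counters cover every SM on the device is not something the quoted formulas settle, so the result is best read as a relative comparison on the same GPU.

```python
def transferred_bytes(t32, t64, t128):
    """Weighted sum from formulas (3)/(4): 32-, 64- and 128-byte transactions."""
    return t32 * 32 + t64 * 64 + t128 * 128


def total_volume(loads, stores):
    """Load plus store volume; each argument is a (t32, t64, t128) tuple."""
    return transferred_bytes(*loads) + transferred_bytes(*stores)


# Hypothetical counter values for two algorithms, for illustration only.
algo_a = total_volume(loads=(100, 50, 10), stores=(0, 20, 0))
algo_b = total_volume(loads=(0, 0, 40), stores=(0, 0, 10))

print("algorithm A:", algo_a, "bytes")
print("algorithm B:", algo_b, "bytes")
print("smaller volume:", "A" if algo_a < algo_b else "B")
```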
Thanks a lot!