Changes
→Assignment 3
First, the while loop contained six memory copy calls; we found that these can be reduced to a single call by switching device address pointers instead of copying the data.
Furthermore, we found that when the sample number n is less than 1024, we can use shared memory in the kernel.
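The pointer switching described above can be illustrated with a minimal sketch. It is not the assignment's actual code: it is host-only C++ so that it runs without a GPU, and the names `step` and `run` as well as the neighbour-sum update are hypothetical. The two host buffers stand in for two device allocations; in a CUDA program (an assumption about the environment) `cur` and `next` would be device addresses obtained from `cudaMalloc`, `step` would be a kernel launch, and the final copy would be one `cudaMemcpy` back to the host.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical per-iteration update, standing in for a kernel launch.
// It reads only from `in` and writes only to `out`, so it needs two buffers.
void step(const float* in, float* out, int n) {
    for (int i = 0; i < n; ++i) {
        out[i] = in[i] + in[(i + 1) % n];  // sum with the next neighbour (wraps around)
    }
}

// Runs `iters` updates and returns the final values.
std::vector<float> run(std::vector<float> data, int iters) {
    assert(iters >= 0);
    const int n = static_cast<int>(data.size());
    std::vector<float> scratch(data.size());

    float* cur = data.data();      // buffer holding the current values
    float* next = scratch.data();  // buffer receiving the updated values

    for (int it = 0; it < iters; ++it) {
        step(cur, next, n);
        std::swap(cur, next);  // no data is copied: only the two addresses change roles
    }

    // The only copy happens once, after the loop, from whichever buffer is current.
    return std::vector<float>(cur, cur + n);
}
```

Because the swap exchanges only two addresses, the cost per iteration no longer depends on n. The result can end up in either buffer depending on whether the iteration count is odd or even, which is why the final copy reads from `cur` rather than from a fixed buffer.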
==== Reduce memory copy function calls ====