=== REMOVING THE UNNECESSARY ARRAY ===
As we discovered above (see section 3.3.2), the second array is unnecessary because all of the calculations are performed in shared memory. This lets us optimize the kernel further by reducing the time spent transferring data across the PCI bus. The image below shows the data transfer times for the CALCULATE kernel.
[[File:MEmCpy10000.png]]
Since only one of the two original arrays is needed in the final kernel solution, we transfer half as much data and can save about 50% of our transfer time across the PCI bus.
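The host-side effect of dropping the second array can be sketched as follows. This is a minimal illustration only, not the project's actual code: the kernel name <code>calculate</code>, the array name <code>u</code>, the size <code>n = 10000</code>, the block size <code>ntpb</code>, and the placeholder arithmetic are all assumptions. The point it shows is that a single device allocation and a single <code>cudaMemcpy</code> in each direction replace the pair that two arrays would require.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

const int ntpb = 256; // threads per block (assumed value)

// Hypothetical stand-in for the CALCULATE kernel: each block stages its
// slice of the single array in shared memory, updates it there, and writes
// the result back in place, so no second global array is needed.
__global__ void calculate(float* u, int n) {
    __shared__ float s[ntpb];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) s[threadIdx.x] = u[i];
    __syncthreads();
    if (i < n) {
        // placeholder update; the real kernel's arithmetic would go here
        s[threadIdx.x] = 2.0f * s[threadIdx.x];
        u[i] = s[threadIdx.x];
    }
}

int main() {
    const int n = 10000; // assumed problem size
    const size_t bytes = n * sizeof(float);

    float* h_u = (float*)malloc(bytes);
    if (h_u == nullptr) return 1;
    for (int i = 0; i < n; i++) h_u[i] = 1.0f;

    // One device array instead of two: one allocation, one copy each way.
    float* d_u = nullptr;
    if (cudaMalloc((void**)&d_u, bytes) != cudaSuccess) {
        free(h_u);
        return 1;
    }
    cudaMemcpy(d_u, h_u, bytes, cudaMemcpyHostToDevice);

    calculate<<<(n + ntpb - 1) / ntpb, ntpb>>>(d_u, n);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(h_u, d_u, bytes, cudaMemcpyDeviceToHost);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
        printf("CUDA error: %s\n", cudaGetErrorString(err));
    else
        printf("copied %zu bytes each way for one array\n", bytes);

    cudaFree(d_u);
    free(h_u);
    return 0;
}
```

With two arrays of the same size, every <code>cudaMalloc</code> and <code>cudaMemcpy</code> call above would be doubled, which is where the halving of transfer time comes from.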