The parallelizing Express

394 bytes added, 19:01, 13 April 2017
=== What was done ===
First, the power function used in the kernel was replaced with __pow, since the traditional pow function is more expensive. Afterwards, the kernel was upgraded to use a grid-stride loop. We also started on changes that would transfer all the data needed for the calculations to the device at once and perform every calculation on the device side, but due to time constraints and the complexity of the project we were unable to fully implement them. That code is, however, left (commented out) in the included project download.
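As a rough illustration of the two kernel changes described above, here is a minimal sketch of a grid-stride loop that calls a fast power intrinsic. It is not the project's actual kernel: the kernel name `powKernel`, its parameters, the launch configuration, and the input data are all hypothetical, and it assumes single-precision data so that CUDA's `__powf` intrinsic (faster but less accurate than `powf`) can be used.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not the project's): raises every element to a power.
// Each thread starts at its global index and advances by the total number of
// threads in the grid, so any grid size can cover any n.
__global__ void powKernel(const float* in, float* out, int n, float exponent) {
    int stride = blockDim.x * gridDim.x;  // total threads in the grid
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        // __powf is the fast single-precision intrinsic (assumed here);
        // it trades accuracy for speed compared with powf.
        out[i] = __powf(in[i], exponent);
    }
}

int main() {
    const int n = 1 << 20;  // illustrative problem size
    float* in = nullptr;
    float* out = nullptr;

    // Managed memory keeps the sketch short; explicit cudaMemcpy also works.
    if (cudaMallocManaged(&in, n * sizeof(float)) != cudaSuccess ||
        cudaMallocManaged(&out, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }
    for (int i = 0; i < n; ++i) in[i] = 1.0f + static_cast<float>(i % 7);

    // Deliberately fewer threads than elements: the stride loop covers the rest.
    powKernel<<<64, 256>>>(in, out, n, 2.0f);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("in[3] = %f, out[3] = %f\n", in[3], out[3]);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Because the loop advances by the size of the whole grid, the launch configuration can be tuned to the device independently of the problem size, and consecutive threads still touch consecutive elements on each pass.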
=== Optimized Kernel ===