=== Assignment 3 ===
==== Optimize ====
From a theoretical perspective, the algorithm should benefit from shared memory access, provided that operations are actually being performed on the data. To demonstrate this, I modified the kernel to perform a division on each node of the tree, storing the result in its two leaf indices. To do this, I needed to be able to access the root node of a given leaf in parallel. The determining factor had to be the thread index. Achieving this was a bit of a math problem.

[[File:njsimasroot.png|500px]]

The first step was using modulus to set the odd value in the leaf pair equal to the greater value:

 modIndex = ((ti + q) + (ti + 1) % 2)

I needed the number of nodes in the previous round, plus one, to later subtract from the current array index:

 leafTotal = ((t / 2) + 1);

Basically, this amounts to 'walking backwards'.
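
The modulus step can be illustrated in isolation with a minimal host-side sketch. It is not the kernel code itself: it assumes leaves are stored as adjacent (even, odd) pairs starting at index 0 and leaves out any offset term, and the helper names <code>pairUpperIndex</code> and <code>pairNumber</code> are hypothetical. It only shows how <code>(ti + 1) % 2</code> lets both threads of a pair arrive at the same value.

```cpp
// Sketch only: assumes leaves sit in adjacent (even, odd) pairs from index 0.
// Both function names are hypothetical and do not come from the kernel.

// Map a thread index to the odd (greater) index of its leaf pair.
// For even ti, (ti + 1) % 2 is 1, so the result is ti + 1.
// For odd ti, (ti + 1) % 2 is 0, so the result is ti itself.
constexpr int pairUpperIndex(int ti) {
    return ti + (ti + 1) % 2;
}

// Both leaves of a pair yield the same pair number, which could then be
// used to locate the node they share (the layout is an assumption).
constexpr int pairNumber(int ti) {
    return pairUpperIndex(ti) / 2;
}

// Compile-time checks: threads 2 and 3 form one pair with upper index 3.
static_assert(pairUpperIndex(2) == 3, "even index rounds up to the odd one");
static_assert(pairUpperIndex(3) == 3, "odd index is left unchanged");
static_assert(pairNumber(2) == pairNumber(3), "both leaves share a pair");
```

Because the expression contains no branch, every thread evaluates the same instructions regardless of whether its index is even or odd, which is presumably why a modulus was preferred over an <code>if</code> here.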