68
edits
Changes
→Assignment 2
=== Parallelized ===
The image below demonstrates the way Bitonic method computes the sorted values. The algorithm works as following: you take 2 sequences and sort them,
Reverse one & append it to the other to create a bitonic sequence, then you split the seequence into larger & smaller half and sort each half (and repeat the process).
Image below shows that each time 2 values are on either side of a shared vertical line (red), they are swapped if they are in the wrong order. In theory, these swap operations are '''independent''' and are great for parallel exposure. 2 outer loops cannot be parallelized since the most-inner loop depends on previously calculated (swapped) values. However, we can parallelize the 3rd loop that goes from 0 to N where N is the total number of elements in the input array.
<pre>
for (k = 2; k <= N; k = 2 * k)
{
// printf("k = %d \n", k);
for (j = k >> 1; j > 0; j = j >> 1)
{
// printf(" - j = %d \n", j);
for (i = 0; i<N; i++) {}
}
}
</pre>
[[File:Bitonic1.png|800px]]
==== Kernel ====