Changes

Happy Valley

331 bytes added, 08:33, 9 April 2018

→‎Parallelized

<pre>
for (k = 2; k <= N; k = 2 * k) // Cannot be parallel!
{
    for (j = k >> 1; j > 0; j = j >> 1) // Cannot be parallel!
    {
        for (i = 0; i < N; i++) {} // Can be parallel!
    }
}
</pre>

==== Kernel ====

We can take the code executed in the innermost loop and put it into a CUDA kernel. Because the two outer loops cannot be parallelized, the kernel is launched once for each ('k', 'j') step, with 'n' threads per launch, where 'n' is the total number of elements to be sorted. We pass the data allocated in device memory as well as the 'j' and 'k' indices, which indicate the current position in the sorting network.
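As an illustration of the paragraph above, here is a minimal sketch of what such a kernel and its host-side launch loop might look like. It is not the page's own source code: the names <code>bitonicStep</code> and <code>bitonicSortOnDevice</code>, the <code>int</code> element type, the block size of 256 and the requirement that 'n' is a power of two are all assumptions made for this example.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel name and signature (assumed for this sketch).
// Each thread handles one index i of the innermost loop for a fixed (j, k) step.
__global__ void bitonicStep(int *data, int j, int k, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    int partner = i ^ j;                 // element this thread compares against
    if (partner > i) {                   // only the lower index of each pair does the work
        bool ascending = (i & k) == 0;   // sort direction of this bitonic subsequence
        int a = data[i];
        int b = data[partner];
        bool outOfOrder = ascending ? (a > b) : (a < b);
        if (outOfOrder) {
            data[i] = b;
            data[partner] = a;
        }
    }
}

// Hypothetical host helper: sorts n ints in ascending order.
// Assumption: n is a power of two, as the sorting network requires.
cudaError_t bitonicSortOnDevice(int *host, int n)
{
    int *dev = nullptr;
    size_t bytes = (size_t)n * sizeof(int);

    cudaError_t err = cudaMalloc(&dev, bytes);
    if (err != cudaSuccess) return err;
    err = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);

    int threads = 256;                           // assumed block size
    int blocks = (n + threads - 1) / threads;

    // The two outer loops stay sequential on the host; launches issued to the
    // same stream run in order, so each step sees the previous step's result.
    for (int k = 2; err == cudaSuccess && k <= n; k <<= 1) {
        for (int j = k >> 1; j > 0; j >>= 1) {
            bitonicStep<<<blocks, threads>>>(dev, j, k, n);
        }
    }

    if (err == cudaSuccess) err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return err;
}
```

In this sketch the thread index replaces the loop variable 'i', and the guard <code>partner > i</code> ensures each pair is compared by exactly one thread, so no two threads write the same elements within a step. A caller would pass a host array whose length is a power of two, for example <code>bitonicSortOnDevice(values, 8)</code>, and check the returned <code>cudaError_t</code>.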

'''Source Code'''

Obelavina

68

edits
