[[File:Pointer-alias.png|border]]
=== Magnitude of a Vector ===
For a more familiar example of a loop-carried dependency that blocks auto-vectorization of a loop, consider a code snippet that calculates the magnitude of a vector.
The magnitude of a three-dimensional vector is <code>length = sqrt(x^2 + y^2 + z^2)</code>. The loop below generalizes this to <code>n</code> components stored in the array <code>x</code>:
<source lang="cpp">
// sum must be initialized to 0 before the loop
for (int i = 0; i < n; i++)
    sum += x[i] * x[i];  // each iteration reads and writes sum
length = sqrt(sum);
</source>
As you can see, there is a loop-carried dependency on the variable <code>sum</code>: each iteration reads the value of <code>sum</code> written by the previous iteration. The diagram below illustrates why the loop, as written, cannot be vectorized (nor can it be threaded). The dashed rectangle represents a single iteration of the loop, and the arrows represent dependencies between nodes. If an arrow crosses the boundary of the iteration rectangle, the iterations it connects cannot be executed in parallel.
[[File:Magnitude-node-dependency-graph.png|border]]
To resolve the loop-carried dependency, use <code>simd</code> with the <code>reduction</code> clause to tell the compiler to vectorize the loop and to reduce the per-element results to a single value. Each SIMD lane computes its own partial sum, and the partial sums are then combined into <code>sum</code> when the loop finishes.
<source lang="cpp">
#pragma omp simd reduction(+:sum)
for (int i = 0; i < n; i++)
    sum += x[i] * x[i];
length = sqrt(sum);
</source>
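To make the idea of per-lane partial sums concrete, here is a minimal, self-contained sketch. The function names <code>magnitude_simd</code> and <code>magnitude_lanes</code>, the use of <code>std::vector<double></code>, and the lane count of 4 are assumptions made for illustration; they are not taken from the snippets above. The second function is only a hand-written approximation of what a reduction does conceptually, not what any particular compiler emits.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: the reduction version from above, wrapped in a function.
// Without OpenMP support enabled, the pragma is ignored and the loop runs serially.
double magnitude_simd(const std::vector<double>& x) {
    const double* p = x.data();
    const std::size_t n = x.size();
    double sum = 0.0;
    #pragma omp simd reduction(+:sum)
    for (std::size_t i = 0; i < n; i++)
        sum += p[i] * p[i];
    return std::sqrt(sum);
}

// Hypothetical helper: the same computation with the reduction written by hand.
// kLanes = 4 is an assumed lane count; real hardware may use a different width.
double magnitude_lanes(const std::vector<double>& x) {
    constexpr std::size_t kLanes = 4;
    double partial[kLanes] = {0.0, 0.0, 0.0, 0.0};
    const std::size_t n = x.size();
    std::size_t i = 0;

    // Each lane accumulates into its own partial sum, so no value is
    // carried between lanes from one step to the next.
    for (; i + kLanes <= n; i += kLanes)
        for (std::size_t lane = 0; lane < kLanes; lane++)
            partial[lane] += x[i + lane] * x[i + lane];

    // Combine the partial sums into a single value.
    double sum = 0.0;
    for (std::size_t lane = 0; lane < kLanes; lane++)
        sum += partial[lane];

    // Handle the leftover elements when n is not a multiple of kLanes.
    for (; i < n; i++)
        sum += x[i] * x[i];

    return std::sqrt(sum);
}
```

Because a reduction changes the order in which the additions are performed, the floating-point result may differ slightly from the serial loop for general inputs. With GCC or Clang, the pragma typically takes effect only when compiling with a flag such as <code>-fopenmp</code> or <code>-fopenmp-simd</code>.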
== Memory Alignment ==