GPU621/False Sharing

425 bytes added, 22:49, 26 November 2021
=== Synchronization ===
We want a more elegant solution that avoids both race conditions and false sharing of a cache line, without resorting to padding. Fortunately, OpenMP provides constructs that can help. After switching
<pre>
#pragma omp parallel
{
    int id, num_threads;
    double x, sum = 0.0f;  // per-thread partial sum

    id = omp_get_thread_num();
    num_threads = omp_get_num_threads();

    // get master thread to return how many threads were actually created
    if (id == 0)
    {
        actual_thread_count = num_threads;
    }

    // each thread is responsible for calculating the area of a specific set of sections underneath the curve
    for (int i = id; i < n; i = i + num_threads)
    {
        x = ((double)i + 0.5f) * step;
        sum += 1.0f / (1.0f + x * x);
    }

    #pragma omp critical
    {
        // sum up each calculation to get approximation of pi
        pi += 4 * sum * step;
    }
}
</pre>
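For readers who want to try the idea in isolation, the following is a minimal, self-contained sketch of the same pattern: each thread accumulates into a private <code>sum</code>, and only the final update of the shared <code>pi</code> is protected by <code>#pragma omp critical</code>. The function name <code>estimate_pi</code> and the <code>_OPENMP</code> guards (which let the file also compile without OpenMP enabled) are assumptions made for illustration and are not part of the page's original code.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper (not from the original page): approximates pi by
// midpoint-rule integration of 4 / (1 + x^2) over [0, 1] using n sections.
double estimate_pi(int n)
{
    assert(n > 0);
    const double step = 1.0 / n;
    double pi = 0.0;  // shared result, only written inside the critical section

    #pragma omp parallel
    {
        int id = 0, num_threads = 1;  // serial fallback when OpenMP is disabled
#ifdef _OPENMP
        id = omp_get_thread_num();
        num_threads = omp_get_num_threads();
#endif
        // Private to each thread: the hot loop never writes to shared memory,
        // so threads do not contend for the same cache line.
        double sum = 0.0;

        // Cyclic distribution: thread id handles sections id, id + T, id + 2T, ...
        for (int i = id; i < n; i += num_threads)
        {
            const double x = (i + 0.5) * step;
            sum += 1.0 / (1.0 + x * x);
        }

        // One thread at a time adds its partial result to the shared total.
        #pragma omp critical
        {
            pi += 4.0 * sum * step;
        }
    }
    return pi;
}
```

Calling <code>estimate_pi(1000000)</code> should return a value close to 3.14159. Because each thread enters the critical section only once, the cost of synchronization is small compared with protecting every iteration of the loop.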
== Conclusion ==