Changes

GPU621/False Sharing

425 bytes added, 23:49, 26 November 2021


=== Synchronization ===

We want a more elegant solution that avoids both race conditions and cache line sharing without resorting to padding. Each thread accumulates its partial result in a private local variable, and only the final update of the shared total is protected by a critical section.

<pre>
#pragma omp parallel
{
    int id, num_threads;
    double x, sum = 0.0f;

    id = omp_get_thread_num();
    num_threads = omp_get_num_threads();

    // get master thread to return how many threads were actually created
    if (id == 0)
    {
        actual_thread_count = num_threads;
    }

    // each thread is responsible for calculating the area of a specific set of sections underneath the curve
    for (int i = id; i < n; i = i + num_threads)
    {
        x = ((double)i + 0.5f) * step;
        sum += 1.0f / (1.0f + x * x);
    }

    // only one thread at a time may update the shared result
    #pragma omp critical
    {
        // sum up each calculation to get approximation of pi
        pi += 4 * sum * step;
    }
}
</pre>
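The fragment above relies on variables declared elsewhere on the page (<code>n</code>, <code>step</code>, <code>pi</code>, <code>actual_thread_count</code>). As a minimal, self-contained sketch of the same idea, the function below wraps it so it can be compiled on its own. The name <code>estimate_pi</code> and its signature are assumptions made for illustration, not part of the original program, and the <code>_OPENMP</code> guards are added only so the code also builds serially when OpenMP is not enabled.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper (not from the original page): approximates pi by
// integrating 4 / (1 + x^2) over [0, 1] with n rectangles.
// actual_thread_count receives the number of threads that really ran.
double estimate_pi(int n, int& actual_thread_count)
{
    const double step = 1.0 / n;
    double pi = 0.0;            // shared result, updated only inside the critical section
    actual_thread_count = 1;    // serial fallback value

    #pragma omp parallel
    {
        int id = 0;
        int num_threads = 1;
#ifdef _OPENMP
        id = omp_get_thread_num();
        num_threads = omp_get_num_threads();
#endif
        // private to each thread, so no cache line is shared while accumulating
        double sum = 0.0;

        // thread 0 reports how many threads were actually created
        if (id == 0)
        {
            actual_thread_count = num_threads;
        }

        // each thread handles every num_threads-th rectangle, starting at its own id
        for (int i = id; i < n; i += num_threads)
        {
            double x = (i + 0.5) * step;
            sum += 1.0 / (1.0 + x * x);
        }

        // one thread at a time adds its partial sum to the shared total
        #pragma omp critical
        {
            pi += 4.0 * sum * step;
        }
    }

    return pi;
}
```

Because every thread enters the critical section only once, after its loop has finished, the cost of serializing the update should stay small compared with the work done in the loop. With GCC or Clang the sketch would typically be built with <code>-fopenmp</code>; without that flag the pragmas are ignored and the function runs on a single thread.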

== Conclusion ==

Kchou4

