{{GPU621/DPS921 Index | 20187}}
<!-- How Threads Work -->
<h4>Implicit Barrier</h4>
<p>In OpenMP, the end of a parallel region acts as an implicit barrier: every thread must finish its work in the region before execution continues past it on the master thread.</p>
<pre class="code">// OpenMP - Parallel Construct
// omp_parallel.cpp
#include <iostream>
#include <omp.h>

int main() {
    #pragma omp parallel
    {
        std::cout << "Hello\n";
    }   // implicit barrier: all threads complete before "Fin" is printed
    std::cout << "Fin\n";
    return 0;
}
</pre>
<p>Output:</p>
<pre class="code">Hello
Hello
Hello
Hello
Hello
Hello
Fin
</pre>
<!-- C++11 Threads -->
<p>Unlike OpenMP, C++11 does <i>not</i> use parallel regions as barriers for its threading. When a thread is launched with the C++11 thread library, we must consider the scope of the parent thread: if the parent exits while a child thread is still joinable, std::terminate is called and the program aborts.</p>
<p>Calling the join function on the child thread blocks the parent thread until the child thread returns.</p>
<pre class="code"> t2
____________________
/ \
__________/\___________________|/\__________
t1 t1 t2.join() | t1
</pre>
<h4>Creating a Thread</h4><p>The following is the template for the overloaded thread constructor. The thread begins to run on initialization.<br>f is the function, functor, or lambda expression to be executed in the thread. args are the arguments to pass to f.</p><pre class="code">template<class Function, class... Args>
explicit thread(Function&& f, Args&&... args);</pre>
<!-- How Multithreading Works -->
<pre class="code">#include <iostream>
#include <omp.h>

int main() {
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        std::cout << "Hi from thread " << tid << '\n';
    }
    return 0;
}</pre>
<p>Output:</p>
<pre class="code">Hi from thread Hi from thread 2
0
Hi from thread 1
Hi from thread 3
</pre>
<!-- Threading with C++11 -->
<pre class="code">// cpp11.multithreading.cpp
#include <iostream>
#include <thread>
#include <vector>

void func1(int index) {
    std::cout << "Hi from thread " << index << '\n';
}

int main() {
    int numThreads = 10;
    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; i++)
        threads.push_back(std::thread(func1, i));
    for (auto& thread : threads)
        thread.join();
    return 0;
}
</pre>
<p>Since all threads are using the std::cout stream, the output can appear jumbled and out of order. The solution to this problem will be presented in the next section.</p>
<!-- Synchronization With OpenMP -->
<pre class="code">#include <iostream>
#include <omp.h>

int main()
{
#pragma omp parallel
{
int tid = omp_get_thread_num();
#pragma omp critical
std::cout << "Hi from thread "<< tid << '\n';
}
return 0;
}
</pre>
<p>Using the critical construct, we can limit access to the stream to one thread at a time. critical defines a region in which only one thread is allowed to execute at a time; in this case, it is the write to the cout stream that we restrict to one thread. The revised code now produces output like this:</p>
<pre class="code">Hi from thread 0
Hi from thread 1
Hi from thread 2
Hi from thread 3
</pre>
<!-- parallel for -->
<p>In OpenMP, a for loop can be parallelized with the parallel for construct. This statement automatically distributes the iterations among the threads.</p>
<p>Example:</p><pre class="code">void simple(int n, float *a, float *b)
{
    int i;
    #pragma omp parallel for
    for (i = 1; i < n; i++)
        b[i] = (a[i] + a[i-1]) / 2.0;
}</pre>
[[File:Cppmultithreading.png|500px]]
<h4>mutex</h4>
<p>To allow for thread synchronization, we can use the mutex library to lock specific sections of code from being used by multiple threads at once.</p>
<pre class="code">// cpp11.mutex.cpp
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

std::mutex mu;

void func1(int index) {
    mu.lock();   // only one thread at a time may print
    std::cout << "Index: " << index << " - ID: " << std::this_thread::get_id() << '\n';
    mu.unlock();
}

int main() {
    int numThreads = 10;
    std::vector<std::thread> threads;
    std::cout << "Creating threads...\n";
    for (int i = 0; i < numThreads; i++)
        threads.push_back(std::thread(func1, i));
    std::cout << "All threads have launched!\n";
    std::cout << "Synchronizing...\n";
    for (auto& thread : threads)
        thread.join();
    std::cout << "All threads have synchronized!\n";
    return 0;
}
</pre>
<p>Output:</p>
<pre class="code">Creating threads...
Index: 0 - ID: 0x70000aa29000
Index: 4 - ID: 0x70000ac35000
Index: 5 - ID: 0x70000acb8000
Index: 1 - ID: 0x70000aaac000
Index: 6 - ID: 0x70000ad3b000
Index: 7 - ID: 0x70000adbe000
Index: 8 - ID: 0x70000ae41000
Index: 3 - ID: 0x70000abb2000
All threads have launched!
Synchronizing...
Index: 9 - ID: 0x70000aec4000
Index: 2 - ID: 0x70000ab2f000
All threads have synchronized!
</pre>