{{GPU621/DPS921 Index | 20187}}
<!-- How Threads Works -->
<h4>Implicit Barrier</h4>
<p>At the end of an OpenMP parallel region there is an implicit barrier: every thread waits until all threads in the team have finished before execution continues past the region.</p>
<pre class="code">// OpenMP - Parallel Construct
// omp_parallel.cpp
#include <iostream>

int main() {
    #pragma omp parallel
    {
        std::cout << "Hello\n";
    } // implicit barrier: all threads finish before main continues
    std::cout << "Fin\n";
    return 0;
}
</pre>
<p>Output:</p>
<pre class="code">Hello
Hello
Hello
Hello
Hello
Hello
Fin
</pre>
<!-- C++11 Threads -->
<p>Unlike OpenMP, C++11 does <i>not</i> use parallel regions as barriers for its threading. When a thread is run using the C++11 thread library, we must consider the scope of the parent thread. If the parent's std::thread object is destroyed while the child thread is still joinable, std::terminate is called and the program aborts.</p>
<p>When using the join function on the child thread, the parent thread will be blocked until the child thread returns.</p>
<pre class="code"> t2
____________________
/ \
__________/\___________________|/\__________
t1 t1 t2.join() | t1
</pre>
<h4>Creating a Thread</h4><p>The following is the template used for the overloaded thread constructor. The thread begins to run on initialization.<br>f is the function, functor, or lambda expression to be executed in the thread. args are the arguments to pass to f.</p><pre class="code">template<class Function, class... Args>
explicit thread(Function&& f, Args&&... args);</pre>
<!-- How Multithreading Works -->
<!-- Multithreading With OpenMP -->
<h3>Multithreading With OpenMP</h3>
<pre class="code">#include <iostream>
#include <omp.h>

int main() {
#pragma omp parallel
{
int tid = omp_get_thread_num();
std::cout << "Hi from thread "<< tid << '\n';
}
return 0;
}
</pre>
<p>Output:</p><pre class="code">Hi from thread Hi from thread 20Hi from thread 1Hi from thread 3</pre>
<p>In the code above, all the threads write to the cout stream at the same time, so their output intermingles into a jumbled result: one thread's text is interrupted by another's because nothing limits access to the stream to one thread at a time.</p>
<!-- Threading in C++11 -->
<h3>Threading with C++11</h3>
<p>Threading in C++11 is available through the &lt;thread&gt; library. C++11 relies mostly on joining or detaching forked subthreads.</p>
<p>Unlike OpenMP, C++11 threads are created by the programmer instead of the compiler.</p>
<p>std::this_thread::get_id() is similar to OpenMP's omp_get_thread_num(), but instead of an int it returns a std::thread::id object, which can be printed or compared for equality but carries no numeric meaning.</p>
<h4>Join vs Detach</h4>
<pre class="code">// cpp11.multithreading.cpp
#include <iostream>
#include <vector>
#include <thread>

void func1(int index) {
    std::cout << "Index: " << index << " - ID: " << std::this_thread::get_id() << std::endl;
}

int main() {
    int numThreads = 10;
    std::vector<std::thread> threads;
    std::cout << "Creating threads...\n";
    for (int i = 0; i < numThreads; i++)
        threads.push_back(std::thread(func1, i));
    std::cout << "All threads have launched!\n";
    std::cout << "Synchronizing...\n";
    for (auto& thread : threads)
        thread.join();
    std::cout << "All threads have synchronized!\n";
    return 0;
}
</pre>
<p>Output:</p>
<pre class="code">Creating threads...
Index: 0 - ID: Index: 1 - ID: Index: 2 - ID: 0x70000b57e000
0x70000b4fb000
0x70000b601000Index: 3 - ID: 0x70000b684000
Index:
4 - ID: 0x70000b707000
Index: 5 - ID: 0x70000b78a000
Index: 6 - ID: 0x70000b80d000
Index: 7 - ID: 0x70000b890000
Index: All threads have launched!
8 - ID: 0x70000b913000
Index: Synchronizing...
9 - ID: 0x70000b996000
All threads have synchronized!
</pre>
<h3>Synchronization With OpenMP</h3>
<p>Using the critical construct, we are able to limit access to the stream to one thread at a time. critical defines a region that only one thread is allowed to execute at a time; in this case it is the write to the cout stream that we restrict. The revised code now has an output like this:</p>
<pre class="code">Hi from thread 0
Hi from Thread 1
Hi from thread 2
Hi from thread 3
</pre>
<h4>parallel for</h4>
<p>The parallel for construct divides the iterations of the loop that follows it among the team of threads, so each thread computes a different chunk of the index range.</p>
<p>Example:</p>
<pre class="code">void simple(int n, float *a, float *b) {
int i;
#pragma omp parallel for
for (i = 1; i < n; i++)
b[i] = (a[i] + a[i-1]) / 2.0;
}
</pre>
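For comparison, here is a hedged sketch of how the same loop might be divided among C++11 threads by hand; the chunking scheme, thread count, and function name are illustrative choices, not part of the original article:

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Same computation as simple() above, but the iteration range [1, n)
// is split into one contiguous chunk per thread (sketch, illustrative).
void simple_cpp11(int n, float *a, float *b) {
    unsigned nt = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < nt; t++) {
        // chunk boundaries: roughly (n-1)/nt iterations per thread
        int begin = 1 + (int)((long long)(n - 1) * t / nt);
        int end   = 1 + (int)((long long)(n - 1) * (t + 1) / nt);
        threads.emplace_back([=] {
            for (int i = begin; i < end; i++)
                b[i] = (a[i] + a[i-1]) / 2.0f;
        });
    }
    for (auto& th : threads)
        th.join();   // wait for every chunk, like OpenMP's implicit barrier
}
```

Each thread owns a disjoint slice of b, so no locking is needed; the joins at the end play the role of OpenMP's implicit barrier after the loop.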
<pre class="code">// cpp11.mutex.cpp
#include <iostream>
#include <vector>
#include <thread>
#include <mutex>

std::mutex mu;

void func1(int index) {
    std::lock_guard<std::mutex> lock(mu); // locks mu; released when lock leaves scope
    // mu.lock();
    std::cout << "Index: " << index << " - ID: " << std::this_thread::get_id() << std::endl;
    // mu.unlock();
}

int main() {
    int numThreads = 10;
    std::vector<std::thread> threads;
    std::cout << "Creating threads...\n";
    for (int i = 0; i < numThreads; i++)
        threads.push_back(std::thread(func1, i));
    for (auto& thread : threads)
        thread.join();
    return 0;
}
</pre>
<p>Using a mutex, we're able to place a lock around the data used by the threads to enforce mutual exclusion. This is similar to OpenMP's critical in that it only allows one thread to execute a block of code at a time.</p>
<!-- How Data Sharing Works -->