Inside a declared OpenMP parallel region, if the thread count is not specified via the OMP_NUM_THREADS environment variable or the library routine omp_set_num_threads(), OpenMP will automatically decide how many threads are needed to execute the parallel code.
An issue with this approach is that the default thread count is implementation-defined and may not match the number of threads the CPU can support. For example, OpenMP could create 4 threads on a single-core processor, which may degrade performance.
Automatic thread creation
#pragma omp parallel
{
    int tid = omp_get_thread_num();
    std::cout << "Hi from thread "
              << tid << '\n';
}
Programmer-specified thread creation
int numThreads = 4;
omp_set_num_threads(numThreads);
#pragma omp parallel
{
    int tid = omp_get_thread_num();
    std::cout << "Hi from thread "
              << tid << '\n';
}
C++11 threads, in contrast, always require the developer to specify the number of threads for a parallel region. If the count is not supplied by user input or hardcoded, the number of hardware threads the CPU supports can be queried with the std::thread::hardware_concurrency() function. The returned value is only a hint and may be 0 if it cannot be determined.
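The selection logic described above can be sketched as a small helper. The name chooseThreadCount and the convention that 0 means "no explicit request" are assumptions made for illustration; they do not come from the original example.

```cpp
#include <thread>

// Hypothetical helper (not part of the original listing): pick a thread count.
// `requested` is the caller's explicit choice; 0 is assumed to mean "not specified".
unsigned int chooseThreadCount(unsigned int requested) {
    if (requested > 0) {
        return requested;  // user input or a hardcoded value wins
    }
    // hardware_concurrency() is only a hint and may return 0,
    // so fall back to a single thread in that case.
    unsigned int detected = std::thread::hardware_concurrency();
    return detected > 0 ? detected : 1;
}
```

Calling chooseThreadCount(4) returns 4, while chooseThreadCount(0) returns whatever the implementation reports for the machine, or 1 if nothing is reported.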
OpenMP automatically decides how its threads are launched and in what order they execute. With C++11 threads the developer launches each thread explicitly, typically inside a for loop; the order in which the launched threads actually run is still determined by the operating system scheduler.
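Native thread creation
// hardware_concurrency() is only a hint and may return 0
unsigned int numThreads = std::thread::hardware_concurrency();
if (numThreads == 0) numThreads = 1;
std::vector<std::thread> threads(numThreads);
for (unsigned int ID = 0; ID < numThreads; ID++) {
    threads[ID] = std::thread(function);
}
// every thread must be joined before its std::thread object is destroyed
for (auto& t : threads) {
    t.join();
}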
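A slightly fuller sketch of explicit thread creation follows, in which each thread receives its own ID and all threads are joined before the results are read. The names recordId and runWorkers are hypothetical and only serve the illustration.

```cpp
#include <functional>
#include <thread>
#include <vector>

// Hypothetical worker: writes its ID into its own slot.
// Each thread touches a different element, so no lock is needed here.
void recordId(std::vector<int>& slots, int id) {
    slots[id] = id;
}

// Launch numThreads workers in a for loop, wait for all of them,
// and return the slots they filled in.
std::vector<int> runWorkers(int numThreads) {
    std::vector<int> slots(numThreads, -1);
    std::vector<std::thread> threads;
    threads.reserve(numThreads);
    for (int id = 0; id < numThreads; ++id) {
        // std::ref is required because std::thread copies its arguments
        threads.emplace_back(recordId, std::ref(slots), id);
    }
    for (std::thread& t : threads) {
        t.join();  // joining is what makes the results safe to read
    }
    return slots;
}
```

The threads are created in ID order, but they may run in any order; only the join loop guarantees that every slot has been written by the time runWorkers returns.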
===Parallelizing for Loops===
Finished at 63 milliseconds
Native SPMD implementation using a mutex locking barrier. std::bind() allows the user to specify the range that each thread processes.
#include <iostream>
#include <chrono>
Finished at 6 milliseconds
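The SPMD idea mentioned above, where std::bind() fixes the range each thread works on, can be sketched roughly as follows. This is not the listing that was timed here: the names sumRange and parallelSum are hypothetical, the workload (summing a vector) is an assumption, and the mutex in this sketch protects a shared total rather than implementing a barrier. Joining the threads serves as the synchronization point.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical worker: sums data[begin, end) into a local variable,
// then adds the partial result to the shared total under a mutex.
void sumRange(const std::vector<int>& data, std::size_t begin, std::size_t end,
              long long& total, std::mutex& m) {
    long long local = 0;
    for (std::size_t i = begin; i < end; ++i) {
        local += data[i];
    }
    std::lock_guard<std::mutex> lock(m);  // one thread at a time updates total
    total += local;
}

// Splits the index space into numThreads contiguous ranges; the last
// thread also takes any remainder.
long long parallelSum(const std::vector<int>& data, std::size_t numThreads) {
    if (numThreads == 0) numThreads = 1;
    long long total = 0;
    std::mutex m;
    std::vector<std::thread> threads;
    threads.reserve(numThreads);
    const std::size_t chunk = data.size() / numThreads;
    for (std::size_t id = 0; id < numThreads; ++id) {
        const std::size_t begin = id * chunk;
        const std::size_t end = (id + 1 == numThreads) ? data.size() : begin + chunk;
        // std::bind fixes this thread's range; std::ref/std::cref keep references
        threads.emplace_back(std::bind(sumRange, std::cref(data), begin, end,
                                       std::ref(total), std::ref(m)));
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return total;
}
```

Each thread accumulates into a local variable and takes the lock only once, which keeps contention on the mutex low compared with locking on every element.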
===Programming Models===