GPU621/NoName

2,507 bytes removed, 04:48, 3 December 2016
====C++ 11====
Unlike OpenMP, C++ 11 threads always require the developer to specify the number of threads for a parallel region. If the count is not supplied by user input or hard-coded, the number of concurrent threads the hardware supports can be queried with the std::thread::hardware_concurrency() function. The returned value is only a hint and may be 0 if it cannot be determined.
OpenMP creates and schedules the threads of a parallel region automatically. With C++ 11 threads the developer creates each thread explicitly and so controls the order in which threads are launched. This is typically done within a for loop block. A thread is created by constructing a std::thread object and passing a function or any other callable object to the constructor.
 
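Example of native thread creation and synchronization using C++ 11
// function stands for any callable defined elsewhere
unsigned numThreads = std::thread::hardware_concurrency();
if (numThreads == 0) numThreads = 1; // the value is only a hint and may be 0
std::vector<std::thread> threads(numThreads);
for (unsigned ID = 0; ID < numThreads; ID++) {
threads[ID] = std::thread(function);
}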
 
After a thread has been created and started, the main thread must either join or detach it before the std::thread object is destroyed.
The C++ 11 standard library offers these two member functions for joining or detaching threads.
* std::thread::join - waits for the thread to finish execution. Once a thread is created, another thread can call join to block until it finishes.
* std::thread::detach - allows the thread to execute in the background independently from the main thread. The thread continues execution without blocking or synchronizing in any way and terminates without relying on the main thread.
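The difference between the two member functions can be sketched as follows. This is a minimal illustration, not code from the original page; the helper names joinWaitsForWorker and detachReleasesHandle are hypothetical. The first helper shows that a worker's effect is visible once join() returns. The second shows that a detached std::thread object no longer owns a thread.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: returns true if the worker's write is visible
// after join() and the thread object is no longer joinable.
bool joinWaitsForWorker() {
    std::atomic<bool> done{false};
    std::thread worker([&done] { done.store(true); });
    worker.join();  // blocks the caller until the worker has finished
    return done.load() && !worker.joinable();
}

// Hypothetical helper: returns true if the thread object gives up
// ownership of its thread after detach().
bool detachReleasesHandle() {
    // The worker touches no shared state, so it is safe for it to
    // outlive this function.
    std::thread worker([] {});
    worker.detach();  // the thread keeps running on its own
    return !worker.joinable();
}
```

A std::thread that is still joinable when its destructor runs calls std::terminate, which is why one of the two calls is needed for every started thread.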
 
Each created thread can then be synchronized with the main thread
for (std::size_t i = 0; i < threads.size(); i++){
threads.at(i).join();
}
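The creation and join fragments above can be combined into one self-contained sketch. The names squareId and runWorkers are hypothetical and chosen only for illustration. Each worker receives its ID as a constructor argument and writes to its own slot of a shared vector, so no locking is needed.

```cpp
#include <functional>
#include <thread>
#include <vector>

// Hypothetical worker: stores the square of its ID in its own slot.
void squareId(unsigned id, std::vector<unsigned>& out) {
    out[id] = id * id;
}

// Hypothetical helper: launches numThreads workers, joins them all,
// and returns the values they produced.
std::vector<unsigned> runWorkers(unsigned numThreads) {
    // hardware_concurrency() may return 0, so fall back to one thread.
    if (numThreads == 0) numThreads = 1;

    std::vector<unsigned> results(numThreads, 0);
    std::vector<std::thread> threads;
    threads.reserve(numThreads);

    // Threads start running as soon as they are constructed.
    for (unsigned id = 0; id < numThreads; ++id) {
        threads.emplace_back(squareId, id, std::ref(results));
    }

    // Wait for every worker before reading the results.
    for (std::thread& t : threads) {
        t.join();
    }
    return results;
}
```

A caller could pass std::thread::hardware_concurrency() as the argument, for example runWorkers(std::thread::hardware_concurrency()). std::ref is needed because std::thread copies its arguments by default.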
 
===OpenMP===
Inside a declared OpenMP parallel region, if the thread count is not specified via the environment variable OMP_NUM_THREADS or the library routine omp_set_num_threads(), OpenMP will automatically decide how many threads are used to execute the parallel code.
An issue with this approach is that the default is chosen by the OpenMP implementation and may not match the number of threads the CPU can support. For example, OpenMP creating 4 threads on a single-core processor may degrade performance.
 
Automatic thread creation
#pragma omp parallel
{
int tid = omp_get_thread_num();
std::cout << "Hi from thread "
<< tid << '\n';
}
 
Programmer Specified thread creation
int numThreads = 4;
omp_set_num_threads(numThreads);
#pragma omp parallel
{
int tid = omp_get_thread_num();
std::cout << "Hi from thread "
<< tid << '\n';
}
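One way to avoid requesting more threads than the hardware supports is to cap the requested count before entering the parallel region. The sketch below is an assumption-laden illustration, not part of the original page. The helpers cappedThreadCount and runParallelRegion are hypothetical, and the code falls back to serial execution when it is compiled without OpenMP support.

```cpp
#include <algorithm>
#include <thread>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: limits a requested thread count to the
// hardware hint, never returning less than 1.
unsigned cappedThreadCount(unsigned requested) {
    unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0) hw = 1;  // the hint may be unavailable
    if (requested == 0) requested = 1;
    return std::min(requested, hw);
}

// Hypothetical helper: runs a parallel region with the capped count
// and returns how many threads actually executed it.
int runParallelRegion(unsigned requested) {
    int n = static_cast<int>(cappedThreadCount(requested));
    int ran = 0;
#ifdef _OPENMP
    // num_threads requests n threads; the runtime may provide fewer.
    #pragma omp parallel num_threads(n) reduction(+:ran)
    {
        ran += 1;
    }
#else
    (void)n;   // compiled without OpenMP: the region runs serially
    ran = 1;
#endif
    return ran;
}
```

The num_threads clause applies only to the region it is attached to, whereas omp_set_num_threads() changes the default for later regions.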
 
