Open main menu

CDOT Wiki β

Changes

GPU621/NoName

2,165 bytes removed, 19:17, 3 December 2016
Asynchronous Multi-Threading
====C++ 11====
C++ 11 Threads on the contrary always required to specify the number of threads required for a parallel region. If not specified by user input or hard-coding, the number of threads supported by a CPU can also be accurately via the std::thread::hardware_concurrency(); function.
OpenMp automatically decides what order threads will execute. C++ 11 Threads require the developer to specify in what order threads will execute. This is typically done within a for loop block. Threads are created by initializing the std::thread class and specifying a function or any other callable object within the constructor.
 
Example of native thread creating and synchronization using C++ 11
int numThreads = std::thread::hardware_concurrency();
std::vector<std::thread> threads(numThreads);
for (int ID = 0; ID < numThreads; ID++) {
threads[ID] = std::thread(function);
}
 
After the initial creation and execution of a thread, the main thread must either detach or join the thread.
The C++ 11 standard library offers these two member functions for attaching or detaching threads.
* std::thread::join - allows the thread to execute in the background independently from the main thread. The thread will continue execution without blocking nor synchronizing in any way and terminate without relying on the main thread.
* std::thread::detach - waits for the thread to finish execution. Once a thread is created another thread can wait for the thread to finish.
 
Each created thread can then be synchronized with the main thread
for (int i = 0; i < threads.size(); i++){
threads.at(i).join();
}
 
===OpenMp===
Inside a declared OpenMp parallel region, if not specified via an environment variable OMP_NUM_THREADS or the library routine omp_get_thread_num() , OpenMp will automatically decide how many threads are needed to execute parallel code.
An issue with this approach is that OpenMp is unaware how many threads a CPU can support. A result of this can be OpenMp creating 4 threads for a single core processor which may result in a degradation of performance.
 
Automatic thread creation
#pragma omp parallel
{
int tid = omp_get_thread_num();
std::cout << "Hi from thread "
<< tid << '\n';
}
 
Programmer Specified thread creation
int numThreads = 4;
omp_set_num_threads(numThreads);
#pragma omp parallel
{
int tid = omp_get_thread_num();
std::cout << "Hi from thread "
<< tid << '\n';
}
 
===C++ 11===
C++ 11 Threads on the contrary always required to specify the number of threads required for a parallel region. If not specified by user input or hard-coding, the number of threads supported by a CPU can also be accurately via the std::thread::hardware_concurrency(); function.
OpenMp automatically decides what order threads will execute. C++ 11 Threads require the developer to specify in what order threads will execute. This is typically done within a for loop block. Threads are created by initializing the std::thread class and specifying a function or any other callable object within the constructor.
A future is an object that can retrieve a value from some provider object (also known as a promise) or function. Simply put in the case of multithreading, a future object will wait until its associated thread has completed and then store its return value.
To retrieve or construct a future object, these functions may be used.
* Async * promise::get_future * packaged_task::get_future
However, a future object can only be used if it is in a valid state. Default future objects constructed from the std::async template function are not valid and must be assigned a valid state during execution.
A std::future references a shared state that cannot be shared to other asynchronous return objects. If multiple threads need to wait for the same shared state, std::shared_future class template should be used.
===Conclusion===
 
In conclusion while OpenMp is and still continues to be a viable option in multi-threading, it lacks the some of outlined features and lacks low-level control. While C++ 11 standard libarary multi-threading can be more difficult to learn, is supported by virtually all C++ 11 compilers and offers a low-level interaction between hardware threads.