Open main menu

CDOT Wiki β

Changes

DPS921/Team team

11,682 bytes added, 16:38, 4 December 2020
no edit summary
== Intel Threading Building Blocks ==
=== Overview of TBB ===
Intel TBB is a library which provides features for parallel programming.It is used for task-based parallelism and requires no special compiler support. TBB has features such as Parallel Algorithms, memory allocation, threads and synchronization. Along side these features are similar features that were talked about in C++11. Naturally, TBB supports features such as containers and iterators, but with additional features to support the parallel nature of the library. Finally, TBB allows for generic programming which removes type requirements.
=== Containers ===
Containers within TBB allow for multiple threads to alter the same container at once. These containers are known as concurrent containers. Examples of these containers are:
<li>parallel_scan</li>
</ul>
 
=== Mutual Exclusion ===
A TBB mutex is created by access tbb::mutex from the <tbb/mutex.h> file. To lock a mutex you call tbb::mutex::lock() and to unlock, tbb::mutex::unlock(). Different mutexes in TBB have a combination of different flavors, or attributes. These attributes have tradeoffs depending on which one is chosen.
<ul>
<li>Scalable - Is a mutex that does not do worse than limiting one execution to a thread. It is generally slow.</li>
<li>Fair/Unfair: Allows for threads to move forward in the program in the order that they arrive. Unfair threads are potentially faster as they will allow for running threads to cut the line over threads that are waiting.</li>
<li>Recursive: Allows for a thread with a lock on the mutex to acquire another lock on the mutex.</li>
<li>Yield or Block: If there is a long wait, a decision will be made if progress can continue, and if it can then yield the processor. If it cannot be made it will block otherwise.</li>
</ul>
<br><br><br>
Taken from threadingbuildingblocks.org is a table displaying different mutexes and the flavors they possess.<br>
 
<table summary="" id="tbl11" frame="border" rules="all" width="100%" cellspacing="0" cellpadding="4" border="1"><caption><span class="tablecap">Traits and Behaviors of Mutexes</span></caption>
<tr>
<th class="cellrowborder" id="d119146e233" width="26.247689463955638%" valign="top">
<p>Mutex
</p>
</th>
<th class="cellrowborder" id="d119146e239" width="17.744916820702404%" valign="top">
<p>Scalable
</p>
</th>
<th class="cellrowborder" id="d119146e245" width="17.56007393715342%" valign="top">
<p>Fair
</p>
</th>
<th class="cellrowborder" id="d119146e251" width="14.602587800369685%" valign="top">
<p>Recursive
</p>
</th>
<th class="cellrowborder" id="d119146e257" width="12.199630314232902%" valign="top">
<p>Long Wait
</p>
</th>
<th class="cellrowborder" id="d119146e264" width="11.645101663585953%" valign="top">
<p>Size
</p>
</th>
</tr>
 
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>OS dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>OS dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>blocks
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>≥ 3 words
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">recursive_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>OS dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>OS dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>blocks
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>≥ 3 words
</p>
</td>
</tr>
 
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">spin_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>1 byte
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">speculative_spin_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>HW dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>2 cache lines
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">queuing_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>1 word
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">spin_rw_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>1 word
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">speculative_spin_rw_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>HW dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>3 cache lines
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">queuing_rw_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>1 word
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">null_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>moot
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>never
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>empty
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">null_rw_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>moot
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>never
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>empty
</p>
</td>
</tr>
</table>
<b><u>Figure 2. https://www.threadingbuildingblocks.org/docs/help/tbb_userguide/Mutex_Flavors.html</u></b>
 
=== Atomics ===
Virtually the same compared to the Standard Template library. An atomic variable is declared with atomic<Type> variableName. A thread can perform actions on the atomic variable without the need for locks. This is potentially quicker but comes with the downside of having less operations available to perform.
 
== What was added in C++17 STL ==
C++ 17 introduces parallel algorithms and execution policies, and the reduce, inclusive_scan, and exclusive_scan algorithms. This allows for true parallelization to occur, not multitasking. The execution policies that were created in C++17 are what allow for the distinction between running a program sequentially or in parallel.
The algorithms inclusive_scan and exclusive_scan do not require an execution_policy and can simply be called with arguments containing the an iterator pointing to the beginning of a container, and an iterator pointing to the end
== Case Studies between STL & TBB ==
Comparison of TBB parallel sort, STL sort
<pre>
18
edits