18
edits
Changes
no edit summary
== Intel Threading Building Blocks ==
=== Overview of TBB ===
=== Containers ===
Containers within TBB allow for multiple threads to alter the same container at once. These containers are known as concurrent containers. Examples of these containers are:
<li>parallel_scan</li>
</ul>
=== Mutual Exclusion ===
A TBB mutex is created by access tbb::mutex from the <tbb/mutex.h> file. To lock a mutex you call tbb::mutex::lock() and to unlock, tbb::mutex::unlock(). Different mutexes in TBB have a combination of different flavors, or attributes. These attributes have tradeoffs depending on which one is chosen.
<ul>
<li>Scalable - Is a mutex that does not do worse than limiting one execution to a thread. It is generally slow.</li>
<li>Fair/Unfair: Allows for threads to move forward in the program in the order that they arrive. Unfair threads are potentially faster as they will allow for running threads to cut the line over threads that are waiting.</li>
<li>Recursive: Allows for a thread with a lock on the mutex to acquire another lock on the mutex.</li>
<li>Yield or Block: If there is a long wait, a decision will be made if progress can continue, and if it can then yield the processor. If it cannot be made it will block otherwise.</li>
</ul>
<br><br><br>
Taken from threadingbuildingblocks.org is a table displaying different mutexes and the flavors they possess.<br>
<table summary="" id="tbl11" frame="border" rules="all" width="100%" cellspacing="0" cellpadding="4" border="1"><caption><span class="tablecap">Traits and Behaviors of Mutexes</span></caption>
<tr>
<th class="cellrowborder" id="d119146e233" width="26.247689463955638%" valign="top">
<p>Mutex
</p>
</th>
<th class="cellrowborder" id="d119146e239" width="17.744916820702404%" valign="top">
<p>Scalable
</p>
</th>
<th class="cellrowborder" id="d119146e245" width="17.56007393715342%" valign="top">
<p>Fair
</p>
</th>
<th class="cellrowborder" id="d119146e251" width="14.602587800369685%" valign="top">
<p>Recursive
</p>
</th>
<th class="cellrowborder" id="d119146e257" width="12.199630314232902%" valign="top">
<p>Long Wait
</p>
</th>
<th class="cellrowborder" id="d119146e264" width="11.645101663585953%" valign="top">
<p>Size
</p>
</th>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>OS dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>OS dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>blocks
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>≥ 3 words
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">recursive_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>OS dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>OS dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>blocks
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>≥ 3 words
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">spin_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>1 byte
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">speculative_spin_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>HW dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>2 cache lines
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">queuing_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>1 word
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">spin_rw_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>1 word
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">speculative_spin_rw_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>HW dependent
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>3 cache lines
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">queuing_rw_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>no
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>yields
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>1 word
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">null_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>moot
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>never
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>empty
</p>
</td>
</tr>
<tr>
<td class="cellrowborder" headers="d119146e233 " width="26.247689463955638%" valign="top">
<p><samp class="codeph">null_rw_mutex</samp>
</p>
</td>
<td class="cellrowborder" headers="d119146e239 " width="17.744916820702404%" valign="top">
<p>moot
</p>
</td>
<td class="cellrowborder" headers="d119146e245 " width="17.56007393715342%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e251 " width="14.602587800369685%" valign="top">
<p>✓
</p>
</td>
<td class="cellrowborder" headers="d119146e257 " width="12.199630314232902%" valign="top">
<p>never
</p>
</td>
<td class="cellrowborder" headers="d119146e264 " width="11.645101663585953%" valign="top">
<p>empty
</p>
</td>
</tr>
</table>
<b><u>Figure 2. https://www.threadingbuildingblocks.org/docs/help/tbb_userguide/Mutex_Flavors.html</u></b>
=== Atomics ===
Virtually the same compared to the Standard Template library. An atomic variable is declared with atomic<Type> variableName. A thread can perform actions on the atomic variable without the need for locks. This is potentially quicker but comes with the downside of having less operations available to perform.
== What was added in C++17 STL ==
C++ 17 introduces parallel algorithms and execution policies, and the reduce, inclusive_scan, and exclusive_scan algorithms. This allows for true parallelization to occur, not multitasking. The execution policies that were created in C++17 are what allow for the distinction between running a program sequentially or in parallel.
The algorithms inclusive_scan and exclusive_scan do not require an execution_policy and can simply be called with arguments containing the an iterator pointing to the beginning of a container, and an iterator pointing to the end
== Case Studies between STL & TBB ==
Comparison of TBB parallel sort, STL sort
<pre>