Open main menu

CDOT Wiki β

Team Darth Vector

Revision as of 17:18, 4 December 2017 by Agodwin (talk | contribs) (Parallelism Problems in STL)

TEAM, use this for formatting. Wiki Editing Cheat Sheet


Join me, and together we can fork the problem as master and thread

Members


Alistair Godwin

Giorgi Osadze

Leonel Jara


TBB Background

Containers Comparison

List of TBB containers:

concurrent_queue : Multiple threads may simultaneously push and pop elements from the queue.

concurrent_vector : This is a container class for vectors with concurrent support. These vectors do not support insertion or erase operations but supports concurrent operations. Note that when elements are inserted, they cannot be removed without calling the clear() member function on it, which removes every element in the array. This is defined within the header "tbb/concurrent_vector.h" and is coded as:
#include <tbb/concurrent_vector.h>
//...//

 tbb:concurrent_vector<type> name; 

concurrent_hash_map : hash table that permits concurrent accesses.

quotes: Intel Threaded Building Blocks book. Highly concurrent containers are very important because Standard Template Library (STL) containers generally are not concurrency-friendly, and attempts to modify them concurrently can easily corrupt the containers.

Algorithms

parallel_for

parallel_scan

parallel_reduce


Threads

STL Background

Containers Comparison

pair,vector,list,slist,

Algorithms

TBB Threads


Lock Convoying Problem

What is a Lock?

A Lock(also called "mutex") is a method for programmers to secure code that when executing in parallel can cause multiple threads to fight for a resource/container for some operation. When threads work in parallel to complete a task with containers, there is no indication when the thread reach the container and need to perform an operation on it. This causes problems when multiple threads are accessing the same place. When doing an insertion on a container with threads, we must ensure only 1 thread is capable of pushing to it or else threads may fight for control. By "Locking" the container, we ensure only 1 thread accesses at any given time.

To use a lock, you program must be working in parallel(ex #include <thread>) and should be completing something in parallel. You can find c++11 locks with #include <mutex>

Code example or Picture here ^_^

#include <iostream>
#include <thread> 
#include <mutex>

//Some threads are spawned which call this function
//Declared the following within the class std::mutex NightsWatch;
void GameOfThronesClass::GuardTheWall(){

//Protect until Unlock() is called. Only 1 thread may do this below at a time. It is //"locked"
NightsWatch.Lock();

//Increment
DaysWithoutWhiteWalkerAttack++;
std::cout << "It has been " << DaysWithoutWhiteWalkerAttack << " since the last attack at Castle Black!\n";

//Allow Next thread to execute the above iteration
NightsWatch.Unlock();


}

Note that there can be problems with locks. If a thread is locked but it is never unlocked, any other threads will be forced to wait which may cause performance issues. Another problem is called "Dead Locking" where each thread may be waiting for another to unlock (and vice versa) and the program is forced to wait and wait .

Parallelism Problems in STL

Within STL, issues arise when you attempt to access containers in parallel. With containers, when threads update the container say with push back, it is difficult to determine where the insertion occurred within the container(each thread is updating this container in any order) additionally it the size of the container is unknown as each thread may update the size as it goes. For example, in a vector we can push some data to it in parallel but knowing where that data was pushed to requires as to search the vector serially for that data's location. If we attempt to find the data in parallel, 1 thread could search for the data, but another could update the vector during that time which causes problems with thread 1's search.

Locks can solve the above issue but cause significant performance issues as the threads are forced to wait for each other.

 
Performance issues inside stl

Lock Convoying in TBB

Put another picture here


Study Ref: https://software.intel.com/en-us/blogs/2008/10/20/tbb-containers-vs-stl-performance-in-the-multi-core-age This leads into Concurrent_vector growing below..

Efficiency Comparison Parallel for and concurrent_vector

If only you knew the power of the Building Blocks

Concept: Fine-grained locking

Multiple threads operate on the container by locking only those portions they really need to lock.

Concept: Lock-free algorithms

Bits of knowledge:

STL interfaces are inherently not thread-safe.

Threading Building Blocks containers are not templated with an allocator argument.

Links

http://www.cs.northwestern.edu/~riesbeck/programming/c++/stl-summary.html

http://www.cplusplus.com/reference/stl/

https://www.inf.ed.ac.uk/teaching/courses/ppls/TBBtutorial.pdf

Code Implementation Comparison for STL and TBB

Parallel Algorithms support

C++11 STL does not have much to offer for parallel algorithms natively unlike TBB

Though there are new algorithms for parallelism in the C++17 standard

More: http://www.bfilipek.com/2017/08/cpp17-details-parallel.html

Resource management

Overall partitioning, thread creation, and management is hidden in TBB

Thread Creation, terminating, and synchronizing is handled by TBB

The thread creation and overall resource for STL is managed by a combination of libraries;

-thread for creation and joining

-mutex for mutual exclusion and locks

-future for accessing results of asynchronous operations

Details: http://en.cppreference.com/w/cpp/thread

Implementation Safety

TBB specifically makes concurrent vector not to support insert and erase operations. Only new items can only be pushed back, and cannot be shrunk;

-This prevents devs to write bad code

-Allowing this could cause a mistake on implementing parallel code. Which will cause a big performance hit, burdening both iterating and growing operations. Which will make the concurrent containers in TBB unless due to this risk.

Note for TBB (to trash less on STL)

-Some operations in TBB containers are not thread safe, like reserve() and clear() in concurrent vector.

-Complexity on concurrent containers is higher compared to STL. For layouts and implementation. This due to the nature of running in parallel and how TBB implements it.