GPU621/Analyzing False Sharing

3,282 bytes added, 12:43, 26 November 2022
no edit summary
# [mailto:yppadsala@myseneca.ca?subject=GPU621 Yash Padsala]
# [mailto:sgpatel22@myseneca.ca?subject=GPU621 Shani Patel]
 
= '''Preface''' =
In multicore concurrent programming, lock contention is a performance "killer", while false sharing is a performance "assassin". The assassin differs from the killer in that a killer can be seen, so it can be fought, fled from, detoured around, or stopped; an assassin cannot be seen, lurking in the shadows and waiting for an opportunity to strike. When we encounter lock contention that hurts concurrency performance, we can take a variety of measures, such as shortening the critical section or using atomic operations. False sharing, however, is invisible in the code we write, so we cannot find it or fix it, and therefore cannot improve the program's performance. Hidden in the dark, false sharing slows concurrency performance down significantly.
 
= '''What is False Sharing?''' =
False sharing is a sharing pattern that occurs when multiple threads share the same region of memory. When more than one thread reads or updates logically independent data that ends up on the same cache line, each core holds its own cached copy of that line. When one thread modifies its copy, the cache coherence hardware forces the other cores to reload the line from the common source, even though the data they actually use has not changed. These reloads waste system resources and can have a serious negative impact on the program's performance. False sharing is not easy to catch or stop, but there are some ways to overcome it, discussed below.
 
The foremost reason why false sharing happens can be found in how hardware reads and writes data. When a program accesses memory, the CPU does not fetch a single byte; it loads a whole fixed-size block of adjacent memory into the cache, because subsequent accesses to that block are then very fast. This block is known as a cache line, commonly 64 bytes on modern processors.
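To make this concrete, here is a minimal C++ sketch (our own illustration; the struct, names, and iteration count are assumptions, not code from the original article) in which two threads update two logically independent counters that sit side by side in memory, so they almost certainly share one cache line:

<syntaxhighlight lang="cpp">
// A minimal sketch of false sharing: two threads update logically
// independent counters that are adjacent in memory, so they almost
// certainly share one cache line (commonly 64 bytes).
#include <chrono>
#include <iostream>
#include <thread>

struct Counters {
    long a = 0;   // updated only by thread 1
    long b = 0;   // updated only by thread 2 -- same cache line as 'a'
};

int main() {
    Counters c;
    const long n = 100000000;

    auto start = std::chrono::steady_clock::now();
    std::thread t1([&] { for (long i = 0; i < n; ++i) ++c.a; });
    std::thread t2([&] { for (long i = 0; i < n; ++i) ++c.b; });
    t1.join();
    t2.join();
    auto end = std::chrono::steady_clock::now();

    std::cout << "sum = " << c.a + c.b << ", time = "
              << std::chrono::duration<double>(end - start).count()
              << " s\n";
}
</syntaxhighlight>

Even though neither thread ever reads the other's counter, every increment invalidates the other core's copy of the shared line, so the two threads run far slower than they would if the counters lived on separate lines.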
 
= '''Cache''' =
Basically, the cache is a location where data is stored close to the CPU for fast access. But the CPU draws on several kinds of memory, and it cannot access all of them at lightning speed, so let's talk about them briefly. In our PCs and laptops, hard drives and SSDs are slow but can store large amounts of data, so when the CPU has to fetch data from them the access takes a long time and costs extra computational time.
 
Moreover, DRAM is quite expensive. It is fast, but still not fast enough for the CPU, and because DRAM only holds data while powered, its contents are lost when power is cut. SRAM is much faster and smaller but more expensive, and it is the memory used for caches. So the CPU first looks for data in the cache; if the data is not there, the CPU searches main memory until it finds it, and the data is then transferred into the cache. This transfer exploits locality in two ways. The first is temporal locality: recently used data is kept in the cache for a short period because it is likely to be accessed again. The other is spatial locality: nearby memory locations are loaded along with the requested data because they are likely to be needed soon.
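Spatial locality can be seen directly in code. The following C++ sketch (our own illustration, with sizes chosen arbitrarily) sums the same matrix twice: once row by row, which walks memory in cache-line order, and once column by column, which strides across cache lines. On typical hardware the row-major loop is noticeably faster:

<syntaxhighlight lang="cpp">
// A sketch of spatial locality: C++ stores 2-D data row-major, so
// traversing by rows touches consecutive bytes that arrive together
// in one cache line, while traversing by columns strides across lines.
#include <chrono>
#include <iostream>
#include <vector>

int main() {
    const int n = 4096;
    std::vector<int> m(n * n, 1);
    long sum = 0;

    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i)          // row-major: cache friendly
        for (int j = 0; j < n; ++j)
            sum += m[i * n + j];
    auto t1 = std::chrono::steady_clock::now();
    for (int j = 0; j < n; ++j)          // column-major: cache hostile
        for (int i = 0; i < n; ++i)
            sum += m[i * n + j];
    auto t2 = std::chrono::steady_clock::now();

    std::cout << "sum = " << sum
              << "\nrow-major:    " << std::chrono::duration<double>(t1 - t0).count() << " s"
              << "\ncolumn-major: " << std::chrono::duration<double>(t2 - t1).count() << " s\n";
}
</syntaxhighlight>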
 
[[File:cache.jpg|right|400px]]<br />
= '''Solutions Of False Sharing''' =
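The most common remedy is to pad or align per-thread data so that each thread's variables occupy their own cache line. As a sketch (our own illustration, assuming a 64-byte cache line), the counter example from earlier can be fixed with alignas so the two counters no longer share a line:

<syntaxhighlight lang="cpp">
// A sketch of the padding/alignment fix for false sharing: alignas
// places each counter at the start of its own 64-byte cache line,
// so the two threads no longer invalidate each other's cached copy.
#include <iostream>
#include <thread>

struct PaddedCounters {
    alignas(64) long a = 0;   // own cache line
    alignas(64) long b = 0;   // own cache line
};

int main() {
    PaddedCounters c;
    const long n = 100000000;

    std::thread t1([&] { for (long i = 0; i < n; ++i) ++c.a; });
    std::thread t2([&] { for (long i = 0; i < n; ++i) ++c.b; });
    t1.join();
    t2.join();

    std::cout << c.a + c.b << '\n';   // same result, far fewer coherence misses
}
</syntaxhighlight>

Where the compiler supports C++17, std::hardware_destructive_interference_size offers a portable value to use in place of the hard-coded 64.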