===Thread Local Variables===
Padding your data onto separate cache lines works, but it is not an ideal solution to the false sharing problem, for two reasons. First, it wastes memory. Second, it does not carry over well between machines, because you will not always know the L1 cache line size of the hardware the code runs on. Using a variable local to each thread, instead of contiguous array locations, reduces the number of times a thread writes to a cache line that also holds data belonging to other threads. The benefit of this approach is that multiple threads no longer write to the same cache line, which would otherwise keep invalidating each other's cached copies and create a bottleneck.
<source lang="cpp">
#include <iostream>
}
</source>
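As a rough illustration of the thread-local idea described above, the following is a minimal sketch and not the code used for the measurements on this page. It assumes plain <code>std::thread</code> workers; the function name <code>sum_with_thread_locals</code> and its parameters are hypothetical, chosen only for this example. Each thread accumulates into a variable on its own stack and writes to the shared array exactly once, so the neighbouring slots of that array are not touched repeatedly inside the hot loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper for illustration only: sums `data` using
// `num_threads` worker threads.
long long sum_with_thread_locals(const std::vector<int>& data,
                                 unsigned num_threads) {
    assert(num_threads > 0);

    // One result slot per thread. These slots are contiguous, so they may
    // share a cache line; that is why each thread writes its slot only once.
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;

    // Size of each thread's slice, rounded up so every element is covered.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&data, &partial, chunk, t] {
            const std::size_t begin =
                std::min(data.size(), static_cast<std::size_t>(t) * chunk);
            const std::size_t end = std::min(data.size(), begin + chunk);

            // Thread-local accumulator: lives on this thread's stack, so
            // updating it does not invalidate other threads' cache lines.
            long long local = 0;
            for (std::size_t i = begin; i < end; ++i) {
                local += data[i];
            }

            // Single write to shared memory per thread.
            partial[t] = local;
        });
    }

    for (auto& w : workers) {
        w.join();
    }

    // Combine the per-thread results on the calling thread.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

For example, calling <code>sum_with_thread_locals(data, 4)</code> on a vector of 1000 ones returns 1000. The contrast to keep in mind is a loop that does <code>partial[t] += data[i]</code> directly: that version writes to a shared cache line on every iteration, which is the pattern the thread-local variable avoids.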
[[File:ExecutionSpeedupLocalSpeedupTl.png|800px|center|frame]]
= Intel VTune Amplifier =