Last, we have Static Random Access Memory(SRAM), which is extremely fast, but even smaller and more expensive. This type of memory is what is used in the cache.
At the same time, only small amounts piece of data is needed at a given moment. Even if you brought everything in from memory, most of it will be unused. Using complex algorithms, the most relevant data can be stored ahead of time in the cache and RAM. When it needs data, the CPU can look for it looks in the cachefirst. If it is there than , it is a cache hit. If it is not there, it is a cache miss and the CPU must search main memory or even further down the hierarchy until it finds it. Minimizing the number of cache misses ensures the CPU has a steady flow of data it can quickly retrieve and compute. The cache operates on locality of reference which refers to the tendency of programs to access the same set of memory locations repeatedly over a short period of time. From here, there are two major types. Temporal locality is when one memory location is accessed, it will likely be accessed again in the near future. Spatial locality means if one memory locations is accessed, nearby memory locations will likely be needed as well. Using these principles and complex algorithms, data is brought into the cache ahead of time to minimize the number of cache misses.
=== Cache Coherence and Cache Line ===