Codecademy Logo

CPU Cache Memory

Memory Hierarchy

A memory hierarchy organizes different forms of computer memory based on performance. Memory performance decreases and capacity increases for each level down the hierarchy.

Cache memory is placed in the middle of the hierarchy to bridge the processor-memory performance gap.

Memory hierarchy pyramid with the processor at the top, main memory at the bottom, and cache in between.

Cache Memory

Cache is memory placed in between the processor and main memory. Cache is responsible for holding copies of main memory data for faster retrieval by the processor.

Cache memory consists of a collection of blocks. Each block can hold an entry from the main memory. Each entry has the following information:

  • A tag that corresponds to the main memory location
  • The data from the main memory location
Cache memory table with 4 blocks. Two of the blocks have entries.

Cache Hit

A cache hit is when a computer processor finds the data it needs inside cache memory.

When a program requests data from memory, the processor will first look in the cache. If the memory location matches one of the tags in a cache entry the result is a cache hit and the data is retrieved from the cache.

Cache hits improve performance by retrieving data from a smaller and faster memory source.

Animation of a processor checking the cache for a memory location. A cache hit occurs and data is returned from the cache.

Cache Miss

A cache miss is when a computer processor does not find the data it needs in cache memory and must request it from the main memory. The main memory places the memory location and data in as an entry in the cache. The data is then retrieved by the processor from the cache.

Animation of a processor checking the cache for a memory location. A cache miss occurs so the processor requests the data from the main memory. The data is placed in the cache and then retrieved by the processor.

Replacement Policy

A replacement policy defines how data is replaced within the cache. Examples of replacement policies are:

  • Random: The data is replaced randomly. This policy is the easiest to implement within the architecture but the resulting performance increase may be small.
  • Least Recently Used (LRU): The data that has not been accessed for the longest period of time is replaced. This can provide a higher performance increase as the data that is used often stays in the cache. Implementing this policy in the architecture is difficult and may not be worth the cost.
  • First In First Out (FIFO): The data is replaced in the order it was placed in the cache. This can provide a moderate performance increase and is not as difficult to implement in the architecture as LRU. The FIFO policy requires a counter to keep track of which entry is the oldest and next to be replaced.


Associativity, or placement policy, is the process of mapping locations in main memory to specified blocks in the cache.

  • A fully associative cache maps each memory location to any location in the cache.
  • A direct-mapped cache maps each memory location to one location in the cache. This associativity does not require a replacement policy since there is only one cache entry for each location in memory.
  • A set-associative cache maps each memory location to a specified number of locations in cache. A 2-way set-associative cache has 2 blocks per set. A cache with 4 blocks that is 2-way set associative has 2 sets. Each main memory location maps to a set based on the location address.

Write Policy

A cache write policy defines how data is written to the main memory once it is written to the cache.

  • The write-through policy writes data to the cache and the main memory at the same time. This policy is easy to implement in the architecture but is not as efficient since every write to cache is a write to the slower main memory.
  • The write-back policy writes data to the cache but only writes the data to the main memory when the data is about to be replaced in the cache. This policy is more difficult to implement but is more efficient since data is only written to the main memory only when it absolutely needs to be.