How a CPU Works

Branching

As we have mentioned several times, one of the main problems for the CPU is having too many cache misses: on each miss the fetch unit must access the slow RAM directly, which slows down the system.
Usually the memory cache avoids most of these accesses, but there is one typical situation where the cache controller will miss: branches. If in the middle of the program there is an instruction called JMP (“jump” or “go to”) sending the program to a completely different memory position, this new position won’t be loaded in the L2 memory cache, forcing the fetch unit to get that position directly from RAM. To solve this issue, the cache controller of modern CPUs analyzes the memory block it has loaded, and whenever it finds a JMP instruction in there, it loads the memory block for the target position into the L2 memory cache before the CPU reaches that JMP instruction.

Figure 8: Unconditional branching situation.
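
The prefetching idea described above can be sketched as a toy simulation. This is only an illustrative model, not how real hardware is built: the block size, the tuple-based instruction format, and the names PROGRAM, prefetch_jump_targets and count_misses are all assumptions made for this example. The cache is modeled as a set of block numbers, and a miss is counted whenever the CPU needs a block that is not in the set.

```python
BLOCK_SIZE = 4  # instructions per memory block (assumed value for illustration)

# Toy program: the list index is the memory address, each entry is (opcode, operand).
PROGRAM = [
    ("ADD", None),   # address 0
    ("JMP", 8),      # address 1: jump to a completely different position
    ("NOP", None),   # addresses 2-7 are never executed
    ("NOP", None),
    ("NOP", None),
    ("NOP", None),
    ("NOP", None),
    ("NOP", None),
    ("ADD", None),   # address 8: jump target
    ("ADD", None),   # address 9
    ("HALT", None),  # address 10
    ("NOP", None),   # address 11
]


def prefetch_jump_targets(cache, program, block):
    """Scan a freshly loaded block and preload the block of every JMP target."""
    start = block * BLOCK_SIZE
    for opcode, operand in program[start:start + BLOCK_SIZE]:
        if opcode == "JMP":
            cache.add(operand // BLOCK_SIZE)


def count_misses(program, prefetch):
    """Run the toy program and count how often a needed block is not cached."""
    cache = set()
    misses = 0
    pc = 0
    while True:
        block = pc // BLOCK_SIZE
        if block not in cache:
            misses += 1          # the block must be fetched from slow RAM
            cache.add(block)
            if prefetch:
                prefetch_jump_targets(cache, program, block)
        opcode, operand = program[pc]
        if opcode == "HALT":
            return misses
        pc = operand if opcode == "JMP" else pc + 1


print(count_misses(PROGRAM, prefetch=False))  # 2
print(count_misses(PROGRAM, prefetch=True))   # 1
```

In this model the first block always counts as one miss, since the cache starts empty. Without prefetching, the jump to address 8 causes a second miss; with prefetching, the target block is already in the cache when the JMP executes.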

This is pretty easy to implement. The problem arises when the program has a conditional branch, i.e., the address the program should go to depends on a condition that is not yet known. For example, if a <= b go to address 1, or if a > b go to address 2. We illustrate this example in Figure 9. This would cause a cache miss, because the values of a and b are unknown and the cache controller would be looking only for JMP-like instructions. The solution: the cache controller loads the memory blocks for both possible destinations into the memory cache. Later, when the CPU processes the branching instruction, it simply discards the one that wasn’t chosen. It is better to load the memory cache with unnecessary data than to access RAM directly.

Figure 9: Conditional branching situation.
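
The "load both destinations" strategy for conditional branches can be sketched with a similar toy model. Everything here is an assumption made for illustration: the BLE opcode (meaning "if a <= b go to the first address, otherwise go to the second"), the block size, and the names PROGRAM and run are hypothetical and do not describe any real CPU.

```python
BLOCK_SIZE = 4  # instructions per memory block (assumed value for illustration)

# Toy program: the list index is the memory address.
# Each entry is (opcode, target_if_true, target_if_false).
PROGRAM = [("NOP", None, None)] * 16
PROGRAM[0] = ("BLE", 4, 12)          # if a <= b go to address 4, else go to address 12
PROGRAM[5] = ("HALT", None, None)    # end of the "a <= b" path
PROGRAM[13] = ("HALT", None, None)   # end of the "a > b" path


def run(program, a, b, prefetch_both):
    """Run the toy program; return (miss count, sorted list of cached blocks)."""
    cache = set()
    misses = 0
    pc = 0
    while True:
        block = pc // BLOCK_SIZE
        if block not in cache:
            misses += 1              # the block must be fetched from slow RAM
            cache.add(block)
            if prefetch_both:
                # a and b are not known yet, so preload both possible destinations.
                start = block * BLOCK_SIZE
                for opcode, if_true, if_false in program[start:start + BLOCK_SIZE]:
                    if opcode == "BLE":
                        cache.add(if_true // BLOCK_SIZE)
                        cache.add(if_false // BLOCK_SIZE)
        opcode, if_true, if_false = program[pc]
        if opcode == "HALT":
            return misses, sorted(cache)
        if opcode == "BLE":
            pc = if_true if a <= b else if_false
        else:
            pc += 1


print(run(PROGRAM, a=1, b=2, prefetch_both=False))  # (2, [0, 1])
print(run(PROGRAM, a=1, b=2, prefetch_both=True))   # (1, [0, 1, 3])
print(run(PROGRAM, a=3, b=2, prefetch_both=True))   # (1, [0, 1, 3])
```

With both destinations preloaded, the branch hits the cache whichever way the comparison goes. The price is one cached block that is never used (block 3 when a <= b, block 1 when a > b), which matches the trade-off described above: some unnecessary data in the cache in exchange for avoiding a RAM access.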
