Hardware Secrets


How The Cache Memory Works

In this tutorial you will learn everything you need to know about cache memory in easy-to-follow language.


Introduction

Contents

  • 1. Introduction
  • 2. Dynamic RAM vs. Static RAM
  • 3. History of Memory Cache on PCs
  • 4. Meet The Memory Cache
  • 5. L2 Memory Cache on Multi-Core CPUs
  • 6. How It Works
  • 7. Memory Cache Organization
  • 8. n-Way Set Associative Cache
  • 9. Memory Cache Configuration on Current CPUs

The cache memory is high-speed memory located inside the CPU that speeds up access to data and instructions stored in RAM memory. In this tutorial we will explain how this circuit works in easy-to-follow language.
A computer is completely useless if you don’t tell the processor (i.e., the CPU) what to do. This is done through a program, which is a list of instructions for the CPU to execute.

The CPU fetches programs from the RAM memory. The problem with the RAM memory is that when its power is cut, its contents are lost, which classifies the RAM memory as a “volatile” medium. Thus, if you want to have your programs and data back after you turn off your PC, they must be stored on non-volatile media (i.e., media whose contents aren’t lost when the power is off), such as hard disk drives and optical media like CDs and DVDs.
When you double-click an icon in Windows to run a program, the program, which is usually stored on the computer’s hard disk drive, is loaded into the RAM memory. The CPU then loads the program from the RAM memory through a circuit called the memory controller, which is located inside the chipset (north bridge chip) on Intel processors or inside the CPU on AMD processors. Figure 1 summarizes this path (for AMD processors, please ignore the chipset shown in the drawing).

Figure 1: How stored data is transferred to the CPU.

The CPU can’t fetch data directly from hard disk drives because they are too slow for it, even if you consider the fastest hard disk drive available. To give you some idea of the gap, a SATA-300 hard disk drive (the fastest kind of hard disk drive available today for the regular user) has a maximum theoretical transfer rate of 300 MB/s. A CPU running internally at 2 GHz with 64-bit internal datapaths* will transfer data internally at 16 GB/s, over 50 times faster.
* Translation: the paths between the CPU’s internal circuits. This is rough math just to give you an idea, because CPUs have several different internal datapaths, each with a different width. For example, on AMD processors the datapath between the L2 memory cache and the L1 memory cache is 128 bits wide, while on current Intel CPUs this datapath is 256 bits wide. If you got confused, don’t worry. The point is only that the number quoted in the paragraph above isn’t fixed, but the CPU is always a lot faster than hard disk drives.
The difference in speed comes from the fact that hard disk drives are mechanical systems: mechanical parts have to move for the data to be retrieved, which is far slower than moving electrons around. RAM memory, on the other hand, is 100% electronic, and thus faster than hard disk drives and, ideally, as fast as the CPU.
And here is the problem: even the fastest RAM memory isn’t as fast as the CPU. DDR2-800 memories transfer data at 6,400 MB/s, or 12,800 MB/s if dual-channel mode is used. This number looks somewhat close to the 16 GB/s from the previous example, but current CPUs are capable of fetching data from the L2 memory cache over a 128- or 256-bit datapath, which means 32 GB/s or 64 GB/s if the CPU works internally at 2 GHz. Don’t worry about what “L2 memory cache” is right now; we will explain it later. All we want is for you to get the idea that the RAM memory is slower than the CPU.
By the way, transfer rates can be calculated using the following formula (in all examples so far, “data per clock” is equal to 1). With the clock rate given in Hz, the result is in bytes per second:
Transfer rate = width (number of bits) x clock rate x data per clock / 8
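As a quick check, the formula can be applied to the figures quoted so far. The sketch below is only illustrative: the function name `transfer_rate_mb_s` is ours, 1 MB is taken as 1,000,000 bytes, and the 64-bit width for a DDR2-800 module (with dual channel modeled as a 128-bit width) is an assumption inferred from the 6,400 MB/s and 12,800 MB/s figures above, treating DDR2-800 as an 800 MHz interface with one transfer per clock.

```python
def transfer_rate_mb_s(width_bits, clock_hz, data_per_clock=1):
    """Transfer rate in MB/s: width x clock rate x data per clock / 8."""
    bytes_per_second = width_bits * clock_hz * data_per_clock / 8
    return bytes_per_second / 1e6  # assumes 1 MB = 1,000,000 bytes


# Figures from the text
sata_300 = 300.0                                   # MB/s, quoted directly
cpu_internal = transfer_rate_mb_s(64, 2e9)         # 64-bit datapath at 2 GHz
ddr2_800 = transfer_rate_mb_s(64, 800e6)           # assumed 64-bit module
ddr2_800_dual = transfer_rate_mb_s(128, 800e6)     # dual channel as 128 bits
l2_128 = transfer_rate_mb_s(128, 2e9)              # 128-bit L2 datapath
l2_256 = transfer_rate_mb_s(256, 2e9)              # 256-bit L2 datapath

print(cpu_internal)                # 16000.0 MB/s, i.e., 16 GB/s
print(ddr2_800, ddr2_800_dual)     # 6400.0 12800.0
print(l2_128, l2_256)              # 32000.0 64000.0
print(cpu_internal / sata_300)     # about 53, hence "over 50 times faster"
```

Passing `data_per_clock=2` with the real 400 MHz clock of DDR2-800 gives the same 6,400 MB/s, which is why the simplified "800 MHz, one transfer per clock" view works for these estimates.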
The problem is not only the transfer rate, i.e., the transfer speed, but also latency. Latency (a.k.a. “access time”) is how long the memory takes to return the data the CPU asked for; this isn’t instantaneous. When the CPU asks for an instruction (or data) stored at a given address, the memory takes a certain time to deliver it. On current memories, a CL (CAS Latency, which is the latency we are talking about) of 5 means that the memory will deliver the requested data only after five memory clock cycles, meaning that the CPU will have to wait.
Waiting reduces CPU performance. If the CPU has to wait five memory clock cycles to receive the instruction or data it asked for, its performance will be only 1/5 of what it would get with a memory capable of delivering data immediately. In other words, when accessing a DDR2-800 memory with CL5, the CPU gets the same performance as from a memory working at 160 MHz (800 MHz / 5). In the real world the performance decrease isn’t that large, because memories work in a mode called burst mode, where from the second piece of data on, data can be delivered immediately if it is stored at a contiguous address (usually the instructions of a given program are stored at sequential addresses). This is expressed as “x-1-1-1” (e.g., “5-1-1-1” for the memory in our example), meaning that the first piece of data is delivered after five clock cycles, but each subsequent one can be delivered in just one clock cycle, provided it is stored at a contiguous address.
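The effect of latency and burst mode can be worked through with simple cycle counting. The sketch below uses the simplified model from the text (the first access costs CL cycles, each following contiguous access costs one cycle); the function names and the choice of four transfers, matching the “5-1-1-1” pattern, are our own assumptions for illustration.

```python
def cycles_without_burst(cas_latency, transfers):
    """Every transfer pays the full CAS latency."""
    return cas_latency * transfers


def cycles_with_burst(cas_latency, transfers):
    """First transfer pays the CAS latency, the rest take one cycle each."""
    if transfers == 0:
        return 0
    return cas_latency + (transfers - 1)


def effective_clock_mhz(clock_mhz, transfers, cycles):
    """Clock of an ideal zero-latency memory that would do the same work."""
    return clock_mhz * transfers / cycles


cl, transfers, clock_mhz = 5, 4, 800   # DDR2-800 with CL5, burst of 4

no_burst = cycles_without_burst(cl, transfers)   # 5+5+5+5 = 20 cycles
burst = cycles_with_burst(cl, transfers)         # 5+1+1+1 = 8 cycles

print(effective_clock_mhz(clock_mhz, transfers, no_burst))  # 160.0
print(effective_clock_mhz(clock_mhz, transfers, burst))     # 400.0
```

Under these assumptions, a four-transfer burst raises the equivalent speed from 160 MHz to 400 MHz, which is still only half of the nominal 800 MHz.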



Copyright © 2022 · All rights reserved - Hardwaresecrets.com