Inside Pentium M Architecture

Reservation Station and Execution Units

As we mentioned before, Pentium M uses fused micro-ops (i.e., it carries two micro-ops together) from the Decode Unit up to the dispatch ports located on the Reservation Station. The Reservation Station then dispatches each micro-op individually (i.e., unfused).

Pentium M has five dispatch ports numbered 0 through 4 located on its Reservation Station. Each port is connected to one or more execution units, as you can see in Figure 5.

Figure 5: Reservation Station and execution units.

Here is a small explanation of each execution unit found on this CPU:

  • IEU: Integer Execution Unit is where regular instructions are executed. Also known as ALU (Arithmetic and Logic Unit). “Regular” instructions are also known as “integer” instructions.
  • FPU: Floating Point Unit is where complex math instructions are executed. In the past this unit was also known as “math co-processor”.
  • SIMD: Is where SIMD instructions are executed, i.e., MMX, SSE and SSE2.
  • WIRE: Miscellaneous functions.
  • JEU: Jump Execution Unit processes branches and is also known as Branch Unit.
  • Shuffle: This unit executes a kind of SSE instruction called “shuffle”.
  • PFADD: Executes an SSE instruction called PFADD (Packed FP Add) and also COMPARE, SUBTRACT, MIN/MAX and CONVERT instructions. This unit is pipelined, so it can start executing a new micro-op at each clock cycle even if it has not finished executing the previous one. This unit has a latency of three clock cycles, i.e., it takes three clock cycles to deliver the result of each instruction.
  • Reciprocal Estimates: Executes two SSE instructions, one called RCP (Reciprocal Estimate) and another called RSQRT (Reciprocal Square Root Estimate).
  • Load: Unit that processes instructions that ask for data to be read from RAM.
  • Store Address: Unit that calculates the address for instructions that ask for data to be written to RAM. This unit is also known as AGU, Address Generation Unit. This kind of instruction uses both the Store Address and Store Data units at the same time.
  • Store Data: Unit that handles the data for instructions that ask for data to be written to RAM. This kind of instruction uses both the Store Address and Store Data units at the same time.
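
To make the difference between latency and throughput on a pipelined unit such as PFADD more concrete, here is a minimal Python sketch. It is only an illustration, not a model of the real hardware: the function names are ours, and the non-pipelined case is a hypothetical unit included just for comparison. The only figure taken from the text is the three-cycle latency.

```python
def pipelined_completion_cycles(num_uops, latency=3):
    """Completion cycle of each micro-op when one is issued per clock,
    starting at cycle 0 (the unit accepts a new micro-op every cycle)."""
    return [issue_cycle + latency for issue_cycle in range(num_uops)]


def unpipelined_completion_cycles(num_uops, latency=3):
    """Hypothetical comparison: the unit must finish one micro-op
    before it can start the next one."""
    return [(index + 1) * latency for index in range(num_uops)]


pipelined = pipelined_completion_cycles(4)
unpipelined = unpipelined_completion_cycles(4)
print(pipelined)    # [3, 4, 5, 6]
print(unpipelined)  # [3, 6, 9, 12]
```

In both cases the first result arrives after three cycles, but on the pipelined unit each following result arrives just one cycle later.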

Keep in mind that complex instructions may take several clock cycles to be processed. Take port 0 as an example, where the floating point unit (FPU) is located. While this unit is processing a very complex instruction that takes several clock cycles to be executed, port 0 won’t stall: it will keep sending simple instructions to the IEU while the FPU is busy.

So, even though the maximum dispatch rate is five microinstructions per clock cycle, the CPU can actually have up to twelve microinstructions being processed at the same time.
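
The port 0 behavior described above can be sketched as a tiny simulation. This is a simplified illustration under our own assumptions: the micro-op names and latencies are hypothetical, each unit is treated as busy until its current micro-op finishes, and the port sends at most one micro-op per clock cycle, picking the oldest one whose unit is free.

```python
def simulate_port(uops, max_cycles=100):
    """uops: list of (name, unit, latency) in program order.
    Returns a dict mapping each micro-op name to its dispatch cycle."""
    waiting = list(uops)
    busy_until = {}   # unit -> first cycle at which it is free again
    dispatched = {}
    cycle = 0
    while waiting and cycle < max_cycles:
        for uop in waiting:
            name, unit, latency = uop
            if busy_until.get(unit, 0) <= cycle:
                busy_until[unit] = cycle + latency
                dispatched[name] = cycle
                waiting.remove(uop)
                break  # at most one micro-op leaves the port per cycle
        cycle += 1
    return dispatched


# Hypothetical micro-ops: two slow FPU operations and three simple IEU ones.
schedule = simulate_port([
    ("fdiv", "FPU", 5),
    ("add1", "IEU", 1),
    ("add2", "IEU", 1),
    ("fsqrt", "FPU", 5),
    ("add3", "IEU", 1),
])
print(schedule)
# {'fdiv': 0, 'add1': 1, 'add2': 2, 'add3': 3, 'fsqrt': 5}
```

Note how add3 is dispatched before fsqrt: the FPU is still busy with fdiv, so the port keeps feeding the IEU instead of stalling.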

As we mentioned, on instructions that ask the CPU to write data to a given RAM address, the Store Address Unit and the Store Data Unit are used at the same time, one for calculating the address and the other for writing the data.
 
This ability to keep dispatching while a slow unit is busy is why ports 0 and 1 have more than one execution unit attached. If you pay attention, Intel put on the same port one fast unit together with at least one complex (and slow) unit. So, while the complex unit is busy processing data, the other unit can keep receiving microinstructions from its corresponding dispatch port. As we mentioned before, the idea is to keep all execution units busy all the time.

As we explained, after each micro-op is executed, it returns to the Reorder Buffer, where its flag is set to “executed”. Then, at the Retirement Stage, the micro-ops that have their “executed” flag on are removed from the Reorder Buffer in their original order (i.e., the order in which they were decoded) and the x86 registers are updated (the inverse of the register renaming stage). Up to three micro-ops can be removed from the Reorder Buffer per clock cycle. After this, the instruction has been fully executed.
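
The in-order retirement described above can be sketched in a few lines of Python. This is only an illustration: the data layout and names are our own assumptions, and the only figure taken from the text is the limit of three micro-ops per clock cycle.

```python
from collections import deque

RETIRE_WIDTH = 3  # from the text: up to three micro-ops retire per clock cycle


def retire_one_cycle(rob):
    """rob: deque of entries in decode order, each a dict with
    'name' and 'executed'. Removes and returns the names retired this cycle."""
    retired = []
    # Only the oldest entry may retire, and only if it has already executed.
    while rob and rob[0]["executed"] and len(retired) < RETIRE_WIDTH:
        retired.append(rob.popleft()["name"])
    return retired


rob = deque([
    {"name": "uop0", "executed": True},
    {"name": "uop1", "executed": True},
    {"name": "uop2", "executed": False},  # still in an execution unit
    {"name": "uop3", "executed": True},   # done, but must wait for uop2
])

first = retire_one_cycle(rob)
print(first)    # ['uop0', 'uop1']

rob[0]["executed"] = True  # uop2 finishes executing
second = retire_one_cycle(rob)
print(second)   # ['uop2', 'uop3']
```

Even though uop3 finished early, it cannot leave the Reorder Buffer until the older uop2 has been executed, which is what keeps the x86 registers updated in program order.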
