Penryn Core New Features
FPU Enhancements
The new Penryn core brings two enhancements to the CPU floating-point unit (FPU), one for its divider engine and another for its shuffle engine.
Fast Radix-16 Divider
This enhancement changes the way the CPU floating-point unit (FPU) handles division operations. On Core 2 CPUs, the divider processes two bits per clock cycle. The new divider circuit implemented on Penryn processes four bits per clock cycle, so it can be up to two times faster on division operations than Core 2 CPUs.
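To get a feel for what doubling the bits per clock cycle means, here is a minimal sketch that counts how many cycles the divider loop alone would need for each precision. It is only a back-of-the-envelope model: the function names are hypothetical, the significand widths are the standard ones for each format (24, 53 and 64 bits), and fixed overhead such as setup, rounding and early exit is ignored, so real instruction latencies will not match these numbers exactly.

```python
import math

# Significand widths in bits: IEEE 754 single and double precision,
# and the x87 double extended format.
SIGNIFICAND_BITS = {"SP": 24, "DP": 53, "EP": 64}


def divider_iterations(quotient_bits, bits_per_cycle):
    """Cycles needed to produce all quotient bits (divider loop only)."""
    return math.ceil(quotient_bits / bits_per_cycle)


def compare_dividers(quotient_bits):
    """Return (cycles at 2 bits/cycle, cycles at 4 bits/cycle)."""
    two_bits = divider_iterations(quotient_bits, 2)   # radix-4 style divider
    four_bits = divider_iterations(quotient_bits, 4)  # radix-16 style divider
    return two_bits, four_bits


if __name__ == "__main__":
    for name, bits in SIGNIFICAND_BITS.items():
        slow, fast = compare_dividers(bits)
        print(f"{name}: {slow} cycles at 2 bits/cycle, {fast} at 4 bits/cycle")
```

Under these assumptions the loop count is cut roughly in half for every precision, which is where the "up to two times faster" figure comes from.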
In Figure 7, you can see a comparison between the FPU of the Core 2 Duo CPU and the FPU of the new Penryn core. The “y” axis represents clock cycles, so the lower the bars, the better (less time is spent processing an instruction). On the “x” axis you can see the division instructions selected for this comparison.
Here is a small glossary for understanding Figure 7 if you are not familiar with CPU instructions:
- int = Integer
- SP = Single Precision (32-bit numbers)
- DP = Double Precision (64-bit numbers)
- EP = Double Extended Precision (80-bit numbers)
Figure 7: Performance comparison of the new divider engine used on Penryn Core.
Super Shuffle Engine
This enhancement changes the way the CPU floating-point unit (FPU) handles the shuffle operations used by SSE data formatting instructions, allowing Penryn-based CPUs to perform some instructions in fewer clock cycles than the core currently used by Core 2 Duo processors (Merom).
In Figure 8, you can see a comparison between the number of clock cycles these two cores take to perform each of these instructions. The smaller the bars, the better: fewer clock cycles means less time spent and thus higher speed.
As you can see, several 128-bit SSE instructions that used to take more than one clock cycle are now processed in just one clock cycle, improving SSE performance. SSE (Streaming SIMD Extensions) is an instruction set used mostly by multimedia applications, and only programs that use these instructions benefit from this enhancement.
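If you are not familiar with what a shuffle operation does, the sketch below models one of them in software. It assumes PSHUFD as a representative example (the text above does not say which instructions are covered by the comparison) and follows that instruction's documented behavior: each of the four 32-bit lanes of the result is picked from the source register by a 2-bit field of an 8-bit immediate. The function name and the sample values are hypothetical, and this is only a model of the data movement, not of how the hardware implements it.

```python
def pshufd(src, imm8):
    """Model PSHUFD: build four 32-bit lanes from src using 2-bit selectors.

    src is a list of four values, with lane 0 being the lowest 32 bits.
    Bits [1:0] of imm8 select the source lane for result lane 0,
    bits [3:2] for lane 1, and so on.
    """
    if len(src) != 4:
        raise ValueError("a 128-bit register holds exactly four 32-bit lanes")
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit in 8 bits")
    return [src[(imm8 >> (2 * lane)) & 0b11] for lane in range(4)]


if __name__ == "__main__":
    register = [10, 20, 30, 40]
    print(pshufd(register, 0x1B))  # selectors 3, 2, 1, 0: reverses the lanes
    print(pshufd(register, 0x00))  # selectors 0, 0, 0, 0: broadcasts lane 0
```

Data rearrangements like these are what the shuffle engine performs, and they tend to appear often in code that has to reorder data before or after SIMD arithmetic.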
Figure 8: Performance comparison of the new shuffle engine used on Penryn Core.
