[nextpage title=”Introduction”]

NVIDIA has just released its new GeForce 8 series, formerly known by the codename G80. This series uses a unified shader engine, an architecture completely different from that of all previous graphics chips. In this article we will explain in detail what is new in this graphics chip series.

GeForce 8 is the first series to support the forthcoming DirectX 10 (Shader Model 4.0), so let’s first talk about what is new in DirectX 10.

One of the new features in DirectX 10 is geometry processing. Until now the system CPU was in charge of this task; with DirectX 10 it can be done by the graphics chip. Geometry processing smooths curved surfaces, which improves the quality of character animation, facial expressions and hair.

DirectX 10 makes more resources available to the GPU, improving 3D performance. The table below shows the key differences in GPU resources between DirectX 9 and DirectX 10.

| Resources | DirectX 9 | DirectX 10 |
|---|---|---|
| Temporary Registers | 32 | 4,096 |
| Constant Registers | 256 | 16 x 4,096 |
| Textures | 16 | 128 |
| Render Targets | 4 | 8 |
| Maximum Texture Size | 4,096 x 4,096 | 8,192 x 8,192 |

The table below compares shader models 1.x (DirectX 8.1), 2.0 (DirectX 9.0), 3.0 (DirectX 9.0c) and 4.0 (DirectX 10).

|  | Shader 1.x | Shader 2.0 | Shader 3.0 | Shader 4.0 |
|---|---|---|---|---|
| Vertex Instructions | 128 | 256 | 512 | 65,536 * |
| Pixel Instructions | 4+8 | 32+64 | 512 | 65,536 * |
| Vertex Constants | 96 | 256 | 256 | 16 x 4,096 * |
| Pixel Constants | 8 | 32 | 224 | 16 x 4,096 * |
| Vertex Temps | 16 | 16 | 16 | 4,096 * |
| Pixel Temps | 2 | 12 | 32 | 4,096 * |
| Vertex Inputs | 16 | 16 | 16 | 16 |
| Pixel Inputs | 4+2 | 8+2 | 10 | 32 |
| Render Targets | 1 | 4 | 4 | 8 |
| Vertex Textures |  |  | 4 | 128 * |
| Pixel Textures | 8 | 16 | 16 | 128 * |
| 2D Texture Size |  |  | 2,048 x 2,048 | 8,192 x 8,192 |
| Int Ops |  |  |  | Yes |
| Load Ops |  |  |  | Yes |
| Derivatives |  |  | Yes | Yes |
| Vertex Flow Control |  | Static | Static/Dynamic | Dynamic * |
| Pixel Flow Control |  |  | Static/Dynamic | Dynamic * |

* Because DirectX 10 implements a unified architecture, as we will discuss on the next page, this number is the total for the whole unified architecture and not for this individual spec.
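
To give a feel for what the larger 2D texture size means in practice, here is a rough sketch of the memory one texture of each size would occupy. It assumes an uncompressed format with 4 bytes per texel (8-bit RGBA) and counts only the top mip level; both are our assumptions for illustration, not figures from DirectX, and the function name is made up for this example.

```python
BYTES_PER_TEXEL = 4  # assumption: uncompressed 8-bit RGBA, not a DirectX-mandated format


def texture_mib(side: int, bytes_per_texel: int = BYTES_PER_TEXEL) -> float:
    """Memory for one square 2D texture (top mip level only), in MiB."""
    return side * side * bytes_per_texel / 2**20


# Sizes taken from the "2D Texture Size" row of the shader model table.
for side in (2048, 8192):
    print(f"{side} x {side}: {texture_mib(side):.0f} MiB")
```

Under these assumptions a 2,048 x 2,048 texture takes 16 MiB and an 8,192 x 8,192 texture takes 256 MiB, since quadrupling each side multiplies the texel count by 16.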

Besides the increase in resource capacity, DirectX 10 brings several other new features. In summary, the goal of this new programming model is to reduce the role of the system CPU in 3D graphics performance, i.e., to avoid using the system CPU as much as possible.

Of course we will have to wait until DirectX 10 games are released to see these new features in action. Upcoming DirectX 10 games include Crysis and Hellgate: London.

But the main difference between the GeForce 8 series and all other graphics chips available today is its unified shader engine, another feature introduced by the Shader Model 4.0 programming model. Let’s talk more about it.

[nextpage title=”Architecture”]

Instead of having separate shader engines for each task to be performed (for instance, separate pixel shader and vertex shader units), the GPU now has just one big engine that can be programmed on the fly according to the task at hand: pixel shading, vertex shading, geometry or physics. This new architecture also makes it easier to add new shader types in the future.

The reason behind the unified architecture is that in some situations the GPU would be using all shader engines of one type (pixel shader engines, for example), and even queuing tasks for them, while engines of another type (vertex shader engines, for example) sat idle. The idle engines could not be used for a different task, since they were dedicated to a specific type of processing.

So Shader Model 4.0 allows any shader engine to be used by any shader process: pixel, vertex, geometry or physics.
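
The idle-engine problem can be illustrated with a toy model. The sketch below is not how any real GPU schedules work: the task counts, the unit counts and both function names are hypothetical, every task is assumed to take one time step, and dependencies between vertex and pixel work are ignored. It only shows why pooling the same number of units can finish an unbalanced workload sooner.

```python
import math


def frame_time_dedicated(vertex_work, pixel_work, vertex_units, pixel_units):
    """Time steps needed when each unit type can only run its own kind of task."""
    vertex_time = math.ceil(vertex_work / vertex_units)
    pixel_time = math.ceil(pixel_work / pixel_units)
    # Both pools run in parallel; the frame ends when the slower pool finishes.
    return max(vertex_time, pixel_time)


def frame_time_unified(vertex_work, pixel_work, total_units):
    """Time steps needed when any unit can run any kind of task."""
    return math.ceil((vertex_work + pixel_work) / total_units)


# Hypothetical pixel-heavy frame: 80 vertex tasks and 960 pixel tasks.
# Hypothetical chips: 8 vertex + 24 pixel units versus 32 unified units.
dedicated = frame_time_dedicated(80, 960, vertex_units=8, pixel_units=24)
unified = frame_time_unified(80, 960, total_units=32)
print(dedicated, unified)  # prints: 40 33
```

In the dedicated case the eight vertex units finish after 10 steps and then sit idle for the remaining 30, while in the unified case they help with the pixel work.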

Figure 1 shows the block diagram of the GeForce 8800 GTX, the highest-end model in the new GeForce 8 series. As you can see, this GPU has eight shader units and none of them is dedicated to a specific task. Each shader unit has 16 streaming processors (the green boxes labeled SP), eight texture filtering units (the blue boxes labeled TF), four texture address units (not drawn in Figure 1) and one L1 memory cache (the orange box). The GPU also has six memory interface buses, each 64 bits wide and with its own L2 memory cache. The streaming processors run at a different (higher) clock rate than the rest of the chip.

Figure 1: GeForce 8800 GTX block diagram.

So the GeForce 8800 GTX has 128 shader engines (i.e., streaming processors; 16 streaming processors x 8 units) and a 384-bit memory interface (64 bits x 6).

The GeForce 8800 GTS, the other GeForce 8 model released, has six shader units and five 64-bit memory interface buses, so it has 96 shader engines (16 streaming processors x 6 units) and a 320-bit memory interface (64 bits x 5).
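
The arithmetic for both chips can be written down as a short sketch. The per-unit figures come from the text above; the `GpuLayout` class and its field names are ours, invented for this example.

```python
from dataclasses import dataclass

SP_PER_SHADER_UNIT = 16    # streaming processors per shader unit (from the text)
BITS_PER_MEMORY_BUS = 64   # width of each memory interface bus (from the text)


@dataclass(frozen=True)
class GpuLayout:
    """Hypothetical helper type: counts of the building blocks in one chip."""
    name: str
    shader_units: int
    memory_buses: int

    @property
    def streaming_processors(self) -> int:
        return self.shader_units * SP_PER_SHADER_UNIT

    @property
    def memory_interface_bits(self) -> int:
        return self.memory_buses * BITS_PER_MEMORY_BUS


gtx = GpuLayout("GeForce 8800 GTX", shader_units=8, memory_buses=6)
gts = GpuLayout("GeForce 8800 GTS", shader_units=6, memory_buses=5)

for gpu in (gtx, gts):
    print(f"{gpu.name}: {gpu.streaming_processors} SPs, "
          f"{gpu.memory_interface_bits}-bit memory interface")
```

This prints 128 SPs and a 384-bit interface for the GTX, and 96 SPs and a 320-bit interface for the GTS.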

We will discuss the technical specs of these two chips on the next page.

Other features found on GeForce 8 series include:

  • Support for thousands of independent simultaneous threads, a technology NVIDIA calls GigaThread.
  • 16x anti-aliasing and 128-bit floating-point precision HDR (High Dynamic Range); GeForce 6 and 7 use a 64-bit precision engine. NVIDIA calls these two technologies “Lumenex”.
  • Physics simulation, which NVIDIA calls “Quantum Effects”. It improves the realism of effects like smoke, fire and explosions.
  • Hardware-based features to enhance 2D high-definition video quality, which NVIDIA calls “PureVideo HD”. Two new features are provided on the GeForce 8 series: HD noise reduction and HD edge enhancement. The GeForce 8 series also supports HDCP decryption, needed to play HDCP-protected HD DVD and Blu-ray discs on PCs. Of course GeForce 8 supports all the enhancements provided on previous NVIDIA chips, such as de-interlacing, high-quality scaling, inverse telecine and bad edit correction.
  • Recommended for resolutions starting at 1600 x 1200.
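
The 128-bit and 64-bit HDR figures in the list above are per-pixel sizes. A minimal sketch of where they come from, assuming four colour channels per pixel (red, green, blue, alpha) stored as 32-bit floats on GeForce 8 and as 16-bit half floats on GeForce 6 and 7; the channel layout is our assumption and the variable names are ours.

```python
import struct

# Assumption: four colour channels (red, green, blue, alpha) per pixel.
# "f" is a 32-bit float and "e" a 16-bit half float in Python's struct module.
fp32_pixel_bits = struct.calcsize("<4f") * 8
fp16_pixel_bits = struct.calcsize("<4e") * 8
print(fp32_pixel_bits, fp16_pixel_bits)  # prints: 128 64

# Round-tripping a value through each format shows the precision difference.
value = 0.1
as_half = struct.unpack("<e", struct.pack("<e", value))[0]
as_single = struct.unpack("<f", struct.pack("<f", value))[0]
print(abs(as_half - value) > abs(as_single - value))  # prints: True
```

The second part shows that the same value is stored with a larger rounding error in the 16-bit format than in the 32-bit one.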

[nextpage title=”Released Models”]

Two GeForce 8 models were released this week, GeForce 8800 GTX and GeForce 8800 GTS, both for the PCI Express x16 slot. GeForce 8800 GTX requires two auxiliary power connectors and has two SLI connectors (we wonder whether this points to future support for SLI with four video cards).

As we mentioned before, the shader engines use a different clock rate from the rest of the GPU.

We list below the main specs for these two new video cards.

|  | GeForce 8800 GTX | GeForce 8800 GTS |
|---|---|---|
| Core Clock | 575 MHz | 500 MHz |
| Streaming Processors (Shader Engines) | 128 | 96 |
| Streaming Processors Clock | 1.35 GHz | 1.2 GHz |
| Memory Clock | 1.8 GHz | 1.6 GHz |
| Memory Capacity | 768 MB | 640 MB |
| Memory Interface | 384-bit | 320-bit |
| MSRP | USD 599 | USD 499 |
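
From the memory clock and memory interface rows we can estimate the theoretical peak memory bandwidth of each card. The sketch below assumes the listed memory clock is the effective data rate (one transfer per listed clock cycle across the full bus width); that is our reading of the specs rather than something the table states, and the function name is ours.

```python
def peak_bandwidth_gb_s(effective_clock_ghz: float, bus_width_bits: int) -> float:
    """Theoretical peak: transfers per second times bytes moved per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_clock_ghz * bytes_per_transfer


# Memory clock (assumed effective) and memory interface width from the table.
gtx_bw = peak_bandwidth_gb_s(1.8, 384)
gts_bw = peak_bandwidth_gb_s(1.6, 320)
print(f"GTX: {gtx_bw:.1f} GB/s, GTS: {gts_bw:.1f} GB/s")
```

Under that assumption the GeForce 8800 GTX peaks at about 86.4 GB/s and the GeForce 8800 GTS at about 64 GB/s. Real-world throughput will be lower than these theoretical maximums.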