[nextpage title=”Introduction: AMD x86-64 Architecture”]

We will talk about the 64-bit processors that will drive our next personal computers. The projects by Intel and AMD are innovative and, if the promises are kept, they should be on store shelves in less than a year. These new architectures promise to take parallelism even further, giving compilers mechanisms to pass to the CPU not only efficiently organized instructions but, above all, information about which instructions can be executed in parallel and how.

When we talk about 64-bit CPUs, it is important to clear up a common confusion about present processors. We should keep in mind that all the present processors, Intel or AMD, are 32-bit CPUs. The Pentium 4 and the Athlon have a 64-bit data bus, but their architecture is 32-bit. In this article, we will see what AMD is planning for its 64-bit architecture.

AMD x86-64 Architecture

We start with a question: how do we make the transition from 32-bit CPUs to 64-bit ones? AMD answers this question with an architecture that, besides the 64-bit environment, promises compatibility with all the programs developed for 16 and 32 bits. The aim is to offer users a low-cost and easy way to make this transition. With an architecture compatible with the x86 world, board manufacturers, software manufacturers and users can manage their investments more easily. The idea is to offer a secure bridge from 32 to 64 bits. 64-bit computing is aimed at applications that are very hungry for memory, such as large databases, CAD tools and simulations, which are currently limited by the 4 GB address space of 32-bit processors.
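To make the 4 GB limit concrete, here is a minimal sketch of the arithmetic behind it. The function name `address_space_bytes` is our own, and the result is only the architectural upper bound for a given address width; a real implementation may expose fewer address bits.

```python
def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits


GIB = 2 ** 30  # bytes in one gibibyte
EIB = 2 ** 60  # bytes in one exbibyte

# A 32-bit address reaches 4 GiB, the limit mentioned above.
print(address_space_bytes(32) // GIB, "GiB")

# A full 64-bit address would reach 16 EiB.
print(address_space_bytes(64) // EIB, "EiB")
```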

Not long ago, people used to say that RISC CPUs would definitely surpass the CISC architectures. However, this did not happen: present CISC processors are on the same footing as RISC ones in integer operations, and they have already greatly reduced the disadvantage they had in floating-point operations. That is why, AMD states, the next performance gains will have more to do with implementation techniques (e.g., parallelism) than with the instruction set, be it RISC, CISC-64 or VLIW. In fact, these names are somewhat abused, because present x86 CPUs have only an external CISC layer, while their core works like a RISC processor.

AMD is calling its new architecture x86-64, and it will debut with a family of processors code-named Hammer (the first project, code-named ClawHammer, was released as the Athlon 64). AMD's 64-bit strategy is to extend the present x86 CPUs to work at 64 bits by introducing the so-called Long Mode. This solution is safe because it was already used in the transition from 16 bits (8088 and 286 CPUs) to 32 bits (386 CPUs and later). For a long time now, 32-bit CPUs have operated in two modes. In real mode, they behave like the old 8088; in protected mode, they offer 32-bit features, with task and memory management. The x86-64 architecture adds a new mode, called Long Mode, which sets the CPU to operate at 64 bits. In long mode, besides the 64-bit features, the existing registers are extended to 64 bits and new registers are added. Let's study this new mode.

[nextpage title=”Operating Modes”]

Long mode is controlled by a bit called LMA (Long Mode Active). When LMA is inactive, the processor operates in the standard x86 mode and is compatible with 16- and 32-bit operating systems and applications, that is, with everything that exists nowadays. When LMA is active (long mode), the 64-bit extension is enabled, offering a new 64-bit CPU. Long mode is itself divided into two sub-modes: the 64-bit mode and the compatibility mode. These two sub-modes are selected by the D and L bits in the descriptor pointed to by the CS (Code Segment) register. The compatibility mode is interesting because it allows 16- or 32-bit applications to run, as ordinary programs, under a 64-bit operating system. It is something similar to the virtual 8086 mode of the 386 processors. Figure 1 summarizes these modes.

| | CS.L=0, CS.D=0 | CS.L=0, CS.D=1 | CS.L=1, CS.D=0 | CS.L=1, CS.D=1 |
|---|---|---|---|---|
| LMA=0 (Legacy Mode) | Standard 16-bit Mode | Standard 32-bit Mode | Standard 16-bit Mode | Standard 32-bit Mode |
| LMA=1 (Long Mode Active) | 16-bit Compatibility Mode | 32-bit Compatibility Mode | 64-bit Mode | Reserved |


Figure 1: Operation modes of the x86-64 family.
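The table in Figure 1 can be read as a small decision procedure. The sketch below is only our illustration of that table: the function name `operating_mode` is hypothetical, and the three arguments are simply the LMA, CS.L and CS.D bit values.

```python
def operating_mode(lma: int, cs_l: int, cs_d: int) -> str:
    """Return the mode named in Figure 1 for the given LMA, CS.L and CS.D bits."""
    if not lma:
        # With LMA=0, CS.L is ignored; CS.D alone selects 16 or 32 bits.
        return "Standard 32-bit Mode" if cs_d else "Standard 16-bit Mode"
    if not cs_l:
        # Long mode with CS.L=0 is the compatibility sub-mode.
        return "32-bit Compatibility Mode" if cs_d else "16-bit Compatibility Mode"
    # Long mode with CS.L=1: only CS.D=0 is defined; CS.D=1 is reserved.
    return "Reserved" if cs_d else "64-bit Mode"


# Print every combination, in the same order as the columns of Figure 1.
for lma in (0, 1):
    for cs_l in (0, 1):
        for cs_d in (0, 1):
            print(f"LMA={lma} CS.L={cs_l} CS.D={cs_d}: {operating_mode(lma, cs_l, cs_d)}")
```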

The old x86 mode (32 or 16 bits) is called Legacy Mode (LMA=0). In this mode, the x86-64 CPUs work with 16- or 32-bit data. Notice that, in this mode, the state of the CS.L bit has no meaning, that is, it is a "don't care".

In the 64-bit sub-mode of Long Mode (LMA=1, CS.L=1 and CS.D=0), the default operand size is 32 bits and the default address size is 64 bits. With instruction prefixes, the operand size can be changed to 64 or 16 bits and the address size to 32 bits.
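The defaults and overrides just described can be sketched as a tiny function. The text above does not name the prefixes; here we assume the usual x86-64 encodings, namely the REX prefix with its W bit set for 64-bit operands, the 66h prefix for 16-bit operands and the 67h prefix for 32-bit addresses. The function `effective_sizes` is our own, and it is a simplification: a few instructions (stack operations and near branches, for instance) default to 64-bit operands and are not modeled.

```python
def effective_sizes(rex_w: bool = False,
                    prefix_66: bool = False,
                    prefix_67: bool = False) -> tuple:
    """Return (operand_bits, address_bits) for an instruction in 64-bit mode."""
    if rex_w:
        operand_bits = 64  # REX.W takes precedence over the 66h prefix
    elif prefix_66:
        operand_bits = 16  # operand-size override
    else:
        operand_bits = 32  # default operand size in 64-bit mode

    # The address size defaults to 64 bits; 67h selects 32-bit addressing.
    address_bits = 32 if prefix_67 else 64
    return operand_bits, address_bits


print(effective_sizes())                 # no prefixes: the defaults
print(effective_sizes(rex_w=True))       # 64-bit operand
print(effective_sizes(prefix_66=True))   # 16-bit operand
print(effective_sizes(prefix_67=True))   # 32-bit address
```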

In the compatibility sub-mode of Long Mode (LMA=1 and CS.L=0), we have binary compatibility with applications written for 16- and 32-bit x86. This is really interesting, because an operating system in long mode can run present 16- and 32-bit programs simply by clearing the CS.L bit in the code segment descriptor of those applications. In the compatibility sub-mode, the CS.D bit continues to select between the 16- and 32-bit modes. Figure 2 shows the programming details of this new architecture.

 

| Mode | Operating System | Recompilation Needed | Address Length | Operand Size | Extended Registers | Register Size |
|---|---|---|---|---|---|---|
| Long Mode – 64-bit Mode | New 64-bit O.S. | Yes | 64 bits | 32 bits | Yes | 64 bits |
| Long Mode – Compatibility Mode | New 64-bit O.S. | No | 32 or 16 bits | 32 bits | No | 32 bits |
| Legacy Mode | 32-bit O.S. | No | 32 bits | 32 bits | No | 32 bits |
| Legacy Mode | 16-bit O.S. | No | 16 bits | 16 bits | No | 32 bits |

Figure 2: Programming characteristics of the AMD x86-64 architecture.

[nextpage title=”Long Mode Details”]

Let's now take a closer look at the long mode. As we have seen, Long Mode enables the 64-bit features while also offering the compatibility sub-mode to run 16- or 32-bit applications. This mode brings a number of new features, listed below:

  • Virtual 64-bit addressing;
  • Registers extended to 64 bits;
  • Addition of 8 registers (R8-R15);
  • Addition of 8 registers for SIMD (XMM8-XMM15);
  • 64-bit instruction pointer;
  • Flat addressing mode.

The addition of the new SIMD registers makes a total of 16 multimedia registers available. The new general-purpose registers help to reduce one of the weaknesses of the x86 architecture, which is its small number of registers.
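As a quick check of these counts, the sketch below simply enumerates the register names. The names R8-R15 and XMM8-XMM15 come from the list above; the names of the eight inherited registers in their 64-bit form (RAX and so on) and of XMM0-XMM7 are filled in by us from the usual x86 conventions.

```python
# The eight general-purpose registers inherited from x86, in their 64-bit form.
INHERITED_GPRS = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"]
# The eight general-purpose registers added in long mode.
NEW_GPRS = [f"R{n}" for n in range(8, 16)]

# SIMD registers: the eight existing ones plus the eight new ones.
INHERITED_XMM = [f"XMM{n}" for n in range(0, 8)]
NEW_XMM = [f"XMM{n}" for n in range(8, 16)]

all_gprs = INHERITED_GPRS + NEW_GPRS
all_xmm = INHERITED_XMM + NEW_XMM

print(len(all_gprs), "general-purpose registers:", ", ".join(all_gprs))
print(len(all_xmm), "SIMD registers:", ", ".join(all_xmm))
```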

To define its register scheme, AMD has simply extended the one used for the 16- and 32-bit registers. So, it is still possible to access parts of the registers inherited from the old 8086. For example, the RAX register can be accessed as a single 64-bit block, but it is also possible to access only its lower half through the EAX register. Besides that, a 16-bit portion (AX) and two 8-bit portions (AH and AL) are also accessible. AX is, of course, formed by AH (upper byte) followed by AL (lower byte). This way, full compatibility with the old x86 environments is kept. Figure 3 shows these possible fractions.

Figure 3: x86-64 architecture register fractioning scheme.
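The fractioning of RAX can be illustrated with a few masks and shifts. This is only a sketch on plain integers, not real register access, and the helper name `sub_registers` is hypothetical.

```python
def sub_registers(rax: int) -> dict:
    """Split a 64-bit RAX value into the fractions EAX, AX, AH and AL."""
    rax &= 0xFFFF_FFFF_FFFF_FFFF      # keep only 64 bits
    return {
        "RAX": rax,                   # full 64-bit register
        "EAX": rax & 0xFFFF_FFFF,     # lower 32 bits
        "AX": rax & 0xFFFF,           # lower 16 bits
        "AH": (rax >> 8) & 0xFF,      # bits 8 to 15
        "AL": rax & 0xFF,             # bits 0 to 7
    }


parts = sub_registers(0x1122334455667788)
for name, value in parts.items():
    print(f"{name} = {value:#x}")

# AX is AH and AL placed side by side.
assert parts["AX"] == (parts["AH"] << 8) | parts["AL"]
```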

Advances in integration technology and higher clock speeds should allow these CPUs to perform better even when operating in Legacy Mode. With this architecture, AMD hopes to offer an easy path from 32 to 64 bits. In the past, the evolution from 16 bits (8086 and 286) to 32 bits (386 and later) was not so easy, because of the new memory model. AMD bets that success will come not from changing the architecture completely, but from keeping compatibility.