[nextpage title=”Introduction: VLIW”]

Intel and HP have been working on a 64-bit architecture since 1994. The goal was to let CISC processors take a step big enough to overtake the RISC processors. By using a technique called VLIW, still experimental at the time, and creating the EPIC model, they proposed the Merced architecture, which was promised for the beginning of 2000. Things have changed since then: the Pentium III, the Pentium 4 and the Athlon have delivered exceptional performance, reaching over 1 GHz. Because of that, and because of the new architecture’s high price and the scarcity of 64-bit programs, the timetable slipped, and the IA-64 architecture should be released only this year.

The letters VLIW stand for Very Long Instruction Word. Processors that use this technique fetch long program words from memory, and each word packs several instructions. In the IA-64, each 128-bit pack holds three instructions. As each instruction has 41 bits (3 × 41 = 123), 5 bits are left over, and they indicate the kinds of instruction that were packed. Figure 1 shows the instruction packaging scheme. This packaging reduces the number of memory accesses and leaves to the compiler the task of grouping the instructions to get the best out of the architecture.

Figure 1: Instruction packaging used in the IA-64 architecture.
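The arithmetic of the packaging (3 × 41 + 5 = 128) can be checked with a short sketch. The function names and the bit layout used here (the 5-bit field, called `template` in this sketch, in the lowest bits, followed by the three 41-bit slots) are assumptions made for illustration, not something stated in the text; Figure 1 remains the reference.

```python
SLOT_BITS = 41          # width of one instruction
TEMPLATE_BITS = 5       # width of the field describing the instruction kinds
SLOTS_PER_BUNDLE = 3
BUNDLE_BITS = TEMPLATE_BITS + SLOTS_PER_BUNDLE * SLOT_BITS  # 5 + 3 * 41 = 128

SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template, slots):
    """Pack a 5-bit template and three 41-bit instructions into one integer."""
    if len(slots) != SLOTS_PER_BUNDLE:
        raise ValueError("a pack holds exactly three instructions")
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template must fit in 5 bits")
    bundle = template  # assumed layout: template occupies bits 0..4
    for index, slot in enumerate(slots):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("each instruction must fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit pack back into (template, [slot0, slot1, slot2])."""
    template = bundle & TEMPLATE_MASK
    slots = [
        (bundle >> (TEMPLATE_BITS + index * SLOT_BITS)) & SLOT_MASK
        for index in range(SLOTS_PER_BUNDLE)
    ]
    return template, slots
```

Packing the largest possible values fills exactly 128 bits, which confirms that no bit of the pack is wasted.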

As already mentioned, the 5-bit field, called the template, indicates the kinds of instructions that are packed. Those 5 bits allow 32 possible kinds of packaging, of which only 24 are defined, since 8 are not used. Each instruction uses one of the CPU execution units, which are listed below and can be identified in Figure 2 (on the next page):

  • I Unit: integer operations;
  • F Unit: floating-point operations;
  • M Unit: memory access; and
  • B Unit: branch operations.
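To make the role of the 5-bit field concrete, here is a minimal sketch that maps a code to the unit used by each of the three instructions. The table is only an illustrative subset, and the specific code values are an assumption recalled from Intel's architecture manuals rather than taken from this article, so they should be checked against the manual before being reused.

```python
TOTAL_CODES = 2 ** 5              # a 5-bit field gives 32 possible codes
UNUSED_CODES = 8                  # codes that are not used, as stated in the text
DEFINED_CODES = TOTAL_CODES - UNUSED_CODES   # 24 kinds of packaging

# Illustrative subset only: code -> unit letter for each of the three slots.
# The numeric codes are an assumption, not confirmed by this article.
TEMPLATES = {
    0x00: "MII",
    0x08: "MMI",
    0x0C: "MFI",
    0x0E: "MMF",
    0x10: "MIB",
    0x12: "MBB",
    0x16: "BBB",
    0x18: "MMB",
    0x1C: "MFB",
}

UNIT_NAMES = {"I": "integer", "F": "floating-point", "M": "memory", "B": "branch"}


def units_for(template):
    """Return the unit used by each slot, or None if the code is not in the subset."""
    pattern = TEMPLATES.get(template)
    if pattern is None:
        return None
    return [UNIT_NAMES[letter] for letter in pattern]
```

For example, `units_for(0x10)` describes a pack whose three instructions go to the memory, integer and branch units, in that order.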

[nextpage title=”IA-64 Architecture”]

The architecture that Intel proposes to execute those instructions, codenamed Merced and used in the Itanium processor, is versatile and promises performance through the simultaneous (parallel) execution of up to 6 instructions. Figure 2 shows the block diagram of this architecture, which uses a 10-stage pipeline.

Figure 2: Block diagram of the Itanium CPU (IA-64 architecture).

The IA-64 architecture goes by the acronym EPIC, which stands for Explicitly Parallel Instruction Computing. With this name, Intel means that the compiler is chiefly responsible for finding the parallelism present in the instructions to be executed and making it explicit. EPIC combines three concepts: speculation, predication and explicit parallelism. We will briefly study each of them next.

Instruction-level parallelism (ILP) is the ability to execute multiple instructions at the same time. As we have seen, the IA-64 architecture allows independent instructions to be packed together for parallel execution, and it can handle multiple packs in each clock cycle. Thanks to the large number of parallel resources, including many registers and multiple execution units, the compiler can manage and schedule the parallel computation. Compilers for traditional architectures are limited in how much they can speculate, because they cannot always be sure that the processor will handle the speculation correctly. The IA-64 architecture lets the compiler exploit speculative information without sacrificing the correct execution of an application.

The IA-64 architecture has mechanisms called instruction templates, branch hints and cache hints, which allow the compiler to pass to the processor information obtained at compile time. That information minimizes the penalties that come from branches and cache misses.

There are two kinds of speculation: data and control. With speculation, the compiler moves an operation earlier so that its latency (time spent) is removed from the critical path. Speculation is a way for the compiler to keep slow operations from spoiling the parallelism of the instructions. Control speculation is the execution of an operation before the branch that precedes it. Data speculation, on the other hand, is the execution of a memory load before a store operation that precedes it and that may write to the same address.
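Both kinds of speculation can be imitated in a few lines. This is a toy model in software, not how the processor implements it: memory is a dictionary, and the names `NOT_A_THING`, `run_with_data_speculation` and `run_with_control_speculation` are hypothetical. The point is only that the early load must be followed by a check, so the result stays the same as in the original order.

```python
def run_in_order(memory, store_addr, store_value, load_addr):
    """Reference order: the store executes first, then the load."""
    memory = dict(memory)
    memory[store_addr] = store_value
    return memory[load_addr]


def run_with_data_speculation(memory, store_addr, store_value, load_addr):
    """Data speculation: the load is moved above the store, then checked."""
    memory = dict(memory)
    speculative = memory[load_addr]      # load issued early
    memory[store_addr] = store_value     # the store that originally came first
    if store_addr == load_addr:          # check: did the store hit the same address?
        speculative = memory[load_addr]  # recovery: repeat the load
    return speculative


NOT_A_THING = object()  # hypothetical marker meaning "the early load failed"


def run_with_control_speculation(memory, addr, condition):
    """Control speculation: the load is moved above the branch that guards it."""
    value = memory.get(addr, NOT_A_THING)  # early load; a failure is deferred
    if condition:                          # the branch
        if value is NOT_A_THING:           # check only on the path using the value
            raise KeyError(addr)
        return value + 1
    return 0                               # path that never needed the load
```

In the data case, the speculative version returns the same value as `run_in_order` whether or not the two addresses coincide. In the control case, a failed early load causes no error when the branch ends up not needing the value.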

With predication, the instructions on all the paths of a conditional branch are marked with predicates and then sent for execution in parallel, but only the results of the necessary ones take effect. It is therefore possible to start executing the instructions even before the conditional branch has been resolved. Besides removing branches by means of predicates, the IA-64 architecture has a series of mechanisms that should reduce branch mispredictions and their cost when they happen.
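A rough way to picture predication is to compare a branching function with a version where both arms are computed and a pair of predicates decides which result is kept. The function names are hypothetical, and the `if predicate` test in the loop only stands in for the hardware discarding an instruction whose predicate is false; it is a sketch of the idea, not of the actual mechanism.

```python
def with_branch(a, b):
    """Conventional code: the branch decides which arm runs."""
    if a > b:
        return a - b
    else:
        return b - a


def with_predicates(a, b):
    """Predicated sketch: both arms are computed, predicates pick the result."""
    p_true = a > b            # the comparison sets two complementary predicates
    p_false = not p_true
    # Both arms are evaluated up front, each one tagged with its predicate.
    guarded = [(p_true, a - b), (p_false, b - a)]
    result = None
    for predicate, value in guarded:
        if predicate:         # only the instruction with a true predicate takes effect
            result = value
    return result
```

Both functions return the same value for any pair of numbers; the second one simply has no control-flow decision between the comparison and the two subtractions.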

The IA-64 architecture has a large number of registers. There are 128 integer registers, 128 floating-point registers, 64 one-bit predicate registers, and many other registers for configuring, managing and monitoring the CPU’s performance.
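These register counts help explain why each instruction needs 41 bits. The sketch below computes how many bits are needed to name one register of each kind. The instruction shape used in the last step (three register operands plus one predicate) is an assumption chosen for illustration, not a format described in this article.

```python
INSTRUCTION_BITS = 41

REGISTER_COUNTS = {"integer": 128, "floating-point": 128, "predicate": 64}


def specifier_bits(count):
    """Bits needed to select one register out of `count` registers."""
    return (count - 1).bit_length()


BITS_PER_KIND = {name: specifier_bits(n) for name, n in REGISTER_COUNTS.items()}
# integer: 7 bits, floating-point: 7 bits, predicate: 6 bits

# Assumed shape: three register operands and one predicate per instruction.
operand_bits = 3 * BITS_PER_KIND["integer"] + BITS_PER_KIND["predicate"]  # 27
opcode_bits_left = INSTRUCTION_BITS - operand_bits                        # 14
```

Under that assumption, naming the registers alone takes 27 of the 41 bits, which would not fit comfortably in a traditional 32-bit instruction.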

[nextpage title=”IA-32 Compatibility”]

Finally, Intel promises compatibility with 32-bit (IA-32) software. Such programs should run without any change, as long as the operating system and the firmware support them. It should be possible to run software in real mode (16 bits), protected mode (32 bits) and virtual 8086 mode (16 bits). This means that the CPU will be able to operate in IA-64 mode or IA-32 mode. There are special instructions to go from one mode to the other, as shown in Figure 3.

Figure 3: Instruction set transition model.

The transitions between the instruction sets are made by three instructions and by interruptions:

  • JMPE (IA-32): jumps to a 64-bit instruction and changes to IA-64 mode;
  • br.ia (IA-64): branches to a 32-bit instruction and changes to IA-32 mode;
  • Interruptions: switch to IA-64 mode, where all interruption conditions are handled; and
  • rfi (IA-64): returns from the interruption, either to IA-32 or to IA-64 code, depending on which was running when the interruption occurred.
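The transitions listed above behave like a small state machine, which can be sketched as follows. The class `ModeTracker` and its method names are hypothetical, and the list used to remember the interrupted mode is a modeling assumption; the article does not say how the processor records that information.

```python
IA32 = "IA-32"
IA64 = "IA-64"


class ModeTracker:
    """Toy model of the instruction set transitions."""

    def __init__(self, mode=IA64):
        self.mode = mode
        self.interrupted_modes = []  # assumed bookkeeping for rfi

    def jmpe(self):
        """JMPE is an IA-32 instruction that switches to IA-64 mode."""
        if self.mode != IA32:
            raise RuntimeError("JMPE is only available in IA-32 mode")
        self.mode = IA64

    def br_ia(self):
        """br.ia is an IA-64 instruction that switches to IA-32 mode."""
        if self.mode != IA64:
            raise RuntimeError("br.ia is only available in IA-64 mode")
        self.mode = IA32

    def interruption(self):
        """Any interruption is handled in IA-64 mode."""
        self.interrupted_modes.append(self.mode)
        self.mode = IA64

    def rfi(self):
        """rfi returns to whichever mode was running when interrupted."""
        if self.mode != IA64:
            raise RuntimeError("rfi is only available in IA-64 mode")
        if not self.interrupted_modes:
            raise RuntimeError("rfi without a pending interruption")
        self.mode = self.interrupted_modes.pop()
```

A typical sequence would be `br_ia()` to start IA-32 code, an `interruption()` that is serviced in IA-64 mode, and an `rfi()` that resumes the IA-32 code where it stopped.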

[nextpage title=”Conclusion”]

With this article and the one about AMD’s 64-bit technology, we have finished covering the processors for the beginning of the millennium. It is also worth mentioning that there are already computers running 64-bit versions of Windows and Linux. Now, more than performance, our biggest concern is compatibility with our present programs. We really have to verify how compatible those 64-bit architectures are with our 32- or 16-bit programs. We hope to have the answer to this question in less than a year. To close this series on 64-bit CPUs, it is very good to see the two companies competing in the high-performance processor market. This gives us access to ever cheaper and better computers.

To conclude, we would like to point out how much room there still is for electronics, and consequently computers, to evolve. More important than the creation of supercomputers, this new age will see computers become pervasive. It will be the time of invisible computers, present in nearly all modern devices. At the moment they inhabit our TV sets, microwave ovens, cars, watches, stereos, DVD players, etc. In the near future, they will invade the refrigerator, the toaster, the air conditioner and all everyday appliances. We have gone beyond the age of cheap electronics and are entering the age of cheap intelligence.