Recompile code developed for the Itanium processor before using it on the Itanium 2 processor to take advantage of the advanced microarchitectural features of the platform. The Intel Itanium 2 processor still disperses a maximum of six instructions per clock cycle, but the compiler has greater choice in matching instructions to the execution units. The following figure is a matrix comparing the possible full-dispersal bundle-pair configurations supported by the Itanium processor and the Intel Itanium 2 processor:
The templates down the left column represent the first bundle in the dispersal window, and those across the top represent the second bundle. Bundle configurations dispersed by either the Itanium processor or the Intel Itanium 2 processor are highlighted in orange; those that are only permitted for the Intel Itanium 2 processor are in green. The availability of six memory-integer execution resources more than doubles the bundle combinations that represent full dispersal for the Intel Itanium 2 processor. As a result, code generated for the Intel Itanium 2 processor has fewer nops and fewer split dispersals, and it performs better.
The Itanium processor does not support full dispersal of the MII-MII code sequence that follows:
The second bundle is compiled with an alternate template and filled with nops, so only four instructions issue during a clock cycle. The dispersal matrix shows that the Intel Itanium 2 processor supports full dispersal for this bundle-pair combination. The following figure illustrates how the Intel Itanium 2 processor disperses the individual instructions for execution in the same clock cycle:
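The difference between four and six issued instructions can be illustrated with a toy slot-counting model. This is only a sketch under stated assumptions, not the processors' actual dispersal logic: the function name dispersed_count, the port-count tables OLDER and NEWER, and the i_may_use_m flag are all hypothetical names introduced here. The model hands out ports in program order and stops at the first instruction that cannot get one; the flag stands in for the extra memory-integer resources by letting an I-slot instruction fall back to a free M port. It ignores stops, template-specific rules, and per-port instruction restrictions.

```python
# Toy model: in-order port allocation over a two-bundle dispersal window.
# The port counts below are illustrative assumptions for this sketch.
OLDER = {"M": 2, "I": 2, "F": 2, "B": 3}   # assumed Itanium-like resources
NEWER = {"M": 4, "I": 2, "F": 2, "B": 3}   # assumed Itanium 2-like resources


def dispersed_count(templates, ports, i_may_use_m=False):
    """Count how many slots of the bundle pair receive a port.

    templates   -- pair of template strings, for example ("MII", "MII")
    ports       -- mapping from slot type to number of available ports
    i_may_use_m -- assumption: an I-slot instruction may take a spare M port
    """
    free = dict(ports)          # copy so the caller's table is not modified
    issued = 0
    for slot in "".join(templates):
        if free.get(slot, 0) > 0:
            free[slot] -= 1
        elif slot == "I" and i_may_use_m and free.get("M", 0) > 0:
            free["M"] -= 1      # fall back to a memory-integer resource
        else:
            break               # dispersal splits at the first blocked slot
        issued += 1
    return issued


print(dispersed_count(("MII", "MII"), OLDER))                    # 4
print(dispersed_count(("MII", "MII"), NEWER, i_may_use_m=True))  # 6
```

Under these assumptions the MII-MII pair runs out of I ports after four instructions on the older configuration, while the newer configuration places all six.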
Code compiled for the original Itanium processor executes on later family members without recompiling. However, recompiling makes better use of the additional microarchitectural resources and therefore yields higher performance.