The last interesting point about this unit is the existence of a mask register, which is somewhat similar to the filter register in the VMX instruction set. This register contains 16 Boolean values, one per vector element, each indicating whether or not the corresponding result value is to be written to the destination register. The mask enables the use of a technique called predication. In the case of an if-then-else test, rather than trying to predict the result of the test in order to continue executing the program without loss of performance, both branches are executed and the mask determines, element by element, which results are kept. When the branches are relatively short, this is more efficient because it avoids the risk of wrong branch predictions.
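To make predication more concrete, here is a minimal sketch in plain Python that imitates a 16-element vector unit with a mask register. It is an illustration only, not Larrabee's actual instruction set: the function names (compare_lt, masked_write, predicated_abs_diff) and the example computation are assumptions introduced for this sketch. Both sides of the if-then-else are computed for every element, and the mask decides which result each element keeps.

```python
LANES = 16  # one mask bit per vector element, as described above


def compare_lt(a, b):
    """Build the mask: True in every lane where a[i] < b[i]."""
    return [x < y for x, y in zip(a, b)]


def masked_write(dest, result, mask):
    """Write result into dest only in lanes where the mask is True.

    Lanes whose mask bit is False keep their previous value.
    """
    return [r if m else d for d, r, m in zip(dest, result, mask)]


def predicated_abs_diff(a, b):
    """Vector form of: if a < b: out = b - a, else: out = a - b."""
    mask = compare_lt(a, b)

    # Both branches are evaluated for all lanes; no branch is predicted.
    then_result = [y - x for x, y in zip(a, b)]
    else_result = [x - y for x, y in zip(a, b)]

    out = [0] * LANES
    out = masked_write(out, then_result, mask)                    # "then" lanes
    out = masked_write(out, else_result, [not m for m in mask])   # "else" lanes
    return out


if __name__ == "__main__":
    a = list(range(LANES))
    b = [8] * LANES
    print(predicated_abs_diff(a, b))
    # prints [8, 7, 6, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6, 7]
```

The cost of this approach is visible in the sketch: the work of both branches is always paid for, which is why predication pays off mainly when the branches are short.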
This unit is clearly the most interesting aspect of Larrabee. However, one can have a few reservations about the choice of a new SIMD instruction set. Admittedly, the SSE instruction set, which is aging and was designed to have low hardware impact, wasn’t suitable. But we also know that the teams at Intel are working on a new SIMD instruction set called Advanced Vector Extensions (AVX). AVX supports three-operand instructions and MAD instructions, and it increases the size of the vectors processed to 256 bits, compared to 128 bits for SSE (and 512 bits for Larrabee).
It’s perfectly conceivable that Larrabee, due to its specific requirements, needed “exotic” instructions that have no place in a traditional CPU, and that the size of the AVX vectors was too limited, whereas 512-bit vectors would conversely have been too much of a constraint on a standard CPU. But in practice, Intel’s language is a little contradictory. On the one hand, it points out that Larrabee supports the x86 instruction set, making it compatible with an enormous quantity of software. On the other hand, to really make the most of Larrabee, a new, specific instruction set is needed, one that won’t be used in other Intel CPUs.