Fast Communication: The Ring Bus
To manage cache coherence, communication between the different processors, and access to fixed-function units like the texture units, Intel has designed a fairly classic ring bus. This type of topology has become rather familiar recently, appearing in the Cell processor and in certain AMD GPUs (X1800, X1900, etc.), for example, since it greatly simplifies the interconnect as the amount of data traveling over the bus increases.
Intel has given Larrabee two 512-bit buses, one in each direction, to limit communication latency. Even so, latency still reaches problematic levels once the number of cores grows beyond a certain point, so Larrabee implementations with more than 16 cores use several shorter ring buses (probably serving only eight cores each).
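To get a feel for why two directions help and why long rings still hurt, here is a minimal sketch that counts hops between stops on a ring. It is a toy model, not Intel’s design: it assumes one stop per core, a cost of one hop per step, no contention, and it ignores the other agents on the ring. The function names `ring_hops` and `average_hops` are made up for this illustration.

```python
def ring_hops(src, dst, n, bidirectional=True):
    """Hops needed to go from stop src to stop dst on a ring of n stops."""
    forward = (dst - src) % n  # distance when traveling in one fixed direction
    if not bidirectional:
        return forward
    # With one bus per direction, traffic can take the shorter way around.
    return min(forward, n - forward)


def average_hops(n, bidirectional=True):
    """Average hop count from stop 0 to every other stop (toy metric)."""
    total = sum(ring_hops(0, dst, n, bidirectional) for dst in range(1, n))
    return total / (n - 1)


if __name__ == "__main__":
    for stops in (8, 16, 32):
        one_way = average_hops(stops, bidirectional=False)
        two_way = average_hops(stops, bidirectional=True)
        print(f"{stops:2d} stops: one-way {one_way:.2f} hops, two-way {two_way:.2f} hops")
```

Under these assumptions, a 16-stop ring averages 8 hops with a single direction and about 4.3 with two, while an 8-stop ring averages about 2.3. The average keeps growing with the number of stops either way, which is consistent with the move to several shorter rings on larger chips.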
We might point out in passing that Intel’s diagram of Larrabee is not quite exact. To avoid needless complication, Intel has drawn the memory controllers on either side of the chip and all the texture units on the left. In practice, the texture units and memory controllers will be distributed around the periphery of the ring, rather than all sitting in the same place. The reason is to avoid the congestion that a configuration like the one shown in the diagram would cause.
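The congestion argument can be illustrated with a rough sketch that counts how many requests cross each ring segment when shared units are clustered versus spread out. Everything here is an assumption made for illustration: a 16-stop ring, four shared units, one request from every other stop to its nearest unit along the shortest path, and the hypothetical helper name `link_loads`. Intel has not published the actual placement or traffic patterns.

```python
def link_loads(n, units):
    """Requests crossing each ring segment in a toy traffic model.

    n     -- number of stops on the ring
    units -- tuple of stop indices holding shared units (hypothetical placement)
    Every stop that is not a unit sends one request to its nearest unit along
    the shortest path. loads[i] counts crossings of the segment between stop i
    and stop (i + 1) % n, with both directions lumped together.
    """
    loads = [0] * n
    for src in range(n):
        if src in units:
            continue
        # Nearest unit by shortest ring distance; ties go to the first listed.
        target = min(units, key=lambda u: min((u - src) % n, (src - u) % n))
        forward = (target - src) % n
        if forward <= n - forward:
            # Travel in the increasing direction, crossing segments src, src+1, ...
            for step in range(forward):
                loads[(src + step) % n] += 1
        else:
            # Travel in the decreasing direction, crossing segments src-1, src-2, ...
            for step in range(n - forward):
                loads[(src - 1 - step) % n] += 1
    return loads


if __name__ == "__main__":
    clustered = link_loads(16, (0, 1, 2, 3))
    distributed = link_loads(16, (0, 4, 8, 12))
    print("clustered   max load:", max(clustered))
    print("distributed max load:", max(distributed))
```

In this toy model the busiest segment carries 6 requests when the four units sit side by side, but only 2 when they are spaced evenly around the ring. The real chip has different traffic, but the direction of the effect matches the reasoning above.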