Larrabee: Intel's New GPU

In Detail: The Scalar Unit And SMT

Now, let’s look at the cores in detail. As we said, they’re based on the Pentium’s design, though Intel has made some significant modifications. The legacy of the P54C is undeniable in the scalar unit, which uses the Pentium’s superscalar execution pipeline with its two pipes, U and V.

The first is capable of executing all scalar x86 instructions, while the second is limited to a fairly complete subset (excluding, for example, complex arithmetic and logical instructions like multiplication and division). Intel’s changes to the Pentium core start with the instruction set: the engineers added 64-bit support, along with several instructions for controlling the level two cache memory. These instructions are especially important for streaming-type applications, which don’t follow the principle of temporal locality found in traditional applications. That is, once an operation has been executed on a piece of data, that data will not be used again within a short period of time.

This behavior tends to prove disastrous with the LRU replacement algorithm that cache memories use: the cache spends its time discarding important data in order to hold data that will be used only once. Aware of this problem, Larrabee’s engineers added instructions for marking cache lines as low priority, indicating that the data in them can be replaced as soon as it has been accessed. In this way, Intel has combined the best of both worlds: scratchpad-type (buffer memory) operation and the transparency of a standard cache memory, with a mechanism for coherence among the caches of the different cores.
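The effect of such a hint can be illustrated with a small simulation. This is only a sketch under stated assumptions: a tiny, fully associative cache with pure LRU replacement, where the hypothetical `low_priority` flag stands in for Larrabee's cache-control instructions. It does not model the real instruction set, the cache geometry, or coherence.

```python
from collections import OrderedDict


class HintedLRUCache:
    """Toy fully associative cache with LRU replacement and a low-priority hint.

    Illustrative only: the class and the `low_priority` flag are hypothetical
    and do not correspond to real Larrabee instructions.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # next victim first, most recently used last
        self.hits = 0
        self.misses = 0

    def access(self, address, low_priority=False):
        if address in self.lines:
            self.hits += 1
            del self.lines[address]
        else:
            self.misses += 1
            if len(self.lines) >= self.capacity:
                self.lines.popitem(last=False)  # evict the line at the victim end
        self.lines[address] = True              # insert as most recently used
        if low_priority:
            # Hinted line: park it at the victim end so it is replaced first.
            self.lines.move_to_end(address, last=False)


def run(use_hint):
    """Mix a small reused working set with streamed, use-once data."""
    cache = HintedLRUCache(capacity=4)
    working_set = ["a", "b", "c"]
    for address in working_set:      # warm up the reusable data
        cache.access(address)
    cache.hits = cache.misses = 0    # count only the steady-state behavior
    for step in range(100):
        for address in working_set:  # data that is reused every iteration
            cache.access(address)
        for i in range(4):           # streamed data, touched exactly once
            cache.access(("stream", step, i), low_priority=use_hint)
    return cache.hits, cache.misses


if __name__ == "__main__":
    print("plain LRU  (hits, misses):", run(use_hint=False))
    print("with hints (hits, misses):", run(use_hint=True))
```

In this toy setup, plain LRU lets the streamed data push the reused working set out of the cache on every pass, whereas the hinted version keeps the working set resident and misses only on the streamed data itself.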

Another change consisted of adding Simultaneous Multithreading (SMT). This technology has just made a comeback in Intel architectures with the Core i7, and is built into the Larrabee processors, where its importance is increased by the in-order nature of their cores. Modern CPUs are capable of reordering the execution of instructions to maximize use of the execution units, which the Larrabee cores can’t do. Consequently, certain sequences of code make very little use of the available resources, but by interleaving several threads, it’s possible to increase that use at low cost. If instruction two of thread A is blocked waiting on instruction one, the core simply switches threads and executes instruction one of thread B.

The engineers have enabled execution of four threads per core, obviously with separate registers for each. Using four threads also helps hide the latency of accesses to the level one cache memory. So that sharing among threads doesn’t diminish the efficiency of the L1 instruction and data caches, their size was increased from 8 KB each on the Pentium to 32 KB each for the Larrabee cores.
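The benefit of interleaving threads on an in-order core can be sketched with a toy model. The assumptions here are ours, not Intel's: a single-issue core, round-robin selection among ready threads, and a fixed stall of three cycles after every instruction. The function name `simulate` and the stall values are hypothetical and are not Larrabee's actual latencies.

```python
def simulate(threads):
    """Return (cycles, issued) for a toy in-order, single-issue core.

    Each thread is a list of stall lengths: after issuing an instruction with
    stall s, that thread cannot issue again for s further cycles. Each cycle
    the core issues from the next ready thread in round-robin order, or idles
    if no thread is ready.
    """
    n = len(threads)
    pc = [0] * n          # index of the next instruction in each thread
    ready_at = [0] * n    # first cycle at which each thread may issue again
    total = sum(len(t) for t in threads)
    cycle = 0
    issued = 0
    last = -1             # thread that issued most recently
    while issued < total:
        for offset in range(1, n + 1):   # round-robin, starting after `last`
            t = (last + offset) % n
            if pc[t] < len(threads[t]) and ready_at[t] <= cycle:
                ready_at[t] = cycle + 1 + threads[t][pc[t]]
                pc[t] += 1
                issued += 1
                last = t
                break
        cycle += 1                       # an idle cycle if nothing was issued
    return cycle, issued


if __name__ == "__main__":
    one_thread = [[3] * 10]              # 10 instructions, each stalls 3 cycles
    four_threads = [[3] * 10 for _ in range(4)]
    for name, workload in (("1 thread", one_thread), ("4 threads", four_threads)):
        cycles, issued = simulate(workload)
        print(f"{name}: {issued} instructions in {cycles} cycles")
```

With a single thread the core idles during every stall, while with four threads another thread is always ready to issue, so the pipeline stays busy on nearly every cycle in this model.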

95 comments
  • very interesting, i know nvidia cant settle for being the second best. As always its good for the consumer.
    0
  • Yes interesting, but intel already makes like 50% of every gpu i rather not see them take more market share and push nvidia and amd out although i doubt it unless they can make a real performer, which i have no doubt on paper they can but with drivers etc i doubt it.
    6
  • I wonder if their aim is to compete to appeal to the gamer market to run high end games?
    0
  • Very interesting, finally some more information about Intel upcoming "GPU".
    But as I sad before here if the drivers aren't good, even the best hardware design is for nothing. I hope Intel invests more on to the software side of things and will be nice to have a third player.
    0
  • cool ill wait for windows 7 for my next build and hope to see some directx 11 and openGL3 support by then.
    0
  • Maybe there is more than a little commonality with the Atom CPUs: in-order execution, hyper threading, low power/small foot print.

    Does the duo-core NV330 have the same sort of ring architecture?
    0
  • "Simultaneous Multithreading (SMT). This technology has just made a comeback in Intel architectures with the Core i7, and is built into the Larrabee processors."

    just thought i'd point out that with the current amd vs intel fight..if intel takes away the x86 licence amd will take its multithreading and ht tech back leaving intel without a cpu and a useless gpu
    -10
  • Driver. If Intel made driver as bad as Intel Extreme than event if Intel can make faster and cheaper GPU it will be useless.
    2
  • Hope for an Omega Drivers equivalent lol?
    3
  • Damn, hoped there would be some pictures :(. Looks interesting, I didn't read the full article but I hope it is cheaper so some of my friends with reg desktps can join in some Orginal Hardcore PC Gaming XD.
    1
  • I was quite suprised by the quality of this article and am quite eager to see the follow up.
    9
  • Well I am looking forward to Larrabee but I'll keep my optimisim under wraps until I start seeing some screenshots of Larabee in action playing real games i.e. not Intel demo's.

    I wonder just how compatible larrabee is going to be with older games?
    1
  • Great article! Keep ones like this coming!
    3
  • IzzyCraft wrote: Hope for an Omega Drivers equivalent lol?

    That would be FANTASTIC! Maybe the same people who make the Omega drivers could make alternate Larrabee drivers? We all know Intel sucks balls at drivers.
    -2
  • So this is Intel's approach to a GPU... we put lots of simple x86 cores in it , add SMT and vector operations and hope that they would do the job of a GPU. IMHO Larrabee will be a complete failure as GPU but as an x86 CPU that is highly parallel this thing could screw AMD's FireStream and NVIDIA's CUDA (OPENCL too) beacause it's x86 and the programming is pretty popular for this kind of architecture.
    7
  • IzzyCraft wrote: Yes interesting, but intel already makes like 50% of every gpu i rather not see them take more market share and push nvidia and amd out although i doubt it unless they can make a real performer, which i have no doubt on paper they can but with drivers etc i doubt it.

    Yeah but that 50% includes all the integrated cards that no consumer even realizes they're buying most of the time.. but not in discrete cards. I'd like to see a bit more competition on the discrete side.
    0
  • wtfnl wrote: "Simultaneous Multithreading (SMT). This technology has just made a comeback in Intel architectures with the Core i7, and is built into the Larrabee processors." just thought i'd point out that with the current amd vs intel fight..if intel takes away the x86 licence amd will take its multithreading and ht tech back leaving intel without a cpu and a useless gpu

    Umm, what makes you think that AMD pioneered multi-threading? And Intel doesnt use HyperTransport, so they cant take it away.
    2
  • Now we know what they're trying to do with it. There's still no indication if it will work or not.

    I really don't see the 1st gen. being successful-it's not like AMD and nVidia are goofing around waiting for Intel to join up and show them a real GPU. Although there's no numbers on this that I've seen, I'm thinking Larry's going to have a pretty big die size to fit all those mini-cores so it better perform, because it will cost a decent sum.
    1
  • I would mention ... "but will it play crysis" but I am not sure how funny that is anymore.
    8
  • Can't wait for Larrabee; hopefully a single Larrabee can have the performance of 295. Nvidia and ATI are slacking as they know they can price fixing and stop coming out with better GPU, just more cards with the same old GPU.
    -4