
Intel Releases Itanium 9500 Poulson Manual

Source: CPU World

Intel has quietly published a reference manual for its upcoming Poulson Itanium processors.

First spotted by the folks over at CPU World, the document provides detailed information about the 32 nm, 3.1 billion transistor CPUs.

According to the 512-page manual, there will be four eight-core versions of Poulson: the 9520, 9540, 9550 and 9560 models. The chip will replace the Tukwila-based quad-core Itanium 9300 series, which was introduced in February 2010 and is manufactured on a 65 nm process. The die size will be 544 mm², considerably smaller than the 699 mm² of its predecessor.

Among the new features of the processor is Intel's Instruction Replay Technology, which allows the CPU to recover from pipeline errors much faster. Instead of flushing the entire pipeline, execution simply restarts at the last known correct position.

The document also provides information about dual-domain (front-end/back-end) hyper-threading, which makes its debut with Poulson. The expanded hyper-threading approach still allows two threads per physical processor. According to the manufacturer, combining it with the decoupled pipeline lets instruction fetch and instruction execution operate independently and run much more efficiently. For example, the front-end can perform instruction fetch for either thread regardless of which thread the back-end is executing.
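
To make the decoupling idea concrete, here is a minimal toy sketch in Python. It is not taken from Intel's manual and does not model Poulson's actual design: the function name `simulate`, the per-thread buffers, the round-robin fetch policy and the `buffer_depth` parameter are all assumptions chosen purely for illustration. It only shows how a buffer between fetch and execute lets the front-end work on one thread while the back-end works on the other.

```python
from collections import deque


def simulate(cycles, backend_schedule, buffer_depth=4):
    """Toy model of a two-thread pipeline with decoupled front-end and back-end.

    backend_schedule lists which thread (0 or 1) the back-end serves on each
    cycle; it repeats if shorter than `cycles`. All policies are hypothetical.
    """
    buffers = {0: deque(), 1: deque()}   # per-thread instruction buffers
    next_seq = {0: 0, 1: 0}              # next instruction number per thread
    trace = []
    for cycle in range(cycles):
        # Front-end: alternate between threads, independent of the back-end.
        fe_thread = cycle % 2
        if len(buffers[fe_thread]) < buffer_depth:
            buffers[fe_thread].append(next_seq[fe_thread])
            next_seq[fe_thread] += 1
        else:
            fe_thread = None             # buffer full, no fetch this cycle

        # Back-end: execute from whichever thread the schedule selects.
        be_thread = backend_schedule[cycle % len(backend_schedule)]
        executed = buffers[be_thread].popleft() if buffers[be_thread] else None

        trace.append((cycle, fe_thread, be_thread, executed))
    return trace


if __name__ == "__main__":
    # Columns: cycle, thread fetched, thread executed, instruction executed
    for row in simulate(8, [0, 0, 1, 1]):
        print(row)
```

In the printed trace, any row where the second and third columns differ is a cycle in which fetch and execute served different threads, which is the behavior the article describes.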

Intel was originally expected to launch Poulson sometime in Q2 of 2012.

Top Comments
  • 13 Hide
    fancarolina , July 12, 2012 6:12 AM
    "First spotted by the folks over at CPU World, the document provides detailed information about the 32 nm, 3.1 billion processor CPUs."

    Processor ... I'm pretty sure that should be transistor.
Other Comments
  • 5 Hide
    palladin9479 , July 12, 2012 6:39 AM
    Itanium is low priority for development funds, it tends to be one to two years behind Intel's desktop offerings. And honestly I don't blame them, HP is the only one developing systems for Itaniums. Itanium doesn't compete with AMD it competes with IBM, Oracle (Sun) and their like in the HPC and enterprise world. These are specialized systems running very specific tasks with high cost software, Itanium's VLIW ISA just gets tore up. VLIW is bad design for a general purpose main CPU, good design for a DSP and GPU though.
  • 8 Hide
    The Greater Good , July 12, 2012 7:08 AM
    His name... was Robert Poulson.
  • 0 Hide
    blazorthon , July 12, 2012 7:57 AM
    Quote (palladin9479):
    Itanium is low priority for development funds, it tends to be one to two years behind Intel's desktop offerings. And honestly I don't blame them, HP is the only one developing systems for Itaniums. Itanium doesn't compete with AMD it competes with IBM, Oracle (Sun) and their like in the HPC and enterprise world. These are specialized systems running very specific tasks with high cost software, Itanium's VLIW ISA just gets tore up. VLIW is bad design for a general purpose main CPU, good design for a DSP and GPU though.


    Itanium uses EPIC, not VLIW. It's similar, but not the same, granted it still doesn't do too well. VLIW and its direct derivatives seem to do poorly in any workload that isn't embarrassingly parallel and without dependencies, not just in general-purpose GPU workloads. Well, that is if AMD's VLIW GPUs are to be considered.
  • 6 Hide
    mayne92 , July 12, 2012 7:58 AM
    Quote (fancarolina):
    "First spotted by the folks over at CPU World, the document provides detailed information about the 32 nm, 3.1 billion processor CPUs."
    Processor ... I'm pretty sure that should be transistor.

    Yeah, I found that line interesting at first.

    Who's the editor for Toms again?
  • -3 Hide
    palladin9479 , July 12, 2012 8:12 AM
    Quote:
    Itanium uses EPIC, not VLIW. It's similar, but not the same, granted it still doesn't do too well. VLIW and its direct derivatives seem to be bad for any workload that isn't embarrassingly parallel and without dependencies, not just poorly in general-purpose GPU workloads. Well, that is if AMD's VLIW GPUs are to be considered.


    EPIC is just HP / Intel's implementation of the VLIW design concept.

    http://en.wikipedia.org/wiki/Very_long_instruction_word

    Quote:
    Very long instruction word or VLIW refers to a processor architecture designed to take advantage of instruction level parallelism (ILP). Whereas conventional processors mostly only allow programs that specify instructions to be executed one after another, a VLIW processor allows programs that can explicitly specify instructions to be executed at the same time (i.e. in parallel). This type of processor architecture is intended to allow higher performance without the inherent complexity of some other approaches.

    Traditional approaches to improving performance in processor architectures include breaking up instructions into sub-steps so that instructions can be executed partially at the same time (pipelining), dispatching individual instructions to be executed completely independently in different parts of the processor (superscalar architectures), and even executing instructions in an order different from the program (out-of-order execution). These approaches all involve increased hardware complexity (higher cost, larger circuits, higher power consumption) because the processor must intrinsically make all of the decisions internally for these approaches to work. The VLIW approach, by contrast, depends on the programs themselves providing all the decisions regarding which instructions are to be executed simultaneously and how conflicts are to be resolved. As a practical matter this means that the compiler software (the software used to create the final programs) becomes much more complex, but the hardware is simpler than many other approaches to parallelism.


    Any ISA that passes multiple instructions as a single word to the CPU is a VLIW ISA. VLIW isn't a set of instructions, it's just a concept / technique to implement ILP. This is also why GPUs are VLIW processors: even if they use different terminology, they're still implementing multiple instructions as a single instruction word. Nearly every DSP in the world is also a VLIW CPU, for these same reasons.
  • 1 Hide
    victorious 3930k , July 12, 2012 8:17 AM
    Wait, didn't they give up on Itanium?
  • 2 Hide
    palladin9479 , July 12, 2012 8:52 AM
    Quote:
    Wait, didn't they give up on Itanium?


    No, although its market share is rather low. HP is the only OEM really building Itanium systems and they overwhelmingly run HP-UX. Mostly banks and other financial institutions have special software written for those HP systems.

    The HPC world is very, very slow to change. You tend to swap out your core hardware once every five to seven years, and then you only swap it with stuff that has been out for a year or more (reliability is a major factor). The hardware cost is minor compared to the software and support costs.
  • 7 Hide
    QEFX , July 12, 2012 10:09 AM
    Quote (mayne92):
    Yeah, I found that line interesting at first.
    Who's the editor for Toms again?


    Tom's has editors?
  • -4 Hide
    victorious 3930k , July 12, 2012 10:38 AM
    Quote (palladin9479):
    No, although its market share is rather low. HP is the only OEM really building Itanium systems and they overwhelmingly run HP-UX. Mostly banks and other financial institutions have special software written for those HP systems.
    The HPC world is very, very slow to change. You tend to swap out your core hardware once every five to seven years, and then you only swap it with stuff that has been out for a year or more (reliability is a major factor). The hardware cost is minor compared to the software and support costs.

    Why Itanium over x86 though?
  • -3 Hide
    silverblue , July 12, 2012 10:56 AM
    Quote (blazorthon):
    VLIW and its direct derivatives seem to be bad for any workload that isn't embarrassingly parallel and without dependencies


    Reminds me a bit too much of Bulldozer. ;) 
  • 0 Hide
    freggo , July 12, 2012 2:05 PM
    "3.1 billion processor CPUs..."

    Damn, I just bought an i5 based QUAD core box and I am still not even close to being able to play with the cool kids :-)


  • 1 Hide
    fb39ca4 , July 12, 2012 3:21 PM
    Damn, thats a lotta cores XD
  • 0 Hide
    syrious1 , July 12, 2012 3:22 PM
    His name is Robert Poulson.
  • 3 Hide
    ashinms , July 12, 2012 3:33 PM
    Quote (silverblue):
    Reminds me a bit too much of Bulldozer.


    Not even close.
  • 0 Hide
    silverblue , July 12, 2012 6:38 PM
    Oh dear... where's the sense of humour here?

    The point I was alluding to was Bulldozer's inability to perform satisfactorily outside of heavy multitasking. That much has been proven. So yeah, it was a tenuous link, but I'm not quite sure it was completely without merit.
  • 1 Hide
    blazorthon , July 12, 2012 6:47 PM
    Quote (silverblue):
    Oh dear... where's the sense of humour here?
    The point I was alluding to was Bulldozer's inability to perform satisfactorily outside of heavy multitasking. That much has been proven. So yeah, it was a tenuous link, but I'm not quite sure it was completely without merit.


    It can perform very well outside of heavy multi-tasking if you disable one core per module and then overclock it. The FX-8120 and the FX-6100 are excellent CPUs to do this with. A 20-30% boost in performance per Hz (25-30% being most common) and an easy path to 5GHz even on the stock cooler let them fight with the non-K edition i5s in performance per core, granted they are not as power efficient (still far better than not disabling one core per module, and for multiple reasons).
  • -1 Hide
    blazorthon , July 12, 2012 7:14 PM
    Quote (palladin9479):
    EPIC is just HP / Intel's implementation of the VLIW design concept.
    http://en.wikipedia.org/wiki/Very_ [...] ction_word
    Any ISA that passes multiple instructions as a single word to the CPU is a VLIW ISA. VLIW isn't a set of instructions, it's just a concept / technique to implement ILP. This is also why GPUs are VLIW processors: even if they use different terminology, they're still implementing multiple instructions as a single instruction word. Nearly every DSP in the world is also a VLIW CPU, for these same reasons.


    That's more of an argument of semantics than anything. There are other arguably minor details inherent to VLIW processors, so it's not entirely true either. You're taking the name a little too literally.

    Quote (Wikipedia):
    Outside embedded processing markets, Intel's Itanium IA-64 EPIC appears as the only example of a widely used VLIW CPU architecture. However, EPIC architecture is sometimes distinguished from a pure VLIW architecture, since EPIC advocates full instruction predication, rotating register files, and a very long instruction word that can encode non-parallel instruction groups. VLIWs also gained significant consumer penetration in the GPU market, though both Nvidia and AMD have since moved to RISC architectures in order to improve performance on non-graphics workloads.

    http://en.wikipedia.org/wiki/VLIW#Implementations

    Last paragraph of this section of the same Wiki article. Also, GPUs aren't all VLIW. Sure, many of AMD's GPUs are purely VLIW, but not all of them.

    http://en.wikipedia.org/wiki/Graphics_Core_Next
  • 0 Hide
    ashinms , July 12, 2012 7:16 PM
    Shit, I think I just sparked a flame war... *facepalm*


    Either way, I was really just relating to the architectures being different. As far as Bulldozer being a bad architecture, all I can say is that mine does everything I wanted it to and I have no complaints.