
Intel to Share Next-Gen Poulson CPU Details

Source: ISSCC

Itanium is still breathing, it seems. Poulson, which will follow the current Tukwila core, is expected to be released sometime next year, but Intel is apparently ready to share some architectural details now. The processor will integrate eight cores and a total of 3.1 billion transistors on a die that measures 544 mm2, according to the program information released by the ISSCC.

Intel says the on-die cache grows to a combined 50 MB and the processor-to-processor links provide a bandwidth of up to 128 GB/s, while the memory bandwidth is 45 GB/s. The on-die cache seems to be a bit smaller than the 54 MB that Intel discussed in the past. We should note that the 32 nm Poulson has a significantly smaller die than the 65 nm Tukwila, which squeezes four cores into 699 mm2.
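To put the die-size comparison in perspective, here is a minimal Python sketch that uses only the figures quoted in this article. The dictionary layout and the helper name `area_per_core` are illustrative choices, and dividing the whole die by the core count is a rough proxy, since the die also holds cache and I/O.

```python
# Figures as quoted in the article (ISSCC program information).
POULSON = {"cores": 8, "die_mm2": 544, "transistors": 3.1e9, "process_nm": 32}
TUKWILA = {"cores": 4, "die_mm2": 699, "process_nm": 65}


def area_per_core(chip):
    """Whole-die area divided by core count (rough proxy: includes cache and I/O)."""
    return chip["die_mm2"] / chip["cores"]


poulson_area = area_per_core(POULSON)   # mm2 of die per core on Poulson
tukwila_area = area_per_core(TUKWILA)   # mm2 of die per core on Tukwila

# Fraction by which the total die shrinks from Tukwila to Poulson.
die_shrink = 1 - POULSON["die_mm2"] / TUKWILA["die_mm2"]

# Average transistor density on Poulson, in transistors per mm2.
poulson_density = POULSON["transistors"] / POULSON["die_mm2"]

print(f"Poulson: {poulson_area:.2f} mm2 per core")
print(f"Tukwila: {tukwila_area:.2f} mm2 per core")
print(f"Total die shrink: {die_shrink:.1%}")
print(f"Poulson density: {poulson_density / 1e6:.2f} M transistors per mm2")
```

By this measure Poulson doubles the core count while the total die shrinks by a bit more than a fifth.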

ISSCC 2011 opens its doors on February 20 in San Francisco.       

  • 0 Hide
    Anonymous , January 26, 2011 4:02 PM
    Is this a server processor or the next step after Sandy Bridge?
  • -1 Hide
    BulkZerker , January 26, 2011 4:07 PM
    What they won't tell you is this processor will set you back $1500 O.O
  • 2 Hide
    dgingeri , January 26, 2011 4:39 PM
    50MB of cache.

    32k X 8 l1 cache = 256k
    256k X 8 l2 cache = 2MB
    so the remaining cache is 48MB of l3 cache?? That's freaking huge! These definitely wouldn't be for desktop use. These are server chips.
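The arithmetic in the comment above can be checked with a short Python sketch. The per-core L1 and L2 sizes are the commenter's assumptions, not figures from Intel, and a later reply disputes them; only the 50 MB total and the eight cores come from the article.

```python
CORES = 8                 # from the article
TOTAL_CACHE_MB = 50       # combined on-die cache, from the article

# Commenter's assumed per-core cache sizes (hypothetical, not confirmed by Intel).
L1_PER_CORE_KB = 32
L2_PER_CORE_KB = 256

l1_total_kb = CORES * L1_PER_CORE_KB      # 256 KB across all cores
l2_total_kb = CORES * L2_PER_CORE_KB      # 2048 KB = 2 MB across all cores

total_kb = TOTAL_CACHE_MB * 1024
remaining_kb = total_kb - l1_total_kb - l2_total_kb
remaining_mb = remaining_kb / 1024

print(f"L1 total: {l1_total_kb} KB, L2 total: {l2_total_kb} KB")
print(f"Left over for L3 under these assumptions: {remaining_mb} MB")
```

Under those assumed sizes the remainder comes to 47.75 MB, which the commenter rounds to 48 MB.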
  • 2 Hide
    leo2kp , January 26, 2011 4:41 PM
    BulkZerker: What they won't tell you is this processor will set you back $1500 O.O


    ...which is a bargain for server applications ;) 
  • 2 Hide
    one-shot , January 26, 2011 4:49 PM
    BulkZerker: What they won't tell you is this processor will set you back $1500 O.O


    I'm sure it will be several times $1500.
  • 0 Hide
    one-shot , January 26, 2011 4:49 PM
    timothyburgher@gmailcom: Is this a server processor or the next step after Sandy Bridge?


    No, it is not. Sandy Bridge E series are. They will be very fast. This CPU is different.
  • -1 Hide
    geekapproved , January 26, 2011 5:00 PM
    Ivy Bridge is after Sandy Bridge. Stupid a$$ names if you ask me.
  • 2 Hide
    Anonymous , January 26, 2011 5:06 PM
    Itanium processors cost much more than $1500. And Itanium L1/L2 caches are much larger than 256KB/2MB. What, no one here knows or remembers what Itanium is?
  • 0 Hide
    wcooper007 , January 26, 2011 5:40 PM
    LoL, Itanium is for nothing but special datacenters. I wouldn't even put this in the same class as servers; this thing runs supercomputer nodes and things of that nature. It requires a special operating system that runs the Itanium instruction set. I think they made a Server 2003 Itanium edition at one point, due to it not being an x86 processor. So yeah, the old ones would set you back 3500 dollars, and I don't know much about these new ones. When I left the eval lab at Intel back in 2003 we had tons of these things.
  • 2 Hide
    Travis Beane , January 26, 2011 5:51 PM
    gfdsgfds: Itanium processors cost much more than $1500. And Itanium L1/L2 caches are much larger than 256KB/2MB. What, no one here knows or remembers what Itanium is?

    Very big, very expensive and unique chips. Thought x86 was bloated? Itanium uses an EPIC instruction set (explicitly parallel instruction computing) to attempt to achieve a much higher instructions per clock ratio.

    I have absolutely no use for it, but damn, I want one.
  • -1 Hide
    saaiello , January 26, 2011 5:52 PM
    Geez, this is really old news. I saw this same article at least 3 months back on another site. Toms is getting good at being late to the news. In this case really late.
  • 1 Hide
    jdamon113 , January 26, 2011 5:52 PM
    For those of you commenting on the price:
    Itanium is not a CPU any one person would buy. This is a mainframe replacement,
    meant to run an entire data warehouse.
    So stop talking about the pricing as if Intel is trying to take advantage.
    You think the price is high? Spec out a high-end Sun mainframe or a Cray, because this is what Itanium is meant for.
    This is not for Crysis.
  • 0 Hide
    f-14 , January 26, 2011 6:01 PM
    holy christ that's huge in many respects. i'd like to see the blade this thing goes into!
  • 1 Hide
    f-14 , January 26, 2011 6:07 PM
    jdamon113: For those of you commenting on the price: Itanium is not a CPU any one person would buy. This is a mainframe replacement, meant to run an entire data warehouse. So stop talking about the pricing as if Intel is trying to take advantage. You think the price is high? Spec out a high-end Sun mainframe or a Cray, because this is what Itanium is meant for. This is not for Crysis.

    Crysis on the cloud perhaps!
    That's what this thing is designed for, and I'm pretty sure Zuckerberg would buy one himself. Future evil dictators have to have a way to manage their covert ops and manipulation of information somehow! All pun intended.
    bad example bad citing, just trying to keep my mind off the fact george soros makes dick cheney look like a two bit player at a poker game.
  • 0 Hide
    snoogins , January 26, 2011 6:17 PM
    Quote:
    The on-die cache seems to a bit smaller than the 54 MB that Intel discussed in the past.


    Should insert a 'be' after the 'to' and before the 'a'
  • 0 Hide
    thomaseron , January 26, 2011 7:20 PM
    Will it max out Crysis2? :-P
    Just kidding. :)  Huge die though...
  • 0 Hide
    formin , January 26, 2011 7:54 PM
    SSDs are breaking the Gb/s mark now.
    If you go up to a 256MB cache, you can probably get away with a 20Gb/s link to a super fast SSD, and the RAM disappears.
  • -1 Hide
    ta152h , January 26, 2011 10:19 PM
    This article completely misses the most important thing about this release. The current Itanium is 6-issue (two bundles of three instructions each); Poulson will be a 12-issue processor, which could mean even greater IPC than is currently possible. I'm curious whether they're going to dynamically allocate bundles based on workload: if you have one thread, it uses all the resources; if you have several, each thread gets a bundle; or somewhere in between.

    It will be interesting to see if Intel can finally get performance superiority over the horrible x86 instruction set processors. They've always been behind in manufacturing technology, and then they did weird things like make the L1 cache accessible in only one clock cycle (severely limiting clock speed), but with manufacturing parity, and finally two clock cycle access to L1 cache, this processor had better be able to beat processors crippled by an obsolete, difficult and inefficient instruction set.

    Otherwise, they should have just gone with RISC, instead of VLIW.
  • 2 Hide
    palladin9479 , January 27, 2011 12:59 AM
    Itanium was very bad as a server CPU. It had its own special *Intel* branded instruction set that you had to compile your OS / drivers / applications for. It used microcode emulation to run x86 instructions so that it appeared to be x86 compatible, but it was horribly inefficient at running them. Because of this, Itanium sucked for anything that needed x86 instructions (90+% of the commodity server market), which left the "special use" systems that run an OS and apps written specifically for that CPU architecture.

    Itanium's competition isn't AMD / Pentium, it's things like Sun SPARC and IBM Power. And this is an arena where Intel gets spanked pretty badly. On one hand you have the Sun SPARC, the Sun T3 CPU, which is,
    Quote:
    "A 16-core SPARC SoC processor enables up to 512 threads in a 4-way glueless system to maximize throughput. The 6MB L2 cache of 461GB/s and the 308-pin SerDes I/O of 2.4Tb/s support the required bandwidth. Six clock and four voltage domains, as well as power management and circuit techniques, optimize performance, power, variability and yield trade-offs across the 377mm2 die"


    In reality this means a single T3 CPU can process 32 integer (2 integer units per core), 16 floating point (1 FPU per core) and 16 memory (1 MMU per core) operations per cycle. Each core has eight sets of register stacks, allowing it to process eight unique threads. Each CPU has four DDR3 memory channels to its own dedicated memory, two 10Gb Ethernet ports and its own set of I/O circuitry. Each core has its own built-in crypto circuitry for accelerating encryption and hashing. A single server would have four of these CPUs inside it, along with 128GB ~ 1TB of memory. The only downside is that each CPU is clocked at 1.67 GHz, so single-thread performance is rather low compared to its IBM Power counterpart. These SPARC CPUs are designed to be used in databases and massively parallel servers: when you need to service thousands of users while processing hundreds of transactions per second, you use a SPARC.
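The per-cycle and thread counts in the paragraph above follow from simple multiplication, sketched here in Python. All unit counts are taken from the comment itself rather than from a data sheet, and the constant and function names are illustrative.

```python
# Per-core figures as stated in the comment (not independently verified).
CORES_PER_CHIP = 16
INT_UNITS_PER_CORE = 2
FPUS_PER_CORE = 1
MEM_UNITS_PER_CORE = 1
THREADS_PER_CORE = 8
CHIPS_PER_SYSTEM = 4      # the "4-way glueless system" in the quote


def per_chip(units_per_core, cores=CORES_PER_CHIP):
    """Peak per-cycle count for one chip: units per core times cores."""
    return units_per_core * cores


int_ops_per_cycle = per_chip(INT_UNITS_PER_CORE)
fp_ops_per_cycle = per_chip(FPUS_PER_CORE)
mem_ops_per_cycle = per_chip(MEM_UNITS_PER_CORE)

threads_per_chip = per_chip(THREADS_PER_CORE)
threads_per_system = threads_per_chip * CHIPS_PER_SYSTEM

print(f"Per chip, per cycle: {int_ops_per_cycle} integer, "
      f"{fp_ops_per_cycle} floating point, {mem_ops_per_cycle} memory")
print(f"Threads: {threads_per_chip} per chip, {threads_per_system} per 4-way system")
```

The 512-thread result matches the figure in the quoted T3 description, which is a useful consistency check on the per-core numbers.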

    http://en.wikipedia.org/wiki/SPARC_T3

    Their main competitor is IBM and its Power CPU, namely the POWER7.

    Quote:
    POWER7 has these specifications:[5][6]

    * 45 nm SOI process, 567 mm2
    * 1.2 billion transistors
    * 3.0 – 4.25 GHz clock speed
    * max 4 chips per quad-chip module
      o 4, 6 or 8 cores per chip
        + 4 SMT threads per core (available in AIX 6.1 TL05 (released in April 2010) and above)
        + 12 execution units per core:
          # 2 fixed-point units
          # 2 load/store units
          # 4 double-precision floating-point units
          # 1 vector unit supporting VSX
          # 1 decimal floating-point unit
          # 1 branch unit
          # 1 condition register unit
      o 32+32 kB L1 instruction and data cache (per core)[7]
      o 256 kB L2 cache (per core)
      o 4 MB L3 cache per core, with a maximum of 32 MB supported. The cache is implemented in eDRAM, which does not require as many transistors per cell as standard SRAM[4], so it allows for a larger cache while using the same area as SRAM.


    What this means in reality is that you get four threads per core, with a maximum of four simultaneous instructions executed per core. Note that IBM Power / AIX instructions differ from SPARC instructions, so the two are very hard to compare. Power focuses more on getting a single task done as fast as possible, whereas SPARC focuses on getting as many tasks done at once as possible. Powers are clocked at 3 to 3.8 GHz (they can shut down cores to boost speed to 4.25 GHz) and are many times bigger than a SPARC CPU, which often leads to unfair CPU vs CPU comparisons. Better comparisons have been done as system vs system competitions, and each wins at different things (T3 at web serving / database work, Power at financial calculations / simulations).

    These are the beasts that Itanium must compete against, not home gaming rigs and the low to medium server markets. Everyone rejected Itanium originally because of the horrible x86 performance; the commodity market doesn't want to recompile / redevelop its entire software base for a single CPU architecture.
  • 2 Hide
    palladin9479 , January 27, 2011 1:26 AM
    OK, some pricing info. I'm very familiar with purchasing Sun systems, so I'll list the default quote off their site for a single system.

    https://shop.sun.com/store/product/578414b2-d884-11de-9869-080020a9ed93
    Config #3,
    $177,057.00 Each,
    4x SUN Sparc T3 CPU,
    512 GB (64 x 8 GB DIMMs) Memory,
    Internal Storage: 600 GB (2 x 300 GB 10000 rpm 2.5-Inch SAS Disks),
    Max Internal Storage: 2.4 TB (8 x 300 GB 10000 rpm 2.5-Inch SAS Disks),
    Ethernet: 4 x 1 Gb (10/100/1000 Mb/s) Integrated Ethernet Ports. Option Slot for 8 x 10 GbE XAUI Ports, 16 PCI Express module slots
    Power: 4 PSU's @ 12.6 A @ 200 V AC
    Space: 5RU, 8 systems per industry standard rack.

    You need to purchase the 10GbE adapter separately, the circuitry already exists inside the CPU but you need the physical connector to be either copper or fiber, your choice. And while the system itself is 177 grand a pop, the specialized software this is most likely running will be twice that price.

    I can't get a quote on an IBM Power 755 without contacting a sales agent, I figure it will be similar to the above SPARC range. Bonus points to the IBM for being very Linux friendly.