People thought I was crazy for supporting the AMD Opteron once as well; now I've got one and everyone else wants one.
'The Itanium Fanboy' - hehehe, not quite fitting of me currently, as this week I've got no Intel systems at all. Nor do I personally own an Itanium 2 system. However, I do support the platform, just like some cling to the Alpha (even today). Very funny nonetheless...
I should print 'The Itanium Fanboy' on a shirt, with the logo and everything, as a joke, then wear it to work on dress-down day, or even to a small LAN event, to see if I get out alive.
If something goes the way of the Sun UltraSPARC, that would imply it was a success, not a failure, btw. I have yet to see another single socket CPU which can execute 32 threads at once, if you get my gist.
True what is said about Z-RAM (above) too: it is a double-edged sword; it can work for Itanium as well as against it. It has more potential to help Itanium than any other platform, to be honest, now I think about it some more. (Not that Opterons with 6 MB L2 cache per core wouldn't rock, but Itaniums with 24+ MB of shared L2 cache would rock just as hard.)
The Opteron only has a 128 bit wide bus to memory (in the sense that it treats Dual Channel DDR as 128 bits wide). Using NUMA, it can have 2 or 4 such memory controllers that aggregate performance, or the chipset (nVidia Pro 2000 series) can 'interleave' the NUMA nodes (yes, that is the label Tyan used for it in the BIOS on the K8WE S2895) and present only one 'node' to the Operating System. There are pros and cons to each method.
Sun UltraSPARCs, Intel Xeons and Itaniums, and other platforms don't lack the potential to do this, or something very similar, either. Sun have been doing it for years.
eg:
http://www.sun.com/servers/midrange/sunfire_e6900/specifications.jsp
(Just a random Sun Fire server: "67.2-GB/sec. aggregate bandwidth" looks good to me when compared to the approx. 19.2 - 25.6 GB/sec of a 4 socket Opteron solution.) Then take into account that SPARCs also have short pipelines and don't need huge memory performance (unlike the Pentium 4 Prescott, which has a long pipeline and needs a high performance memory subsystem to keep it 'well fed').
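For anyone who wants to check the Opteron figure, here is a rough back-of-envelope sketch in Python. It assumes dual channel DDR400 (400 million transfers per second across the 128 bit bus), which is my assumption and not something stated above, and the function name is made up for illustration. These are theoretical peaks, not measured numbers.

```python
def peak_bandwidth_gb_s(bus_width_bits, mega_transfers_per_sec, controllers=1):
    """Theoretical peak memory bandwidth in GB/sec (1 GB = 1000 MB here).

    Hypothetical helper for illustration only: it multiplies bytes per
    transfer by transfers per second, then by the number of memory
    controllers (NUMA nodes) whose bandwidth is being aggregated.
    """
    bytes_per_transfer = bus_width_bits / 8
    mb_per_sec = bytes_per_transfer * mega_transfers_per_sec * controllers
    return mb_per_sec / 1000


# Assumption: dual channel DDR400, i.e. a 128 bit bus at 400 MT/s per socket.
one_socket = peak_bandwidth_gb_s(128, 400)
four_sockets = peak_bandwidth_gb_s(128, 400, controllers=4)

print(f"1 socket : {one_socket:.1f} GB/sec")
print(f"4 sockets: {four_sockets:.1f} GB/sec")
```

Under that assumption a single socket peaks at 6.4 GB/sec and four aggregated memory controllers reach the 25.6 GB/sec upper figure quoted above.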
8 way Opterons are just 4 ways using dual-cores. The 16 way ones are just 2 x 4 way boards in one system using dual-cores with an interconnect. Check out:
http://www.tyan.com and SuperMicro, etc.
The NUMA system used on the AMD Opteron was heavily inspired by Sun Microsystems and Alpha designs; at the basic level it almost mirrors them.
Recently, IA-64 had another $10 billion injected into it (after this thread was started, and after my comments above). If Alpha, Sun & AMD (and several others) could build platforms to aggregate memory performance, then Intel surely can as well. They've got FB-DIMMs, and possibly Registered DDR3, to lean on... You'd think they were preparing for something. [scratches beard].
We'll eventually hit a point (around 2010) where x86_64 requires a greater percentage of 'CPU real estate' to be dedicated to decoding instructions (and the associated logic). IA-64 doesn't have this problem (or others, as the compiler does all the work), so more 'CPU real estate' can be dedicated to cores. Those cores are smaller than x86_64 cores, so more of them will fit inside a given area, and they'll have more room left over for cache... with the news of Z-RAM potentially being used as cache this is 'A Good Thing (tm)'.
If the current IA-64 core size grew 50% or even doubled, they could raise performance in any weak areas btw.... I don't think you realise how small these IA-64 cores (excluding cache) really are. They may even run at lower clock speeds than Athlon64 CPUs while doing more work per clock cycle. (much like the Athlon64 does compared to the Pentium 4).
You'd be able to fit between 4 and 8 (say 6) IA-64 cores within the space of a typical 32nm CPU, with enough room for a large (say 64 MB) shared cache to spare, and whatever other features (eg: improved integer performance) they want to cram into it.
Remember that 20% of the code does 80% of the work, and if you can fit 64 MB of code / instructions on the chip, performance is going to scale (at least in single socket systems, until they get a nicer platform going for multi-socket IA-64 systems).
To me, IA-64 is just 'a work in progress' that Intel have managed to produce and sell as both the Itanium 1 and 2 (each with various models).
I am sure everyone here is aware how the IT market can change in '5 short years'. The IT 'market' started around 1985. Sure, computers are older, but the mass market, including business adoption of IT, I feel, started around 1985 with the introduction of 'high performance GUIs' [cough]. So the 'large scale' industry as a whole is only 20-21 years old. In another 5 years it would be 25% older.
PS: If it catches on, and there is space in my sig, I'll update it to say 'The Itanium (& Opteron) Fanboy'!
The main problem is software support, 'real' development studios for IA-64, and encouraging developers to port applications over to it then optimize them for the platform.
Poor example off the top of my head: There is .NET 2.0 for x64 platforms, but there is no .NET 2.0 for IA-64.... Windows XP 64-bit Edition for IA-64 (not x64 Edition) has also faded away.
When workstations (and the associated software) start disappearing for a particular platform, one would wonder about its future, yes.... and perhaps the return on investment (so far anyway) has not been that great.... but $10 billion is nothing to be scoffed at. Esp. when most suspect that around the time 32nm - 22/23nm CPUs are released x86/x64 will start to lose its edge. When it does, AMD (for example) won't just 'invent' x86_128 to counteract it; it wouldn't work. x64 (EM64T/AMD64) will be ample memory address wise for a very long time, but its performance may stop scaling around 2010.
x86 is holding CPUs back, and making leaps in performance difficult even now. In the last 2-3 years x86/x64 hasn't shown any major improvements in performance. For performance to scale, we need to encourage developers to write software with isolated threads for it. (Adding more cache to x86/x64 won't help it much, and having 4 cores per processor won't provide a 4-fold increase in performance, because of the software).....
.... you may as well introduce a new architecture if software needs to be recreated, market it to the world, phase out all support for x86/x64 and leave it to 'security risks' and poor backwards compatibility, (x86/x64 isn't scaling, but for different reasons to IA-64 which can have the 'training wheels' taken off then start scaling very well), then brainwash the world into IA-64.
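On the earlier point that 4 cores per processor won't give four times the performance: Amdahl's law is the usual way to put numbers on this. The sketch below is only an illustration; the parallel fractions are guesses I picked for the example, not measurements of any real software, and the function name is hypothetical.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup predicted by Amdahl's law.

    parallel_fraction: share of the run time that can use all cores (0..1).
    cores: number of cores available to the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative guesses only: how well threaded the software is.
for fraction in (0.5, 0.9, 1.0):
    speedup = amdahl_speedup(fraction, cores=4)
    print(f"{fraction:.0%} parallel on 4 cores -> {speedup:.2f}x")
```

Only software that is entirely parallel reaches 4x on 4 cores; if half of the run time is stuck in a single thread, the same chip manages just 1.6x.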
Everything can be bought.... for a price. Intel have a very strong influence as they have vast amounts of cash compared to their competition. They are also the only ones who can make dual IA-64 / IA-32 (x86_64 EM64T) systems to run both sets of software.... phase it out gradually, pushing various 'consumer benefits' the whole time.
Most people don't know much about computers, so if Intel start a large scale campaign they'll just follow along like sheep. Just look at the old Pentium campaign, some people actually think that 'Pentium' = 'Computer' and don't ask about anything else except 'the latest Pentium'.
Intel are removing the Pentium brand and have changed their 'Intel Inside' logo, which means they want to prepare to start a campaign for a new CPU (Intel Core Duo, and Conroe, etc in Q3 2006), then 'something even better' when they feel the timing is right. When every other non-x86 based platform is dead... totally dead... Intel can leverage 'alternative Intel' platforms without worrying about 'problematic' competition or negative feedback doing the one thing they wouldn't attempt because everyone else was doing it. (ie: Their 'Not Invented Here Syndrome' won't be an issue).