Do you have numbers for the small-cache 1 GHz Itanium 2? I know the 1.5 GHz 6 MB model scores 2119. If I extrapolate to 1 GHz I get 1412 for a theoretical 6 MB 1 GHz model; the 1 MB model will score considerably less.
Spec.org has listed a 900 MHz Itanium 2 with 1.5 MB of L3 cache. Assuming linear scalability, that puts a 1 GHz Itanium 2 with 1.5 MB of L3 cache at around 1278 for SpecFP. Compare this with the highest submission for the 1 GHz 3 MB L3 cache model, 1451, and the extra cache makes roughly a 12% difference (measured against the 3 MB score).
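For anyone who wants to check the arithmetic, here is a minimal sketch of the extrapolation used in this exchange. It assumes scores scale linearly with clock speed, which is this thread's simplification and not something SPEC guarantees; the function names are made up for illustration, and the scores are the ones quoted above.

```python
def scale_by_clock(score, from_ghz, to_ghz):
    """Rescale a score from one clock speed to another.

    ASSUMPTION: performance is proportional to clock speed. Real chips
    rarely scale this cleanly, since memory latency does not shrink
    with the core clock.
    """
    return score * to_ghz / from_ghz


def relative_gap(higher, lower):
    """Gap between two scores, as a fraction of the higher score."""
    return (higher - lower) / higher


# 1.5 GHz 6 MB SpecFP score (2119) scaled down to a hypothetical 1 GHz part.
fp_1ghz_6mb = scale_by_clock(2119, from_ghz=1.5, to_ghz=1.0)

# Extrapolated 1.5 MB score (1278) vs. the submitted 3 MB score (1451), both at 1 GHz.
cache_gap = relative_gap(higher=1451, lower=1278)

print(f"1 GHz 6 MB estimate: {fp_1ghz_6mb:.0f}")
print(f"3 MB vs 1.5 MB gap:  {cache_gap:.1%}")
```

Measuring the gap against the lower score instead gives a larger percentage, so it matters which baseline is being quoted.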
Compare even this extrapolated score with the 3.06 GHz Xeon at 1243, and consider that the Itanium 2 consumes 50-60W vs. the Xeon's 80-90W.
Now, let's look at the Athlon FX: with just 1 MB it scores 1423.
With "just" 1 MB of full-speed, low-latency on-die L2 cache. Consider, too, how long the chip has been out (is it even selling?), the advantage of its mass market, and the profit margins it gains from Opteron. Add to that the fact that it's on an SOI manufacturing process and consumes ~70W compared to the 50-60W of the LV Itanium 2.
All of a sudden, the advantages of x86's "mass market" are becoming less and less of an issue. The only issue, I'd say, would be backwards compatibility.
And this is just SPEC FP, a bench that makes IA64 shine, with its extremely powerful floating point performance. Most business apps, however, require INT performance. Let's look at SPEC INT:
Itanium 2 1.5 GHz 6 MB: 1322
Itanium 2 1 GHz (extrapolated): 881
Athlon FX 51 1 MB: 1447
Most performance-dependent business applications (e.g. database services, web servers, etc.) rely more on SpecInt_rate and SpecFP_rate, as they usually depend more on MP scalability than on single-processor performance. While Opteron has great scalability due to its NUMA-like HyperTransport network, the implementation is completely up to the manufacturer who makes the machine, not the processor itself. A prime example is the scaling performance of SGI's Altix compared to, say, HP's SuperDome.
Given the lack of desktop and workstation benchmarks (with the exception of POVRay, which Itanium 2 excels in), I could only go with SpecFP scores. Feel free to look at the TPC-C or SpecWeb_SSL results for a more realistic business-class performance measurement.
However, we were (at least that's the impression I got) talking about the consumer impact of Itanium 2, since we are speaking of replacing x86, which is primarily the consumer and workstation market. Both of those rely much more on FP performance nowadays than on integer performance.
The Itanium 2 1.5 GHz is roughly twice as big and an order of magnitude more expensive, and yet it doesn't beat a 1 MB bread-and-butter desktop chip. I stand by my point: x86 sweeps the floor with "high end" 64-bit CPUs in performance/$$$ and performance/mm². Deerfield isn't changing squat in this regard; the only area where it *may* have a lead is FP performance/Watt.
You seem to regard integer performance as "performance" and everything else as "specialized". I think it's the converse nowadays: FP performance is the "standard" for most of x86's main markets.
I find your argument of using only the 1.5 GHz 6MB L3 cache Itanium 2 to be a strawman. You've picked probably the *worst* case for Itanium 2 in both performance/die and performance/watt, and the *best* case in both for x86.
I could easily go and pick the worst case x86 (the 3.2 P4EE) and compare it to the best case Itanium 2 (1.4 GHz LV 1.5 MB L3 cache Itanium 2) and show differently.
Comparing best cases in both performance/die and performance/watt, x86 chips certainly aren't "sweeping the floor" with "high-end" RISC (well, VLIW) chips (although the 1.4 GHz LV Itanium 2 isn't exactly "high end", it's a workstation chip priced at around $1100). At least, not for the "bread and butter" desktop market (in which FP performance is crucial).
The Athlon64 FX has a die size of 192mm^2.
The 1.4 GHz Itanium 2 1.5MB L3 cache has a die size of 180 mm^2.
The Athlon64FX at 2.0 GHz has a TDP of <A HREF="http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30430.pdf" target="_new">89W</A>
The Itanium 2 1.4 GHz 1.5MB has a TDP of <A HREF="http://www.intel.com/design/itanium2/datashts/25379501.pdf" target="_new">91W</A>
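To make the performance/die and performance/watt comparison concrete, here is a rough sketch using only figures quoted in this thread. Two big assumptions: the thread gives no SpecFP score for the 1.4 GHz part, so the sketch scales the extrapolated 1 GHz 1.5 MB figure (1278) linearly to 1.4 GHz, which stacks one extrapolation on top of another; and it pairs the Athlon FX's 1423 with the die size and TDP listed above. The `Chip` class and its method names are made up for illustration. Treat the Itanium 2 rows as illustrative only, not as published results.

```python
from dataclasses import dataclass


@dataclass
class Chip:
    name: str
    specfp: float   # SpecFP score as quoted (or extrapolated) in the thread
    die_mm2: float  # die size in mm^2
    tdp_w: float    # thermal design power in watts

    def per_mm2(self):
        """SpecFP points per mm^2 of die."""
        return self.specfp / self.die_mm2

    def per_watt(self):
        """SpecFP points per watt of TDP."""
        return self.specfp / self.tdp_w


athlon_fx = Chip("Athlon 64 FX", specfp=1423, die_mm2=192, tdp_w=89)

# ASSUMPTION: no measured score here. 1278 is itself extrapolated from a
# 900 MHz result, then scaled linearly from 1.0 GHz to 1.4 GHz.
itanium2 = Chip("Itanium 2 1.4 GHz 1.5 MB (extrapolated)",
                specfp=1278 * 1.4 / 1.0, die_mm2=180, tdp_w=91)

for chip in (athlon_fx, itanium2):
    print(f"{chip.name}: {chip.per_mm2():.1f} pts/mm^2, "
          f"{chip.per_watt():.1f} pts/W")
```

Swapping in a real submitted score for the 1.4 GHz part, if one exists, would make the comparison far more trustworthy than the extrapolated one.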
Considering that the AthlonFX has 1MB of full-speed, high-bandwidth, low-latency on-die L2 cache, it's hardly as "lowly" compared to IA-64 offerings as you seem to imply through connotations.
Are IA-64 based products sweeping the floor with their x86 counterparts? No. Are they being "humiliated" like you've painted? Hardly. With the recent move to a more mature manufacturing process, IA-64 products are looking a lot better and are very competitive with their x86 counterparts in the desktop/workstation market, despite not having economies of scale or the latest manufacturing techniques (like SOI or Strained Silicon) to help them along.
The only real barrier, like I said, is backwards compatibility.
"We are Microsoft, resistance is futile." - Bill Gates, 2015.