Intel's Next-Generation Server Promises
1. Can Bensley Make Up For Xeon's Flaws?

The reason why Intel wouldn't agree to a dual-core comparison test is simply a matter of processor performance. While even the upcoming dual core Xeons won't make much of a difference, the future holds much promise, as the Bensley platform will represent a significant departure from the Xeon architecture.

Although Intel's architecture still clearly dominates the x86 server market, AMD's Opteron has represented more than just an annoyance for Intel since its introduction in April 2003. In a nutshell, Opteron offers better computing and per-Watt performance at a roughly equivalent per-device price and it scales much better when moving from one to two or even four CPUs. AMD's performance lead is equally large when compared to Intel's new dual-core Xeon devices.

Intel's new Xeon dual core processor, which runs at 2.8 GHz, is based on Intel's 90-nm process and, you guessed it, offers NetBurst, including HyperThreading support. A dual-core processor with beefed up 2x 2 MB L2 cache, its performance certainly is solid enough for the target market. However, a dual-core duel, as AMD has been trying to provoke, would be a potential nightmare for the product management staff at Intel's enterprise group.

The Lindenhurst chipset family (E7520/E7320) for Xeon processors has two major handicaps. First, both processors in a dual-processor system have to share one front side bus and one memory array. Second, Intel's registered DDR2 memory for servers does not really offer any advantages over DDR1 memory. And finally, why should you invest in a server system that will be outdated in a few months?

A server purchase decision is usually driven by reliability, availability and serviceability issues rather than maximum performance. From this point of view, AMD still has a long way to go, because the top gun OEMs such as Dell, Fujitsu-Siemens, HP/Compaq, IBM and Toshiba are to a large degree locked in with Intel.

But the ranks are growing, as AMD has said on several occasions, and Intel must act now to protect its lucrative server CPU business. While salvation by means of the Woodcrest server processor is still almost a year away, the next-generation platform, Bensley, will knock on our doors as early as the beginning of next year. The good news is that the Bensley platform will not only introduce several architectural improvements such as a new memory controller, independent buses for each CPU and nice upgrade paths; it will also be able to run both the upcoming 65-nm dual-core Xeon processor, called Dempsey, and the next-generation Woodcrest.


Intel's new dual core Xeon indeed offers superior performance - as long as you compare it to its single-core counterparts. However, AMD offers even faster chips both for single and dual-core applications.
2. Intel's Server Roadmap Explained

It's interesting to see what has become of Intel. While code names used to be top secret in the past, Intel increasingly communicates these to the public at rather early stages today. At the same time, the number of code names for processors and chipsets as well as the whole platforms has increased as well. The result is a server roadmap overview page (see above) that hosts over 30 code names.

We are going to take a look at the upcoming Xeon generation, which is framed with the red dashed line. Bensley is the code name for the next-generation Xeon platform as well as the processor, and it is geared towards the entry-level enterprise segment. The multi-processing platform is called Truland, but due to a lack of real news, we are not going to discuss it here.

The chipset for the Xeon Bensley platform will be either Green Creek (workstation) or Blackford (server), and you will have the choice of three different 65-nm processors: Dempsey, a shrunken version of today's dual core Xeon; Sossaman, a low-power version for small form factor servers; or the upcoming Woodcrest processor. The latter will make use of Intel's new micro architecture and, as our preliminary intelligence shows, will likely assume the lead in the performance-per-watt contest against AMD.

Dempsey is the next-generation Xeon dual core, based on the technology that is going to be introduced with Cedar Mill and Presler. This chip is going to be the last NetBurst based product. Woodcrest, however, won't be available before H2/2006. This device will also be a 65-nm product, but it will be a next-generation micro architecture processor with either 4 or 8 MB L2 cache.

The neat thing about the Bensley platform is that motherboards will be able to run either Dempsey or Woodcrest processors (Socket 771), giving users a promising upgrade option. I haven't mentioned Sossaman yet because this device is different again. Sossaman is a redesign of the Yonah notebook dual-core and specifically targets low-power server systems. However, Sossaman platforms will be quite different from systems that power Dempsey and Woodcrest because they will use Socket 479.

One thing might be important to add: the dual-core Xeon you can buy today works exclusively with the Lindenhurst platforms available today. All the upcoming dual cores are going to use the Bensley platform.

3. Multi Cores For Servers

As we already outlined in our Pentium D 900 Series preview article, the last processor generation that uses NetBurst is going 'double core' rather than dual core.

While AMD's dual core Athlon 64 X2 and Opteron processors as well as the new dual-core Xeon accommodate two processor cores on one physical die, the 65-nm NetBurst generation (the Pentium 4 6x1 family, Pentium D 900 series and Xeon 5000 series) places two single cores into one processor package.

Sossaman

Sossaman is one of the most interesting processor releases for the beginning of next year. It is based on the same silicon that Intel will use to power dual-core Yonah based notebook computers, but it is going to be validated for server use. The reason for deploying this chip in the professional space is its low thermal design power of only 31 W maximum at a 2 GHz clock speed.

According to some benchmark projections we have seen at an Intel server workshop, a dual 2 GHz Sossaman system will deliver roughly the same performance as a twin-CPU dual-core Xeon machine available today and will also offer lower thermal values.

4. Platform Innovations For 2006, At Last

There are several key issues that Intel wants to address with its 2006 Bensley platform, including reliability, availability and serviceability - although performance is really the first thing to worry about and, yes, it is obviously going to increase, too.

The second item is called efficiency and utilization. Some efficiency improvements were overdue, and utilization mainly refers to the virtualization technology that Intel is going to introduce with the release of its 65-nm processors. Intel calls it VT, while the AMD counterpart is called Pacifica for the time being. Both allow for installing a so-called hypervisor, a core software layer that extends the system by adding multiple system partitions.

In the desktop, you could install Windows XP Professional and Windows XP Media Center Edition, and have both run at the same time. The first would allow you to do office work while the second acts as media center in your living room - with only one (dual/multi core) computer.

In server environments, virtualized machines could be used to simplify clustering, to assign a 'new server' to a software development team within minutes, to move software-virtualized solutions (VMWare) one level down to hardware or simply to reduce the number of actual machines in use at a given time.

There will be many potential applications for virtualization technology that most of us haven't even considered. Just think about a TV provider that would like to have people access its network while getting rid of set-top boxes: Just install your pay TV OS...

RAS (reliability, availability and serviceability) is the third item on the list. In 2006, Intel wants to add support for software RAID 6 to its platforms. In contrast to RAID 5, RAID 6 uses double redundancy to keep a hard drive array operational even if two drives fail.
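The trade-off between the two RAID levels can be made concrete with a little arithmetic. The following Python sketch compares usable capacity and tolerated drive failures; the helper name `raid_capacity` and the example array of eight 500 GB drives are hypothetical and chosen only for illustration, and the sketch says nothing about how Intel's software RAID 6 is implemented.

```python
# Redundancy overhead per RAID level, expressed as whole drives' worth of parity.
PARITY_DRIVES = {5: 1, 6: 2}


def raid_capacity(level, drives, drive_size_gb):
    """Return (usable capacity in GB, number of drive failures tolerated).

    Assumes equally sized drives (hypothetical helper, for illustration only).
    """
    parity = PARITY_DRIVES[level]
    # RAID 5 needs at least 3 drives, RAID 6 at least 4.
    if drives < parity + 2:
        raise ValueError(f"RAID {level} needs at least {parity + 2} drives")
    usable_gb = (drives - parity) * drive_size_gb
    return usable_gb, parity


# Example: eight 500 GB drives (assumed values).
for level in (5, 6):
    usable, failures = raid_capacity(level, drives=8, drive_size_gb=500)
    print(f"RAID {level}: {usable} GB usable, survives {failures} failed drive(s)")
```

In other words, RAID 6 pays for the second layer of redundancy with one additional drive's worth of capacity.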

Manageability, finally, will bring iAMT (Active Management Technology) features into the server space.

5. Today Vs. Tomorrow


The Lindenhurst platform today enables one or two processors to run over a shared 200 MHz quad-pumped system bus (FSB800). A peak bandwidth of 6.4 GB/s can be reached; however, this bandwidth is shared between the two processors.

Another bottleneck can be the DDR2-400 memory, as it offers the same gross bandwidth of 6.4 GB/s which, again, needs to be shared by both processors. In contrast, the AMD platform implements the memory controllers directly into the CPU, allowing for each processor to have its own main memory at full 6.4 GB/s DDR400.

The heavily loaded shared bus with its three end points (two processors and the chipset) will be replaced by Bensley's DIB (Dual Independent Bus) design, which finally gives each processor its own front side bus. Speeds will extend to 266 MHz (FSB1066), boosting the bandwidth to 8.5 GB/s per processor.

At the same time, the memory controller is beefed up to support not only DDR2-400 dual-channel operation but also DDR2-533 in quad-channel mode. As a result, the total gross bandwidth increases from 6.4 GB/s to approximately 17 GB/s.
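The bandwidth figures quoted in this section follow from simple arithmetic: transfers per second times bus width times number of channels. The short Python sketch below recomputes them. It assumes a 64-bit (8-byte) data path for the front side bus and for each memory channel, and decimal gigabytes (1 GB = 10^9 bytes); the function name `peak_bandwidth_gbs` is purely illustrative.

```python
def peak_bandwidth_gbs(mega_transfers_per_sec, bus_width_bytes=8, channels=1):
    """Theoretical peak bandwidth in GB/s (assumes 1 GB = 10**9 bytes)."""
    return mega_transfers_per_sec * 1e6 * bus_width_bytes * channels / 1e9


# Front side bus: base clock in MHz times 4 transfers per clock (quad-pumped).
fsb800 = peak_bandwidth_gbs(200 * 4)        # Lindenhurst, shared by both CPUs
fsb1066 = peak_bandwidth_gbs(266.67 * 4)    # Bensley, one bus per CPU

# Memory: the DDR2 rating is already the transfer rate in MT/s per channel.
ddr2_400_dual = peak_bandwidth_gbs(400, channels=2)
ddr2_533_quad = peak_bandwidth_gbs(533.33, channels=4)

for name, value in [
    ("FSB800 (shared)", fsb800),
    ("FSB1066 (per CPU)", fsb1066),
    ("DDR2-400 dual channel", ddr2_400_dual),
    ("DDR2-533 quad channel", ddr2_533_quad),
]:
    print(f"{name}: {value:.1f} GB/s")
```

These are gross peak values only; sustained throughput depends on how efficiently the bus and the memory controller are actually used.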

Although these changes look rather promising, it remains to be seen how well they translate into real-life performance gains. It is especially difficult to estimate the efficiency of the quad-channel memory controller at this point. Having seen some early systems, we would tend to say that the dual independent bus is the most important factor for speeding up the Xeon architecture.

Fully Buffered DIMMs Are The Achilles Heel

Indeed, we found the memory question to be a real problem. Not only are the buffer chips of FB-DIMM modules running hotter than Intel expected, but we can also see customers wondering why they should yet again change their memory technology.

Lindenhurst introduced registered DDR2-400 memory with ECC, which in fact does not really provide more performance than DDR333. Yet lots of systems were shipped with DDR2 memory. However, systems that required vast amounts of memory would have to go to lower speed DDR memory anyway - leaving very little to justify DDR2.

Although FB-DIMM is definitely going to be the standard for quite some time, it remains questionable for us whether the market will be ready to go FB-DIMM when Intel is.

6. The Blackford Server Chipset (Bensley Platform)

While Blackford is the server device and will be available in a fully featured as well as a stripped-down value version, Green Creek is a modified design that caters to the workstation market. The new platform's performance enhancements thus won't represent Intel's only hope to help stave off the AMD threat, as it will address a wide variety of applications as well.

Blackford comes with three x8 PCI Express ports that can be utilized for networking or mass storage controllers or a PCI-X bridge. The Blackford VS (Value Server) will have two configurable x4 PCIe ports only and will not support quad-channel memory. There will be new ESB-2 I/O controller hubs (bridges) that come with all the necessary interfaces such as integrated SATA2 ports and USB 2.0.

The Green Creek Workstation Chipset (Glidewell Platform)

Apart from the 16 PCIe lanes that can be configured either as a single x16 port or as two x8 ports for dual graphics support, there is not much of a difference between Green Creek and Blackford.

7. Quad Channel Memory Controller With FB-DIMMs


Blackford and Green Creek are going to introduce Fully Buffered DIMMs for the first time. A so-called buffer chip on each DIMM is used to create serial point-to-point connections between the memory controller and each of the installed memory modules. While classic memory setups put an increasing load on the memory controller as the number of installed memory banks grows, FB-DIMMs can be lined up in a much deeper queue. As a byproduct, the motherboard routing for FB-DIMM boards is very simple.

One highly interesting feature is the mirroring option for the main memory. While the memory controller of Blackford and Green Creek supports quad-channel operation, the two independent lockstep pairs allow for running a mirrored two-channel setup that works like a memory RAID 1.

In addition to that, Intel runs 'posted CAS', a feature that allows the column address strobe (CAS) command to be issued several clocks earlier than it otherwise would be. A thermal monitoring feature is available to throttle memory activity when a chip temperature threshold is reached.

Blackford also supports DIMM sparing. This will allow administrators to install replacement DIMMs in machines running a mirrored setup. Should one of the modules ever fail, the replacement will be there to take over. However, these features, as well as the option to run as much as 64 GB of RAM, require a large number of DIMM sockets - which likely won't be deployed with standard products.


8. I/O Acceleration Technology


I/O Acceleration Technology is another item that Intel finally is going to add to its server platforms. While this is a rather general term, it stands for TCP offloading and 'optimized data movement' throughout the platform - whatever this means. According to the slide above this could enable the network controller to write data directly into the main memory.

Intel refers to a 'best price performance solution to layer 4/5 acceleration'. The layers are those of the OSI model, in which layer 4 takes care of data transport (by means of TCP, the Transmission Control Protocol) and layer 5 accommodates the session or protocol such as HTTP or FTP.

Network interface cards that include a hardware TCP offload engine are essential for servers that have to handle both a high CPU load and high network traffic. The TCP activity requires a lot of performance, especially with deployment of 10 Gigabit Ethernet hardware. Intel may have found a nice and rather affordable solution to share the workload between a more intelligent network chip and the multi-core CPUs that will be available.

9. Conclusion


From what Intel keeps preaching, the new micro architecture is on track to regain both the performance crown and the performance-per-watt crown from AMD at the end of 2006. Since there is not much CPU information available to assess the plausibility of these claims, the only facts we can base our conclusion on for now are the details regarding the upcoming server platform.

First of all, we have to note that the future Xeon processors based on NetBurst and the 65-nm process very likely aren't going to make much of a difference. In many industry benchmarks such as the SPEC JBB, WEB2005 or TPC-C scenarios, the Opterons are far enough away to stay ahead.

Here we recommend checking out the server product information pages of the larger OEMs - these always include the industry standard benchmark results for comparison.

However, the new platform approach clearly is targeted at making server platforms more reliable, more robust, more flexible and overall more attractive - even though the performance advantage in the DP server space could easily remain with AMD.

Bensley is going to introduce a number of features that the competitor is either delivering later (virtualization) or not delivering by itself but only in conjunction with third-party partners (Active Management, I/O acceleration). Finally, there is quad-channel memory, which may not necessarily be as fast as its name suggests.