AMD's New K6-2 Processor

Intel's Dominance

Last year in April AMD was able to offer the fastest Windows x86 CPU for a very short period until Intel released the Pentium II processor. Today, many people have already forgotten about this, simply being used to Intel as the all-time high end CPU supplier. Not only are Pentium II processors ruling the high end market, but drops in prices have made the Pentium II more and more affordable. Intel also decided to get into the low end market and released the Celeron CPU, which is nothing other than a Pentium II without L2 cache. At the same time, the Pentium II with 100 MHz front side bus was launched, assuring Intel’s lead in the upper class systems sector. It won’t be long before Intel comes out with the Pentium II Xeon CPU, which will also use the Pentium II core but with a second level cache that runs at the processor clock speed. Today you can look into almost any PC market segment and Intel is pretty much dominant. The alternative CPUs are becoming less and less popular.

Office Application Performance Becoming Less Significant In Favor Of 3D Performance

Times are not only changing in terms of Intel’s dominance. The way the performance of a processor has to be evaluated has changed a lot too. Whilst everyone has been used to measuring system performance with office applications for the last 5 years, nowadays there is a trend to veer away from these old trails. It’s not that people don’t use office applications anymore; in the business sector as well as in SOHO, people still use office apps a lot or even mainly. However, system performance has today reached a level where the user isn’t waiting for Winword or Excel anymore; instead, Winword or Excel are waiting for the user most of the time. This led to the funny expression ’Winstone is today measuring how fast the system is waiting for the user’.

Office application performance is still measurable and there are certainly still differences, but it’s really questionable how important office application performance is nowadays. Particularly in the lower end SOHO sector, people don’t really care about how fast their Winword is running. What is getting more and more important today are the eye candy joys brought about by 3D gaming. A lot of the old fashioned computer journalists are going on about how bad the Celeron processor is, completely missing the point that nobody cares how fast it runs Winword, as long as it runs it as fast as a Pentium MMX 233. What matters instead is its 3D gaming performance and, surprise surprise, it performs very well in this field, making this CPU a lot better than many publications have claimed.

Floating Point SIMD Vs. Brute FPU Power

AMD saw this development taking place a year ago already, when they decided to improve the K6 CPU by specifically increasing its 3D performance. Whilst Pentium II CPUs take their great 3D performance from brute FPU power, AMD decided to take a more elegant approach. The FPU of a CPU can do an amazing number of complicated floating point calculations, but 3D games need only some of them. What AMD did was to pick these special ’3D’ calculations and enable the CPU to do them on several numbers at the same time. Grabbing and then processing several data items at the same time is called ’SIMD’ or ’single instruction multiple data’. This does not mean that only one instruction is needed to work on multiple data; it means that the same instruction is carried out on multiple data of the same sort at the same time. 3D processing and rendering use an incredible number of matrix operations. Huge amounts of data have to be processed in the same way, which is usually done one item after the other. SIMD can improve this significantly, because grabbing, for example, four words and processing them at the same time is obviously faster than grabbing one word four times.
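The idea can be sketched in a few lines of Python. This is only a conceptual model, not real SIMD code: it counts how many 'instructions' are issued when values are multiplied one by one and when they are handled in groups. The lane width of four follows the 'four words' example above; the function names and the sample data are made up for illustration.

```python
# Conceptual model only: plain Python has no SIMD unit. The counters stand in
# for the number of instructions a CPU would have to issue.

def scalar_multiply(a, b):
    """Multiply value by value, one 'instruction' per pair of numbers."""
    out = []
    instructions = 0
    for x, y in zip(a, b):
        out.append(x * y)
        instructions += 1
    return out, instructions


def simd_multiply(a, b, lanes=4):
    """Multiply 'lanes' pairs per 'instruction', as a SIMD unit would."""
    out = []
    instructions = 0
    for i in range(0, len(a), lanes):
        # One packed operation handles a whole group of values at once.
        out.extend(x * y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        instructions += 1
    return out, instructions


# Hypothetical sample data: eight coordinates, each scaled by 0.5.
coordinates = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
scale = [0.5] * len(coordinates)

scalar_result, scalar_count = scalar_multiply(coordinates, scale)
simd_result, simd_count = simd_multiply(coordinates, scale, lanes=4)
```

Both functions return the same numbers, but the grouped version gets there with a quarter of the 'instructions'. How much of that turns into real speed depends on the CPU and on how the data is laid out in memory.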

The first time SIMD was implemented in an x86 CPU was when the Pentium w/MMX was released. Intel did a lot of work convincing us that MMX would accelerate any kind of multimedia, including 3D. Today Intel admits that MMX is mainly good for image processing, although MMX2 or ’KNI’ (’Katmai New Instructions’) is supposed to change that significantly. The difference between the K6-2’s new instruction set ’3DNow !’ and what we already know from MMX is that ’3DNow !’, like KNI, is able to do SIMD with floating point numbers (MMX can do this only with integers). This is where the 3D acceleration takes place.

Problem No.1 - New Instructions Require New Software

It took a long while for MMX software to materialise after Intel had released the Pentium w/MMX and it seems as if this was a painful experience for them. AMD is facing a similar problem with its 3DNow !. However, their situation doesn’t seem to be quite so dire. Whilst MMX software wasn’t necessarily exciting, 3D games can easily amaze people, so the demand will be higher, thus pushing game developers as well as 3D chip manufacturers into using 3DNow ! features as long as Intel will let them.

If anyone wants to take advantage of the new K6-2 and 3DNow !, there are three possible ways of doing so. Either the 3D game takes advantage of DirectX 6 by using the geometry engine of Direct3D 6, or the game has its own geometry engine which uses 3DNow ! directly. Games that are only written for DirectX 5 or which don’t use 3DNow ! in their own engine will show only small improvements or none at all with the K6-2. The third option is a 3D chip/card driver that is optimized for 3DNow !. NVIDIA is the first 3D chip manufacturer to supply a special driver for 3DNow !. It would be sensible to assume that AMD prefers game developers to use 3DNow ! directly in their games, but if this should not be an option, it’s still an advantage if the game is at least programmed for DirectX 6. It will be up to us consumers to push 3D chip makers into providing drivers that are optimized for 3DNow !.
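A game with its own geometry engine would typically check at run time whether the CPU offers 3DNow ! before using it, and fall back to ordinary FPU code otherwise. The Python sketch below only models that decision. A real engine would execute the CPUID instruction itself (AMD documents the 3DNow ! flag as bit 31 of the EDX register for extended function 0x80000001) and would call hand-written assembly routines. The function names, routine names and register values here are hypothetical.

```python
# Bit 31 of EDX from CPUID function 0x80000001 is the 3DNow ! flag on AMD CPUs.
THREEDNOW_BIT = 1 << 31


def has_3dnow(extended_edx):
    """Return True if the given extended-feature EDX value reports 3DNow !."""
    return bool(extended_edx & THREEDNOW_BIT)


def pick_geometry_path(extended_edx):
    """Choose a (hypothetical) geometry routine name for the detected CPU."""
    return "geometry_3dnow" if has_3dnow(extended_edx) else "geometry_x87"


# Hypothetical register values, made up for illustration only.
edx_with_3dnow = 0x80800000
edx_without_3dnow = 0x00800000

path_fast = pick_geometry_path(edx_with_3dnow)
path_fallback = pick_geometry_path(edx_without_3dnow)
```

Keeping the check in one place means the same game binary can run on CPUs with and without the new instructions.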

Problem No. 2 - The Future Of Socket 7

On April 15, 1998 Intel had quite a memorable day. Not only was the next generation of Pentium II CPUs with 100 MHz FSB released, but there was also the release of the Celeron CPU, which is targeted at the low end market segment. Both CPUs use Slot 1 instead of good old Socket 7. Intel wants Socket 7 to die a quick and painful death and AMD will have a rough job keeping people on this platform. It is certainly not easy to say what is going to happen with Socket 7, but it’s difficult to overlook that Slot 1 looks like the more future proof route to take right now. The K6-2 has to be a good enough product to convince people to stick to the Socket 7 platform. I doubt, however, that any Slot 1 system owner will go back to Socket 7.

Office Application Performance

3D performance may be becoming more and more important, but business application performance shouldn’t be ignored either. Although the K6-2 has a new chip design, the integer part of the CPU is pretty much still the same as that found in the K6. It was only optimized for the 100 MHz front side bus clock, mainly to assure more stable timing at this speed. The K6 was already able to run at 100 MHz FSB, but AMD is not officially supporting this, due to the timing issues mentioned above. More conservative timing slows down the CPU by a very small amount, which is why a K6 at 300/100 MHz runs business apps about 1% faster than the K6-2.
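The 'core/bus' labels used in this article, such as 300/100 or 333/95, give the core clock and the front side bus clock in MHz. A minimal sketch of the arithmetic behind them, assuming that the motherboard offers multipliers in half steps (the function name is made up for illustration):

```python
def clock_multiplier(core_mhz, fsb_mhz):
    """Return core clock divided by bus clock, rounded to the nearest half step."""
    return round(core_mhz / fsb_mhz * 2) / 2


k6_2_300 = clock_multiplier(300, 100)  # K6-2 300/100
k6_2_333 = clock_multiplier(333, 95)   # K6-2 333/95
```

With these inputs the sketch gives multipliers of 3.0 and 3.5. In the second case 95 MHz x 3.5 is 332.5 MHz, which is rounded up to the familiar 333.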

Compared to the Intel sixth generation CPUs, the K6 was already running relatively better under Windows 95 than under Windows NT, and this hasn’t changed with the K6-2. However, you may remember how much performance the K6 gains from the 100 MHz system bus compared to the 66 MHz bus. Thus the K6-2 is now definitely the fastest Socket 7 CPU in office applications, far ahead of the Cyrix ’M2’ and the IBM 6x86MX as well as the Pentium MMX.

The Pentium II, especially the new 100 MHz FSB versions at 350 and 400 MHz core clock, is still holding the office application performance crown, and this crown will only go over to the Pentium II Xeon processors with their L2 cache running at CPU clock once they’re released at the end of June. However, the K6-2 stands up pretty well to its Pentium II competitors at the same clock speed. Remember that the Pentium II, unlike the K6-2, hardly gets any benefit out of the higher front side bus clock, which is why we can expect the K6-2 at 350 and 400 MHz to be around the performance of the PII 333 and 350 once AMD releases these versions later this year.

Under Windows 95 the K6-2 300/100 is slightly slower than the Pentium II 300 and the K6-2 333/95 is slightly slower than the Pentium II 333. The performance of the K6-2 in office applications can still rise with better motherboards and larger L2 caches. The test system was only using 512 kB of L2 cache, but 1 MB and even 2 MB are possible as well and will improve the speed. The VIA MVP3 chipset looks pretty promising too, possibly scoring higher than ALI’s Aladdin V.

Windows NT is the domain of Intel’s sixth generation CPUs, so here the K6-2 300/100 and 333/95 can only score somewhere in between the results of the PII 266 and PII 300. The Celeron overclocked to 400/100 MHz is slightly slower than these CPUs under Windows 95, but under Windows NT it’s a tad faster than both of them. We should also not forget that you can run multiple CPU systems with Pentium II CPUs, which accelerates professional software very significantly. AMD’s CPUs cannot do that.

Running the high end application benchmark of Winstone 98 shows the superiority of Intel’s sixth generation processors even more. The K6-2 300/100 as well as the 333/95 version are both in between the Pentium II 233 and 266, and the Celeron overclocked to 300/100 is in the same area too.

All in all the K6-2 is offering a new office application performance push for Socket 7 and as long as you are using Windows 95 with normal business apps the performance is impressively close to what a Pentium II at the same clock speed is able to do.

Benchmark Setup

Socket 7 System :

  • Microstar MS-5169 motherboard w/512 kB L2 cache (ALI Aladdin V Chipset rev. C)
  • 64 MB Corsair PC100 SDRAM
  • IBM DGVS 09U Ultra wide SCSI hard drive
  • Adaptec 2940UW SCSI host adapter
  • Diamond Viper V330 AGP graphics card, NVIDIA reference driver 4.10.01.0250
  • Resolution 1024x768
  • Color depth 16 bit
  • Refresh rate 85 Hz

Slot 1 System :

  • Asus P2B motherboard (Intel 440BX chipset, final revision)
  • 64 MB Corsair PC100 SDRAM
  • IBM DGVS 09U Ultra wide SCSI hard drive
  • Adaptec 2940UW SCSI host adapter
  • Diamond Viper V330 AGP graphics card, NVIDIA reference driver 4.10.01.0250
  • Resolution 1024x768
  • Color depth 16 bit
  • Refresh rate 85 Hz