Nvidia's acquisition of Ageia in 2008 was a strategic move to boost the marketability of its GPU offerings. With the discontinuation of the dedicated PhysX boards, hardware physics acceleration moved to GeForce GPUs as a differentiating feature that set them apart from AMD's ATI cards.
If a PhysX game detected the presence of an Nvidia GPU, it would offload the hardware physics to the video card. Without an Nvidia board, the physics would fall back to the CPU, which is invariably slower at this workload than a GPU.
It's to be expected that Nvidia would do everything it can to distance itself from the CPU and the GPUs of its competitors, but a closer look at the PhysX software implementation suggests there could be some shadiness going on.
An excellent investigation by David Kanter at Real World Technologies found that Nvidia's PhysX software implementation for CPUs still uses x87 code, which Intel deprecated in 2005 and which has since been fully replaced by SSE. Intel has supported SSE since 2000, and AMD since 2003.
The x87 code is slow, ugly, and remains supported on today's modern CPUs solely for legacy reasons. In short, there is no technical reason for Nvidia to keep running PhysX on CPUs with such terrible software when moving to SSE would speed things up considerably – unless doing so would make the GeForce GPGPU look less mighty compared to the CPU.
Ars Technica's Jon Stokes confronted Nvidia about the deficient PhysX code, and we were just as surprised as he was when Mike Skolones, product manager for PhysX, admitted: "It's a creaky old codebase, there's no denying it."
Nvidia's defense is that much of the optimization is up to the developer, and that when a game is ported from console to PC, the PC's CPU will usually run the physics better than its console counterpart out of the box. That 'already better' performance can lead developers to leave the code as-is rather than pursue further optimizations.
"It's fair to say we've got more room to improve on the CPU. But it's not fair to say, in the words of that article, that we're intentionally hobbling the CPU," Skolones said. "The game content runs better on a PC than it does on a console, and that has been good enough."
Another problem is that the current PhysX 2.7 codebase is very old – so old, in fact, that it predates 2005, when x87 was deprecated. Skolones said Nvidia is working on version 3.0, which should bring things somewhat up to date, though we don't doubt that the GPGPU path will still be faster.
This isn't the first time that we've heard that Nvidia's PhysX software is less than well-optimized. AMD accused Nvidia of disabling multi-core support in CPU PhysX earlier this year.
Seriously need better standardization and a massive spring clean of the x86 code base and all the extensions since; it could bring a lot of performance improvement for very little expenditure of CPU resources, much like optimizing any bad code.
I really wouldn't be surprised if AMD eventually brought about and advocated the development of a new "Open-PL" physics standard or something.
Silly to say nvidia is deliberately hobbling physx on the cpu, though; given that ALL the physx code is old (by the author's admission) it's just old code. I'm an nvidia fan but the truth is that physx is so poorly adopted among game developers it wouldn't bother me in the least to just see it go away.
For reference, x86 CPUs without SSE support include:
- Pentium Pro
- Pentium II
- early K7: Athlon (up to 1200 MHz) and Duron (up to 950 MHz)
So, at the time the software-only implementation of PhysX was written (around 2005, I'd say), there were still some SSE-less machines around.
Of course, Nvidia has no excuse for not having shipped an SSE-optimized build by now, with an x87 fallback.
I do agree with your premise, but not the language you use. Optimizing your code to use SSE instructions can give you a 3x to 4x speedup. However, it takes effort. NVIDIA would rather spend the resources on the parts of PhysX that sell its GPUs than maintain the CPU compatibility code.
This strategy may backfire, since they still make money by licensing the PhysX technology to game developers. If they cripple their physics engine for non-NVIDIA setups, they will lose revenue and market share to Havok. Developers want their games to work on as many configurations as possible. If they can't have what they want from NVIDIA, they will go to someone who will provide it.
But with multi-core CPUs the norm now, why would I want to dedicate part of my graphics computing power for physics? Rhetorical question; I wouldn't. I'd rather have the game written to be optimized for multiple cores, and perhaps dedicate one core to physics, if it would help. Seems silly to have CPU cores idling while the GPU does double duty.