The Power 5 is a processor, correct? Looks to me like it has multiple dies and, thusly, multiple cores. All on one big honking chip. That's my point.
No, it's multiple chips containing 2 cores each on one package (i.e. the substrate). I'm not sure if it's 2 cores on 1 die per chip or 2 cores on 2 dies per chip, but there are two cores per piece of silicon, and then the chips are put on one gigantic package/socket.
The terminology gets confusing, but the Xenon is multi-core (one die with 3 cores sharing L2 cache), the original P4D 8-series was one die with two cores, and the P4D 9-series is two dies with 1 core each in one package. The Power 5 is 8 cores on 4 chips. It has a multi-core aspect, but its power is in the number of chips placed on a single package. Now this is my point: 'multi-core' makes little sense; multi-chip will make sense and even become 'necessary to an extent' when fab processes reach the limits of Moore's law, or at least reach diminishing returns with regard to surface area and to transistor layout and density.
And when I said that multi-core technology was already in place, I was not referring to GPUs.
Ok, that's true then of course.
I still feel that multi-core processors are the wave of the future. In my opinion your outlook is overly myopic.
Never, but that's your optics on my position.
Single-core CPUs, while a viable option HERE AND NOW, will become a rare breed within 5 years.
True, but that's required a heck of a shift in both manufacturing and programming. Also remember that CPU design is extremely different from VPU design; CPUs are already brushing up against Moore's law and are hitting ridiculously faster speeds than VPUs. Their design constraints are different too: upgrading a CPU is easy, upgrading a VPU is impossible. You buy a whole new card with a VPU, whereas you upgrade the CPU to match your motherboard with a faster CPU. Making it multi-CPU requires a new, different, more expensive MoBo; upgrading the VPU, you don't get any advantage anyways, so who cares what the board mfr does? All those factors come into play in the multi-package versus multi-core situation. Also, you still haven't shown the benefit of 2 x 16 pipes versus 32 pipes, let alone addressed the limitations of both a unified shader design and the inability to process using 2 cores as if they were one. The dispatching is a difficult task when it's core A versus core B on top of just pixel/vertex/geometry, and loops and such would create havoc for cross-core communication; having parallel pipelines that share buffers and such makes far more sense. CPUs only work well with multi-core because of the massive amount of groundwork before it, with people like myself who've owned dual-proc rigs, and even then the efficiencies are nowhere near those of a pipeline increase in a VPU.
Ten years from now they may well be nigh impossible to find. It's simply the evolution of processor technology combined with the increasing need for running multiple threads.
But CPU and VPU design/programming are going in completely opposite directions. CPUs are dividing the work into pre-determined loads, not sharing it based on processor needs. CPU0 still doesn't do half the work while CPU1 does the other half; they send tasks off under specified guidelines. Whereas the VPU divides all its components up and sends them to where they're needed, all the time. The move to unified shaders means that those multiple parts just do the task needed. It's all divided up, and there's no need to define things anymore (not even between pixel and vertex); it simply says this is the load, act like this, next! Having multiple cores would add a level of complexity where you divide the easily divisible components, and then have to recombine them later before recombining them into a coherent image. So 2 divisions and 2 recombinations, instead of one break-up into parts and then one recombination. I just don't see that as being more efficient, regardless of how fast it can be done.
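To put rough numbers on the dispatch argument above, here's a toy sketch in Python. Everything in it is made up for illustration: the function name `makespan`, the unit counts (4 units in one pool versus 2 cores of 2 units each) and the task costs are assumptions, and it says nothing about how any real VPU schedules work. It only shows what happens when the load is uneven and the split between cores is fixed up front.

```python
import heapq

def makespan(costs, units):
    """Greedy dispatch: each task goes to whichever unit frees up first.
    Returns the time at which the last unit finishes."""
    free_at = [0] * units          # time at which each unit becomes idle
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)       # earliest available unit
        heapq.heappush(free_at, start + cost)
    return max(free_at)

# Hypothetical frame: the top half is shader-heavy, the bottom half is cheap.
top_half = [4] * 8
bottom_half = [1] * 8

# One big pool of 4 units pulling from a single work list.
unified = makespan(top_half + bottom_half, units=4)

# Two "cores" of 2 units each, with the frame split between them up front.
# The frame is done only when the slower core is done.
split = max(makespan(top_half, units=2), makespan(bottom_half, units=2))

print("unified pool:", unified)
print("two cores   :", split)
```

With these made-up costs the single pool stays busy the whole time, while the pre-split version leaves one core idle waiting on the other, and that's before counting any cost for the extra recombination step.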
Oblivion, for instance, shows an improvement with dual-core CPUs. What exactly makes you think that a multi-core GPU could not garner comparable improvements?
Because VPUs are already far more parallel than CPUs, and turning it into essentially 'fast SLI/Xfire' at the core level will not improve performance as much as twice the pipelines or ALUs/TMUs/etc.
I believe that it will happen, you believe it will not. Time will tell.
Just like everything else, time is the only arbiter, but my statements about the probability of these options are based on the current benefits and limits of design and on the direction both companies are headed in their design, whereas your basis is only what worked for general-purpose CPUs.
Multiple threads are already there; multiple packages on a card for sure, maybe eventually looking like the Voodoo5, and more necessary as fab processes get to be an issue, IMO.
But moving from massively parallel, high-count shaders/pipes to separate cores? I think it's highly unlikely, and nothing that's been said here has changed my view on why it's impractical.