Physics Processing Units

Guest
Archived from groups: comp.sys.ibm.pc.hardware.chips

Yeah, I don't know what to make of this thing either. Is this something
that the main CPU is not able to handle right now? If it can't, then how
bad is the CPU at this sort of stuff? Do real-life physics
simulations require an analog component in this chip?

Yousuf Khan
 
Guest
Archived from groups: comp.sys.ibm.pc.hardware.chips

On 8 Mar 2005 12:57:19 -0800, "YKhan" <yjkhan@gmail.com> wrote:

>The next processor inside a PC, after the CPU & GPU, might very well be
>the PPU.
>
> Yousuf Khan
>
>Age of physics processing units dawns
>http://www.theinquirer.net/?article=21648

This got big play on Slashdot:

http://slashdot.org/article.pl?sid=05/03/08/1827239&tid=137

The remarks that weren't superficial were mostly ignored. There are
no details anywhere, really, worth talking about. The interview

http://www.gamers-depot.com/interviews/agiea/002.htm

at least says exactly what the PPU is supposed to do:

<quote>

Curtis: The Physics Processing Unit (PPU) is a dedicated processing
unit that was built from the ground up to accelerate the algorithms
required for physically based simulations. This includes things such
as Rigid Body Dynamics, Collision Detection, Fluid Simulation, Soft
Bodies and Fracturing of objects.

</quote>

but I didn't see any breathless claims about gigaflops, teraflops, or
precision. Too early even for speculation, except that it's hard to
see how it won't be I/O-bound.

RM
 
Guest
Archived from groups: comp.sys.ibm.pc.hardware.chips

Robert Myers wrote:
> MD-Grape is fast because it is highly specialized, see, for example,
> figure 1 of
>
> http://www.jsbi.org/journal/GIW02/GIW02P121.pdf

I figured it would be highly specialized, without even being told. I
wonder whether dual-core chips doing single-precision SSE math could
compete with this outright. I can't see people installing one of
these PPUs separately, with it taking up an entire slot by itself.
Perhaps if it were part of the graphics card, it might be accepted.

Yousuf Khan
 
Guest
Archived from groups: comp.sys.ibm.pc.hardware.chips

On 9 Mar 2005 08:53:31 -0800, "YKhan" <yjkhan@gmail.com> wrote:

>Yeah, I don't know what to make of this thing either. Is this something
>that the main CPU is not able to handle right now? If it can't, then how
>bad is the CPU at this sort of stuff? Do real-life physics
>simulations require an analog component in this chip?
>

The "PPU" could be little more than a chip that IBM already makes.
It's called MD-Grape:

http://www.research.ibm.com/grape/

That would do the particle trajectory stuff. Fluid mechanics can be
done with particles, but maybe they have something more elaborate in
mind. Having a PhD from a department that has a particular reputation
in fracture mechanics, I'd love to know what kind of fracture
mechanics they're intending to do.

MD-Grape is fast because it is highly specialized, see, for example,
figure 1 of

http://www.jsbi.org/journal/GIW02/GIW02P121.pdf

RM
 
Guest
Archived from groups: comp.sys.ibm.pc.hardware.chips

On 9 Mar 2005 11:30:30 -0800, "YKhan" <yjkhan@gmail.com> wrote:

>Robert Myers wrote:

>> MD-Grape is fast because it is highly specialized, see, for example,
>> figure 1 of
>>
>> http://www.jsbi.org/journal/GIW02/GIW02P121.pdf
>
>I figured it would be highly specialized, without even being told. I
>wonder whether dual-core chips doing single-precision SSE math could
>compete with this outright. I can't see people installing one of
>these PPUs separately, with it taking up an entire slot by itself.
>Perhaps if it were part of the graphics card, it might be accepted.
>

There's specialized and there's specialized, and MD-Grape is even more
specialized than, say, a GPU. MD-Grape is just a bunch of
multiply-accumulate pipes. From a power-consumption point of view, if
nothing else, it's a specialization that's well worth doing if you
have enough particles to keep track of.

The naive particle-particle problem has N*(N-1)/2 pairwise
interactions, each a handful of multiply-accumulates. That's exactly
the kind of completely predictable streaming calculation for which
stream processors are well suited, but the natural computational
geometry of a GPU or the Cell processor (multiple stages operating on
a single stream of data) would be sub-optimal for the calculation,
which requires only a single stage.
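
To make the pair count concrete, here is a minimal Python sketch of the
naive particle-particle loop. The inverse-square (gravity-like) force,
the unit constants, the softening term eps, and the function names are
all assumptions chosen for illustration; nothing here is taken from
MD-Grape or from the PPU, whose internals haven't been published.

```python
import math

def pair_count(n):
    """Number of distinct particle pairs among n particles."""
    return n * (n - 1) // 2

def naive_forces(pos, mass, eps=1e-9):
    """Accumulate pairwise inverse-square forces the naive way.

    pos  : list of (x, y, z) positions
    mass : list of masses (force constant assumed to be 1)
    eps  : small softening term (an assumption) to avoid division by zero
    Returns (forces, pairs_visited).
    """
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):          # each pair visited exactly once
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2] + eps
            s = mass[i] * mass[j] / (r2 * math.sqrt(r2))
            for k in range(3):
                forces[i][k] += s * d[k]   # the multiply-accumulate step
                forces[j][k] -= s * d[k]   # equal and opposite reaction
            pairs += 1
    return forces, pairs

if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
    f, visited = naive_forces(pts, [1.0, 1.0, 1.0, 1.0])
    print(visited, pair_count(len(pts)))
```

The inner body has no data-dependent branches and touches each pair
once, which is why a bank of identical multiply-accumulate pipes can
work through it without any multi-stage pipeline.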

RM