Nvidia's CUDA: The End of the CPU?


Meanwhile, as CPU makers were tearing their hair out trying to find a solution to their problems, GPU makers were continuing to benefit more than ever from the advantages of Moore’s Law.


Why weren’t they handicapped in the same way as their confreres who design CPUs? The reason is very simple: CPUs are designed to get maximum performance from a single stream of instructions that operates on diverse data (such as integer and floating-point values) and performs random memory accesses, branching, etc. Up to that point, architects had been working to extract more instruction-level parallelism, that is, to launch as many instructions as possible in parallel. Accordingly, the Pentium introduced superscalar execution, making it possible to launch two instructions per cycle under certain conditions. The Pentium Pro ushered in out-of-order execution of instructions in order to make optimum use of calculating units. The problem is that there’s a limit to the parallelism that can be extracted from a sequential stream of instructions, and consequently, blindly increasing the number of calculating units is useless, since they would remain unused most of the time.
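The limit on parallelism in a sequential instruction stream can be illustrated with a toy model. The sketch below is only an illustration, not a model of any real CPU: it assumes every instruction takes exactly one cycle, that the number of calculating units is unlimited, and that only true data dependencies matter. The `min_cycles` helper and the register names are hypothetical.

```python
def min_cycles(program):
    """Return the fewest cycles needed to run `program` with unlimited units.

    `program` is a list of (dest, sources) pairs in program order. An
    instruction can start only once every source it reads has been produced.
    Assumption: each instruction takes exactly one cycle.
    """
    ready = {}  # register name -> cycle at which its value becomes available
    finish = 0
    for dest, sources in program:
        # Start as soon as the latest-arriving source is available.
        start = max((ready.get(s, 0) for s in sources), default=0)
        ready[dest] = start + 1
        finish = max(finish, ready[dest])
    return finish


# A dependent chain: each instruction needs the previous result.
chain = [("a", []), ("b", ["a"]), ("c", ["b"]), ("d", ["c"])]

# Four instructions with no dependencies between them.
independent = [("a", []), ("b", []), ("c", []), ("d", [])]

print(min_cycles(chain))        # 4
print(min_cycles(independent))  # 1
```

Even with unlimited units, the dependent chain still needs four cycles, which is why adding more units to a CPU eventually stops paying off.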

Conversely, the operation of a GPU is sublimely simple. The job consists of taking a group of polygons, on the one hand, and generating a group of pixels on the other. The polygons and pixels are independent of each other, and so can be processed by parallel units. That means that a GPU can afford to devote a large part of its die to calculating units which, unlike those of a CPU, will actually be used.
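The independence of pixels can be sketched in a few lines. This is a minimal CPU-side illustration using Python threads, not GPU code: the `shade` function, the image size, and the worker count are all hypothetical and chosen only to show that the result does not depend on the order in which pixels are processed.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 8, 4  # hypothetical image size


def shade(x, y):
    """Hypothetical per-pixel function: depends only on its own coordinates."""
    return (x * 31 + y * 17) % 256


def shade_row(y):
    """Compute one row of pixels; no row reads another row's results."""
    return [shade(x, y) for x in range(WIDTH)]


# Sequential reference: one row after another.
sequential = [shade_row(y) for y in range(HEIGHT)]

# Parallel version: rows are handed to worker threads and may run in any
# order; pool.map returns the results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(shade_row, range(HEIGHT)))

print(parallel == sequential)  # True
```

Because no pixel reads another pixel's result, the work can be split across as many units as are available without any coordination between them.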


GPUs differ from CPUs in another way. Memory access in a GPU is extremely coherent: when a texel is read, a few cycles later the neighboring texel will be read, and when a pixel is written, a few cycles later a neighboring pixel will be written. With memory organized intelligently, performance comes close to the theoretical bandwidth. That means that a GPU, unlike a CPU, doesn’t need an enormous cache, since the cache’s role is principally to accelerate texturing operations. A few kilobytes are all that’s needed to hold the few texels used in bilinear and trilinear filters.
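To see why so few texels are needed, here is a minimal sketch of bilinear filtering on a tiny single-channel texture. It is an illustration under stated assumptions, not how any particular GPU implements it: the `bilinear_sample` helper is hypothetical, coordinates are given in texel units and assumed non-negative, and edges are clamped.

```python
import math


def bilinear_sample(texture, u, v):
    """Sample `texture` (a list of rows) at fractional texel coordinates.

    Assumes 0 <= u <= width - 1 and 0 <= v <= height - 1. Returns the
    filtered value and the list of (x, y) texels that were read.
    """
    height, width = len(texture), len(texture[0])
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    fx, fy = u - x0, v - y0  # fractional position inside the 2x2 block
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)  # clamp at edges
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    value = top * (1 - fy) + bottom * fy
    return value, [(x0, y0), (x1, y0), (x0, y1), (x1, y1)]


texture = [
    [0.0, 10.0, 20.0],
    [30.0, 40.0, 50.0],
    [60.0, 70.0, 80.0],
]

value, texels = bilinear_sample(texture, 0.5, 0.5)
print(value)   # 20.0
print(texels)  # [(0, 0), (1, 0), (0, 1), (1, 1)]

# The sample for the next pixel over reuses two of the same texels.
_, next_texels = bilinear_sample(texture, 1.5, 0.5)
print(sorted(set(texels) & set(next_texels)))  # [(1, 0), (1, 1)]
```

Each sample touches only a 2x2 block of texels, and adjacent samples overlap, so a small cache covering the current neighborhood is enough to serve most reads.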


  • Anonymous
    CUDA software enables GPUs to do tasks normally reserved for CPUs. We look at how it works and its real and potential performance advantages.
  • Anonymous
    Well, if the technology was used just to play games, yes, it would be crap tech; spending billions just so we can play Quake doesn't make much sense ;)
  • MTLance
    Wow a gaming GFX into a serious work horse LMAO.
  • dariushro
    The best thing that could happen is for M$ to release an API similar to DirectX for developers. That way both ATI and NVidia can support the API.
  • dmuir
    And no mention of OpenCL? I guess there's not a lot of details about it yet, but I find it surprising that you look to M$ for a unified API (who have no plans to do so that we know of), when Apple has already announced that they'll be releasing one next year. (unless I've totally misunderstood things...)
  • neodude007
    Im not gonna bother reading this article, I just thought the title was funny seeing as how Nvidia claims CUDA in NO way replaces the CPU and that is simply not their goal.
  • LazyGarfield
    I'd like it better if DirectX weren't used.

    Anyways, NV wants to sell cuda, so why would they change to DX ,-)
  • Anonymous
    I think the best way to go for MS is to announce support for OpenCL, like Apple. That way it will make things a lot easier for the developers, and it makes MS look good for supporting the open standard.
  • Shadow703793
    Mr Roboto: Very interesting. I'm anxiously awaiting the RapiHD video encoder. Everyone knows how long it takes to encode a standard definition video, let alone an HD or multiple HD videos. If a 10x speedup can materialize from the CUDA API, let's just say it's more than welcome.

    I understand from the launch of the GTX280 and GTX260 that Nvidia has a broader outlook for the use of these GPUs. However I don't buy it fully, especially when they cost so much to manufacture and use so much power. The GTX 280 has been reported as using upwards of 300w. That doesn't translate to that much money in electrical bills over a span of a year, but nevertheless it's still moving backwards. Also don't expect the GTX series to come down in price anytime soon. The 8800GTX and its 384-bit bus is a prime example of how much these devices cost to make. Unless CUDA becomes standardized it's just another niche product fighting against other niche products from ATI and Intel.

    On the other hand though, I was reading on Anand Tech that Nvidia is sticking 4 of these cards (each with 4GB RAM) in a 1U form factor using CUDA to create ultra cheap supercomputers. For the scientific community this may be just what they're looking for. Maybe I was misled into believing that these cards were for gaming and anything else would be an added benefit. With the price and power consumption this makes much more sense now.

    Agreed. Also I predict in a few years we will have a Linux distro that will run mostly on a GPU.
  • kelfen
    Well this is a huge step, hope to see it successful.
  • LogicalError
    FYI: Apple has been working with the Khronos group (the people behind OpenGL at the moment) to make an API called OpenCL which should do all the things that CUDA et al. can do. Since it's not just Apple that's behind it, but also the Khronos group, it should be cross-platform. So who knows... maybe this is going to be the unifying API for this... well, until Microsoft comes up with 'DirectC' of course
  • Anonymous
    the last page comments on how MS could come in and create a common API, this common API is already in process, its just that MS isn't part of it ;)
  • Anonymous
    I know that this is not too close to the article, but I hope that it is still not too off topic.
    I just have a question, and someone might answer it (TH is full of smart guys). My problem is that there are too many misconceptions floating around on the net regarding CUDA and the whole GPGPU business overall.
    I have seen somewhere that these GPUs are able to do double-precision floating-point calculations, but personally I find this unlikely.
    Others say that you can directly take your parallel code written in C or Fortran90 and adapt it to CUDA, because the standard stuff can run serially on the CPU and the most computationally expensive part in parallel on the GPU. On top of that you can 'address' or communicate with your GPU directly from Fortran code with a sort of system call (I think this is BS).
    Quite frankly, I have not found a site I can really rely on, where they show an example (source code and explanation) of how something like this could be done.
  • bf2gameplaya
    I wish Intel and NVidia would get over themselves and co-operate and finally give total system performance that big ass boost it needs.

    Intel is wasting time ray-tracing on a CPU and NVidia is wasting frames by folding proteins on their GPU.

    "You're doing it wrong!"
  • Anonymous
    No, the best would be if we got an open API, like OpenGL. I seriously do not want another DirectX locking me to MS >_<
  • thr3ddy
    @dariushro: That would quite possibly be the worst thing that could happen to GPGPU. Microsoft equals Windows and GPGPU and super computing is not Windows' strongest point (understatement).

    It would be better for a neutral party composed of GPGPU experts from different IHVs to initiate something like what you propose, more like what the OpenGL ARB creates, a specification.

    IHVs and other companies could then implement this standard on their own hardware, thus decentralizing development from the ISV. If you leave development of this type of technology up to Microsoft (or any other single developer) you'll end up with vendor lock-in, which is a Bad Thing, for all of us.

    Anyway, CUDA is great but not cross-platform compatible (Intel, AMD/ATI, etc.) which makes it impossible to implement in commercial software, unless a CPU-bound alternative is provided, which would defeat the purpose of the architecture.

    On a similar note: think of the choice between the PhysX SDK and Havok Physics. Do you want partial GPU accelerated physics supported by one brand (PhysX, NVIDIA G80+) or do you want to stay CPU-bound but have the same feature set regardless of the hardware (Havok)?
  • magnesious
    If you had the patience to read this entire thing, I'd recommend you look at the CUDA programming guide. It's the same information, but less terse.

    Tom's also forgot to point out that development is possible via emulation (emuDebug build setting, I think, with the .vcproj they give you), so anyone can get their hands dirty with the API. You don't get the satisfaction of seeing cool speedups, but it's just as educational, and easier to debug. No screen flickers :)
  • MxM
    I wonder if a PC can be built today without a processor at all? It would probably require a different BIOS for the mobo and some kind of x86 emulator for the NVIDIA card, but is it possible in principle without any modifications in hardware?
  • godmodder
    The end of the CPU is nowhere near. To think the GPU could be used for every task is just absurd. The GPU is only good for tasks which can be massively parallelized. Unfortunately, not that many tasks, apart from graphical processing, can be divided into smaller, completely independent parts.