
Nvidia: Moore's Law is Dead, Multi-core Not Future

Source: Tom's Hardware US | 104 comments

Nvidia chief scientist says that everyone needs to rethink processors in order for Moore's Law to continue.

Bill Dally, the chief scientist and senior vice president of research at Nvidia, wrote an article for Forbes arguing that Moore's Law, popularly understood as the expectation that transistor counts (and with them performance) double roughly every 18 months to two years, is dead.

The problem, according to Dally's Forbes article, is that current CPU architectures are still serial processors, while he believes the future lies in parallel processing. He gives the example of reading an essay: a single reader can only read one word at a time, but assigning each paragraph to its own reader would greatly accelerate the process.
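To make the analogy concrete, here is a minimal, hypothetical C++ sketch (not from Dally's article): the "essay" is split into paragraphs and each paragraph is handed to its own thread, in contrast to one reader scanning every word in order.

```cpp
// Hypothetical illustration of the essay analogy: each "reader" (thread)
// processes one paragraph, instead of one reader scanning every word serially.
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Count words in one paragraph -- the work done by a single "reader".
static std::size_t count_words(const std::string& paragraph) {
    std::istringstream in(paragraph);
    std::string word;
    std::size_t n = 0;
    while (in >> word) ++n;
    return n;
}

int main() {
    std::vector<std::string> paragraphs = {
        "Moore's Law predicts transistor counts will keep rising.",
        "Serial CPUs spend too much energy per instruction.",
        "Throughput processors trade serial speed for parallel efficiency."
    };

    std::vector<std::size_t> counts(paragraphs.size(), 0);
    std::vector<std::thread> readers;

    // One thread per paragraph; each writes only its own slot, so no locking is needed.
    for (std::size_t i = 0; i < paragraphs.size(); ++i)
        readers.emplace_back([&, i] { counts[i] = count_words(paragraphs[i]); });
    for (auto& t : readers) t.join();

    std::size_t total = 0;
    for (std::size_t c : counts) total += c;
    std::cout << "total words: " << total << "\n";  // same answer a serial reader would get
    return 0;
}
```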

"To continue scaling computer performance, it is essential that we build parallel machines using cores optimized for energy efficiency, not serial performance. Building a parallel computer by connecting two to 12 conventional CPUs optimized for serial performance, an approach often called multi-core, will not work," he wrote. "This approach is analogous to trying to build an airplane by putting wings on a train. Conventional serial CPUs are simply too heavy (consume too much energy per instruction) to fly on parallel programs and to continue historic scaling of performance."

"Going forward, the critical need is to build energy-efficient parallel computers, sometimes called throughput computers, in which many processing cores, each optimized for efficiency, not serial speed, work together on the solution of a problem. A fundamental advantage of parallel computers is that they efficiently turn more transistors into more performance," Dally added.

Dally also argued that focusing on parallel computing architectures will help resurrect Moore's Law: "Doubling the number of processors causes many programs to go twice as fast. In contrast, doubling the number of transistors in a serial CPU results in a very modest increase in performance--at a tremendous expense in energy."
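The "twice as fast" claim holds when nearly all of a program's work can run in parallel. One rough way to quantify the limit, using the standard Amdahl's-law estimate (not part of Dally's article), is to compute the speedup 1 / ((1 - p) + p / n) for a parallelizable fraction p on n cores:

```cpp
// Back-of-the-envelope check on the "doubling processors doubles speed" claim,
// using the standard Amdahl's-law estimate (not part of Dally's article):
// speedup(n) = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of the work.
#include <cstdio>

int main() {
    const double fractions[] = {1.00, 0.95, 0.50};   // fully, mostly, half parallel
    const int cores[] = {1, 2, 4, 8, 16};

    for (double p : fractions) {
        std::printf("parallel fraction %.2f:", p);
        for (int n : cores)
            std::printf("  %d cores -> %.2fx", n, 1.0 / ((1.0 - p) + p / n));
        std::printf("\n");
    }
    return 0;
}
```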

One big driver of current processor design is the software written to run on today's chips. Dally said that the long-standing, 40-year-old serial programming practices are ones that will be hard to change, and that programmers trained in parallel programming are scarce.

"The computing industry must seize this opportunity and avoid stagnation, by focusing software development and training on throughput computers - not on multi-core CPUs," Dally concluded. "Let's enable the future of computing to fly--not rumble along on trains with wings."

Comments
  • figgus, May 5, 2010 3:37 PM (-9)
    Translation:

    "Our tech is the future, everyone else has no idea what they are doing. Please buy our GPGPU crap, even though it is inferior to what our competitors are making right now for everyday use."
  • 2zao, May 5, 2010 3:38 PM (+6)
    !! Maybe this will open some eyes... but I doubt many for now...

    Too many are in a stupor doing things the way that is the norm and easiest for them instead of how it should be... how long will it take for people to wake up to the direction things need to go?
  • yay, May 5, 2010 3:39 PM (+17)
    That's all well and good, until you need to do one thing BEFORE another, like when rendering a scene. Or maybe he forgot that.
  • eyemaster, May 5, 2010 3:48 PM (+11)
    What are you waiting for then, Bill Dally, go ahead and create that chip... ...ha, that's what I thought, even you can't do it.
  • mindless728, May 5, 2010 3:48 PM (+23)
    Except without serial optimizations, general apps (not compute apps) will suffer, since the serial optimizations allow for fast comparisons, whereas the compute cores on the GPU are very inefficient at this. Yes, it will help computing, but general apps will suffer.

    And really, some programs (algorithms) can never be turned into a parallel app.
  • matt_b, May 5, 2010 3:48 PM (+32)
    Quote:
    and that programmers trained in parallel programming are scarce.

    I totally agree with this statement here. However, if this were to change, and more were trained in how to properly program for parallel computing, then the same could be said about the need to train more on how to properly program for serial/series computing - which is where we are currently in processor design. I think it's more fair to say the insufficiency lies on both sides.

    On another note, am I the only one finding it amusing that the chief scientist of R&D at Nvidia is stating the CPU consumes too much energy??? Did he forget about the monster they just released, or does he still consider it to be within acceptable power requirements or efficient enough?
  • ravewulf, May 5, 2010 3:52 PM (+8)
    Quote:
    Dally said that the long-standing, 40-year-old serial programming practices are ones that will be hard to change, and that programmers trained in parallel programming are scarce.


    This. Extremely few programs today even properly use the limited number of cores we have now. Look at all the programs that are still single-threaded but could easily benefit from parallelism (QuickTime and iTunes, for example). There are also algorithms that simply CAN'T be made parallel (some parts of video encoding depend on previous results for the next task).
  • triculious, May 5, 2010 3:54 PM (+5)
    while I agree that there are instances where parallel processing works way better than serialized, you can't altogether switch from one to the other
    then there's parallelized code, which is hell for programmers

    and then there's what we could call "Dally's law": your graphics card must be twice as hot every 12 months
  • rhino13, May 5, 2010 3:55 PM (+4)
    Wait, so then you'd have a bunch of people who only understood one paragraph and nothing else? It's all gotta go back to serial at some point! This is a bad example.

    But I do agree with what he's saying. We need to put more effort into parallel speed than serial.
  • dman3k, May 5, 2010 3:56 PM (+9)
    What he says is true. It's the programmers' fault for not using more parallel programming. But unfortunately, there are only so many things that you can parallelize.

    His reading-an-essay analogy is the perfect example of that. People have to read one word at a time; getting a bunch of people to each read a few words and be done would make no sense at all.
  • MxM, May 5, 2010 3:59 PM (+9)
    Moore's Law is only about the number of transistors. It is irrelevant how many processors are built with those transistors. It does not say that the number of transistors is proportional to performance or MHz or anything like that. I find the reference to Moore's Law in Nvidia's paper just a marketing trick to promote their architecture and CUDA. What they are discussing has nothing to do with Moore's Law; quite the contrary, it is how to get better performance from the same number of elements.
  • killerclick, May 5, 2010 4:01 PM (+9)
    Why not take both approaches?
  • nevertell, May 5, 2010 4:05 PM (-1)
    The problem now is x86 and the user base that depends on it. What we need is a new mainstream architecture that emulates x86 and does parallel stuff really fast.
  • peanutsrevenge, May 5, 2010 4:05 PM (+5)
    This touches on something I've been saying for a while: a MASSIVE change is required, but it needs hardware and software developers to work together and change together. One cannot change without the other!
  • daworstplaya, May 5, 2010 4:18 PM (+2)
    Quote (rhino13):
    Wait, so then you'd have a bunch of people who only understood one paragraph and nothing else? It's all gotta go back to serial at some point! This is a bad example. But I do agree with what he's saying. We need to put more effort into parallel speed than serial.

    That's a bad example; think of it as doing video encoding. If you have one core doing all the work, then that core has to do every frame line by line before it can move to the next frame. But if you had four cores, you could divide each frame into four parts and each core could work on its part before moving to the next frame. Obviously there would need to be a controller that kept all the cores in sync and combined each part of the frame back into a whole so that it would make sense to the end user. However, even with that overhead it would still be much faster.

    Quote (dman3k):
    What he says is true. It's the programmers' fault for not using more parallel programming. But unfortunately, there's only so many things that you can parallel. His reading an essay analogy is the perfect example of that. People have to read one word at the time. Not getting a bunch of people to read a few words and be done, because that would make no sense at all.

    Human understanding and core thread execution are two different things, and I don't think you can use that analogy when trying to differentiate parallel vs. serial processing. Even if it doesn't make sense to each individual core that is only reading a small portion, it will make sense to the end user who ultimately reads the document once the controller puts it back together at the end.
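A minimal sketch of the frame-splitting approach described in the comment above, assuming a toy 640x360 frame and a stand-in per-pixel transform rather than a real encoder: each worker thread handles one horizontal band, and the joining thread plays the role of the "controller" that waits for every band before moving on.

```cpp
// Illustrative frame-splitting sketch (not a real encoder): each worker thread
// processes one horizontal band of the frame, and the joining thread acts as
// the "controller" that keeps the workers in sync.
#include <cstdint>
#include <iostream>
#include <thread>
#include <vector>

constexpr int kWidth = 640, kHeight = 360, kWorkers = 4;

// Stand-in for per-pixel encoding work on rows [row_begin, row_end).
static void encode_band(std::vector<std::uint8_t>& frame, int row_begin, int row_end) {
    for (int y = row_begin; y < row_end; ++y)
        for (int x = 0; x < kWidth; ++x)
            frame[y * kWidth + x] = static_cast<std::uint8_t>((x ^ y) & 0xFF);
}

int main() {
    std::vector<std::uint8_t> frame(kWidth * kHeight, 0);
    std::vector<std::thread> workers;

    // Divide the frame into kWorkers bands; each core gets its own rows, so no locks are needed.
    const int rows_per_band = kHeight / kWorkers;
    for (int i = 0; i < kWorkers; ++i) {
        int begin = i * rows_per_band;
        int end = (i == kWorkers - 1) ? kHeight : begin + rows_per_band;
        workers.emplace_back(encode_band, std::ref(frame), begin, end);
    }
    for (auto& t : workers) t.join();  // "controller": wait until every band of this frame is done

    std::cout << "frame encoded by " << kWorkers << " workers\n";
    return 0;
}
```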
  • Trueno07, May 5, 2010 4:20 PM (+2)
    This change isn't something that consumers will clamor for, or something that software companies will push. A revolution in technology, whatever it may be, will be needed to bring this type of computing to the mainstream. Whether it be a parallel programing language, or some type of chip, something will have to be released so that everyone will look at parallel processing and say "That's the future, right there".
  • etrom, May 5, 2010 4:23 PM (+8)
    Quote:
    Dally said that the long-standing, 40-year-old serial programming practices are ones that will be hard to change, and that programmers trained in parallel programming are scarce.


    Ok, how about turning the billions of working COBOL lines of code running in mainframes of huge companies into parallel computing? You, Mr. Dally, do you accept this challenge?

    Nvidia is becoming a hugely biased company, lobbying everyone about parallel computing. We are still waiting for something real (and useful) to be impressed by.
  • jenesuispasbavard, May 5, 2010 4:32 PM (+2)
    I've considered parallelising numerical integration, and I for one think it is impossible. You NEED the results of the previous step in order to process the next step. Parallel execution at different time steps is impossible.

    Serial processors still have their uses, and in applications like this, they're so much faster than one CUDA core, say (which is all I'll be able to use).
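A small illustration of the dependence described in the comment above (a hypothetical example, not the commenter's code): explicit Euler time stepping for dy/dt = -y, where each step needs the previous result, so the time loop itself cannot be distributed across cores.

```cpp
// Sketch of a loop-carried dependence in numerical integration: explicit Euler
// time stepping, where every step needs the result of the previous one, so the
// time loop cannot be split across cores (only the work inside a step could be).
#include <cstdio>

int main() {
    // dy/dt = -y, y(0) = 1; exact solution is exp(-t).
    double y = 1.0;
    const double dt = 0.001;
    const int steps = 5000;

    for (int n = 0; n < steps; ++n) {
        y = y + dt * (-y);   // y[n+1] depends on y[n]: the steps must run in order
    }

    std::printf("y(%g) ~= %f\n", steps * dt, y);
    return 0;
}
```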
  • mindless728, May 5, 2010 4:51 PM (+3)
    It's not that a lot of programmers aren't trained in parallel apps, it's just not easy to do. The easiest apps to make parallel are mathematical algorithms, though any shared data has to have a lock so multiple threads can't access it at the same time, and this really hurts performance.
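A tiny sketch of the locking cost mentioned above, assuming four threads incrementing one shared counter: the result is correct, but every update serializes on the same mutex, which is exactly the overhead the comment points at. (Real code would use per-thread partial sums or atomics instead.)

```cpp
// Illustration of lock contention: several threads updating one shared counter
// must take turns on the mutex, so much of the "parallel" work waits in line.
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    long long shared_total = 0;
    std::mutex m;
    std::vector<std::thread> threads;

    for (int t = 0; t < 4; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < 100000; ++i) {
                std::lock_guard<std::mutex> lock(m);  // every increment fights for the same lock
                ++shared_total;
            }
        });
    }
    for (auto& th : threads) th.join();

    std::cout << "total = " << shared_total << "\n";  // correct, but heavily serialized by the lock
    return 0;
}
```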