Google's 'Cloud TPU' Does Both Training And Inference, Already 50% Faster Than Nvidia Tesla V100


At Google I/O 2017, Google announced its next-generation machine learning chip, called the “Cloud TPU.” Unlike the first-generation TPU, which handled only inference, the new chip can also train neural networks.

First Gen TPU

Google created its own TPU to jump “three generations” ahead of the competition when it came to inference performance. The chip seems to have delivered, as Google published a paper last month in which it demonstrated that the TPU could be up to 30x faster than a Kepler GPU and up to 80x faster than a Haswell CPU.

The comparison wasn’t entirely fair, as those chips were a generation or two older, and, more importantly, neither was designed specifically for inference.


Nvidia was quick to point out that its inference-optimized Tesla P40 GPU is already twice as fast as the TPU for sub-10ms latency applications. However, the TPU was still almost twice as fast as the P40 in peak INT8 performance (90 TOPS vs. 48 TOPS).

The P40 also achieved its performance using more than three times as much power, so this comparison wasn’t that fair, either. The bottom line is that right now it’s not easy to compare wildly different architectures to each other when it comes to machine learning tasks.
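
A quick back-of-the-envelope calculation illustrates the efficiency gap, using the commonly cited TDP figures (roughly 75W for the first-gen TPU and 250W for the Tesla P40; those numbers come from the chips' spec sheets rather than this article, so treat them as assumptions):

    # Rough INT8 performance-per-watt comparison. The TDP figures are
    # assumptions (commonly cited: ~75 W for the first-gen TPU, 250 W for
    # the Tesla P40), not numbers from the article.
    tpu_tops, tpu_watts = 90, 75
    p40_tops, p40_watts = 48, 250

    tpu_tops_per_watt = tpu_tops / tpu_watts  # 1.2 TOPS/W
    p40_tops_per_watt = p40_tops / p40_watts  # ~0.19 TOPS/W

    print(f"TPU: {tpu_tops_per_watt:.2f} TOPS/W")
    print(f"P40: {p40_tops_per_watt:.2f} TOPS/W")
    print(f"Ratio: {tpu_tops_per_watt / p40_tops_per_watt:.1f}x")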

Cloud TPU Performance

In last month’s paper, Google hinted that a next-generation TPU could be significantly faster if certain modifications were made. The Cloud TPU seems to have received some of those improvements. It’s now much faster, and it can also do floating-point computation, which means it’s suitable for training neural networks, too.

According to Google, the chip can achieve 180 teraflops of floating-point performance, which is six times more than Nvidia’s latest Tesla V100 accelerator for FP16 half-precision computation. Even when compared against Nvidia’s “Tensor Core” performance, the Cloud TPU is still 50% faster.
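
Both ratios follow from Nvidia's published V100 figures, which don't appear in this article and are assumed here: roughly 30 teraflops of standard FP16 and 120 teraflops through the Tensor Cores.

    # Sanity-checking the article's ratios against Nvidia's published V100
    # figures (assumed: ~30 TFLOPS standard FP16, 120 TFLOPS Tensor Core).
    cloud_tpu_tflops = 180
    v100_fp16_tflops = 30
    v100_tensor_tflops = 120

    print(cloud_tpu_tflops / v100_fp16_tflops)    # 6.0, "six times more"
    print(cloud_tpu_tflops / v100_tensor_tflops)  # 1.5, i.e. 50% faster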

Google made the Cloud TPU highly scalable and noted that 64 units can be put together to form a “pod” with a total performance of 11.5 petaflops of computation for a single machine learning task.
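
The pod figure is simply the per-unit number scaled up:

    # A pod is 64 Cloud TPUs at 180 TFLOPS each.
    pod_tflops = 64 * 180
    print(pod_tflops)         # 11520 TFLOPS
    print(pod_tflops / 1000)  # 11.52, which Google rounds to 11.5 petaflops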

Strangely enough, Google hasn’t given numbers for the Cloud TPU’s inference performance yet, though it may reveal them in the near future. Power consumption also went unmentioned, unlike with the first-generation TPU.

Cloud TPUs For Everyone

Up until now, Google has kept its TPUs to itself, likely because it was still experimental technology and the company wanted to first see how it fared in the real world. However, the company will now make the Cloud TPUs available to all of its Google Compute Engine customers. Customers will be able to mix and match Cloud TPUs with Intel CPUs, Nvidia GPUs, and the rest of its hardware infrastructure to optimize their own machine learning solutions.

It almost goes without saying that the Cloud TPUs support the TensorFlow machine learning software library, which Google open sourced in 2015.
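
The practical upshot is portability: a model written against the TensorFlow API shouldn't have to change to target the new hardware. Here's a minimal sketch in the TensorFlow 1.x API of the era; the TPU-specific runtime details weren't public at the time of writing, so device targeting is omitted:

    import tensorflow as tf

    # A toy dense-matmul graph, the kind of workload TPUs accelerate.
    # The same graph definition already runs unchanged on CPUs and GPUs;
    # the Cloud TPU becomes one more backend for it.
    a = tf.random_normal([1024, 1024])
    b = tf.random_normal([1024, 1024])
    c = tf.matmul(a, b)

    with tf.Session() as sess:
        print(sess.run(c).shape)  # (1024, 1024)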

Google will also donate access to 1,000 Cloud TPUs to top researchers under the TensorFlow Research Cloud program to see what people do with them.

Update, 5/18/17, 7:52am PT: Fixed typo.

Comments
  • redgarl
    Nvidia is selling shovelware. They are selling the image of a breakthrough while they are far from being key players outside the GPU market.
  • mavikt
    But, can it play Crysis?
  • bit_user
    Anonymous said:
    Nvidia is selling shovelware. They are selling the image of a breakthrough while they are far from being key players outside the GPU market.
    Well, they're the leading GPU vendor, and machine learning is earning them quite a bit of revenue. But this was predictable: as long as they're still building general-purpose GPUs, they're going to be at a disadvantage relative to anyone building dedicated machine learning ASICs.

    I wish we knew power dissipation, die sizes, and the fab node. I'll bet Google's new TPU has a smaller die and burns less power. From their perspective, it'd probably be too risky to build dies as huge as the GV100.

    Anonymous said:
    But, can it play Crysis?
    Well, no, it can't run Crysis. Perhaps it can play it, with a bit of practice.