
Google's Big Chip Unveil For Machine Learning: Tensor Processing Unit With 10x Better Efficiency (Updated)

While other companies argue about whether GPUs, FPGAs, or VPUs are better suited for machine learning, Google announced that it has been using its own custom-built Tensor Processing Unit (TPU) for over a year, achieving a claimed 10x increase in efficiency. The comparison is likely against GPUs, which are currently the industry-standard chips for machine learning.

Tensor analysis, an extension of vector calculus, is the basis of Google’s TensorFlow framework for machine learning, which the company recently released as open source.

The new Tensor Processing Units, as you might expect, are designed specifically to do tensor calculations. That means the company can dedicate more of the chip's transistors to doing one thing well, achieving higher efficiency than other types of chips.

Chips of this class are called ASICs (application-specific integrated circuits). They’ve been used, for instance, in wireless modems such as Nvidia’s Icera modem, and in Bitcoin mining rigs, where they deliver orders of magnitude higher efficiency than GPUs and FPGAs.

Movidius said that Google's TPU philosophy is in line with what it has been trying to achieve with its Myriad 2 vision processing unit (VPU): squeezing out more ops/W than is possible with GPUs.

"The TPU used lower precision of 8 bit and possibly lower to get similar performance to what Myriad 2 delivers today. Similar to us they optimized for use with TensorFlow," said Dr. David Moloney, Movidius' CTO. "We see this announcement as highly complementary to our own efforts at Movidius. It appears Google has built a ground-up processor for running efficient neural networks in the server center, while Myriad 2 is positioned to do all the heavy lifting when machine intelligence needs to be delivered at the device level, rather than the cloud," he added.
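To illustrate what the 8-bit precision Moloney mentions can look like in practice, here is a minimal Python sketch of symmetric linear quantization. This is a common generic technique, not a description of how the TPU or Myriad 2 actually works; the function names and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers using one shared scale factor.

    Assumption: symmetric linear quantization, chosen for illustration only.
    Neither Google nor Movidius has described its scheme in this article.
    """
    max_abs = max(abs(v) for v in values)
    # The largest magnitude maps to 127; guard against an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical neural-network weights.
weights = [0.82, -0.41, 0.05, -1.27, 0.63]

q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("8-bit values:", q)
print("largest rounding error:", max_error)
```

Each value now fits in one byte instead of four, at the cost of a rounding error of at most half the scale factor per value. Many neural networks tolerate that loss at inference time, which is why lower precision can translate into more operations per watt.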

Google claimed in a blog post that its TPU could achieve an order of magnitude better efficiency for machine learning, which would be equivalent to about seven years of progress following Moore’s Law. Google said that a TPU board could fit inside a hard disk drive slot in its data centers.
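The "seven years" figure can be checked with a short calculation. The sketch below assumes the common reading of Moore's Law as a doubling every two years; the function name and the doubling period are assumptions for illustration, not figures from Google's post.

```python
import math


def moores_law_years(gain, doubling_period_years=2.0):
    """Years needed to reach `gain` if performance doubles every period.

    Assumption: a two-year doubling period, one common statement of
    Moore's Law. A different period changes the result proportionally.
    """
    return doubling_period_years * math.log2(gain)


years = moores_law_years(10.0)
print(f"A 10x gain takes about {years:.1f} years")
```

A 10x gain requires log2(10), or roughly 3.3, doublings. At two years per doubling that comes to a little under seven years, which is consistent with Google's claim.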

Google also revealed for the first time that the TPUs were used to power not only its Street View product but also AlphaGo in its Go matches against Lee Sedol. When the company previously talked about AlphaGo, it mentioned only CPUs and GPUs, although that was months before the Go matches happened. Google said the TPUs allowed AlphaGo to “think” much faster and to look further ahead between moves.

Google can now use TPUs not only to improve its own products but also to offer that jump in performance to machine learning customers. That could give the company an edge over competitors, which may still offer only GPU-based or perhaps FPGA-based machine learning services.

ASICs, unlike FPGAs, are hard-coded and cannot be reprogrammed in the field. This inflexibility deters many companies from deploying the specialized processors at scale. However, Google indicated that it went from first-tested silicon to a production environment in a mere 22 days. This rapid ramp suggests that Google can develop and deploy other optimized ASICs on an accelerated timeline in the future.

Google’s TPU could now change the landscape for machine learning, as more companies may be interested in following the same path to achieve the same kind of performance and efficiency gains. Google has also been dabbling with quantum computers, as well as the OpenPower and RISC-V chip architectures.

The Intel Xeon family powers 99 percent of the world's data centers, and there are long-running rumors that Google is developing its own CPU to break the Intel stranglehold. The Google TPU may be a precursor of more to come. Google's well-publicized exploratory work with other compute platforms spurred Intel to begin offering customized Xeons for specific use cases. It will be interesting to watch Google's future efforts in designing its own chips, and their impact on the broader market.

Updated, 5/19/2016, 10:58am PT: The post was updated to add Movidius' comments on Google's TPU announcement.

Lucian Armasu is a Contributing Writer for Tom's Hardware. You can follow him at @lucian_armasu. 
