Google Deploys TPU To Google Cloud Service

Google announced that it has released the Cloud TPU onto the Google Cloud Platform. The deployment is still in beta, but customers of Google’s cloud computing service can now use the company’s dedicated AI processing hardware for their own machine learning developments.

Designed to deliver maximum performance for machine learning-related workloads, Google’s Cloud TPU can be used for both AI inference and training. The machines can run code built with Google’s TensorFlow framework, an open-source software library whose high-level APIs allow workloads to be split easily across anywhere from a single TPU to hundreds of them. According to Google, TensorFlow implementations of popular machine learning algorithms can be sped up dramatically, without code changes, simply by allocating more TPUs to them.
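The scaling Google describes rests on data parallelism: a global batch of examples is sharded across TPU replicas, each of which processes its slice in parallel. A minimal sketch of that splitting step, in plain Python with illustrative names (this is the general technique, not Google's API):

```python
# Illustrative sketch of data-parallel batch splitting, the mechanism
# that lets a TensorFlow workload spread across many TPU replicas.
# Function and variable names here are hypothetical.

def split_batch(batch, num_replicas):
    """Divide one global batch into equal per-replica shards."""
    shard_size = len(batch) // num_replicas
    return [batch[i * shard_size:(i + 1) * shard_size]
            for i in range(num_replicas)]

# A global batch of 1024 examples spread over 8 replicas:
shards = split_batch(list(range(1024)), 8)
print(len(shards), len(shards[0]))  # 8 shards of 128 examples each
```

Doubling the number of replicas halves each shard, which is why adding TPUs can speed up the same code without changing it.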

What’s more, Google’s TPUs are available for virtual machine access. This means customers can have exclusive use of a TPU for real-time development, instead of only being able to submit jobs to a TPU compute farm.

Adding specialized machine learning hardware differentiates Google’s cloud computing service from those of Amazon and Microsoft. Customers are looking for the cheapest and fastest way to execute their workloads: if TPUs let a job finish on fewer machines in less time than it would on conventional CPUs or GPUs, that translates to lower costs for the customer. Access to Cloud TPUs starts at $6.50 per Cloud TPU per hour, and a single Cloud TPU offers up to 180 teraflops of floating-point computation power.
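Those two figures give a rough peak-throughput price point. A back-of-the-envelope calculation using only the numbers quoted above:

```python
# Back-of-the-envelope cost math from the figures in the article:
# $6.50 per Cloud TPU per hour, up to 180 TFLOPS per Cloud TPU.
price_per_hour = 6.50
peak_tflops = 180

# Dollars per peak TFLOPS-hour:
cost_per_tflops_hour = price_per_hour / peak_tflops
print(round(cost_per_tflops_hour, 4))  # ~0.0361 dollars
```

Real workloads rarely sustain peak throughput, so this is a floor on effective cost, not an estimate of it.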

Google announced its “Cloud” TPU in mid-2017. The impetus behind what the company calls its tensor processing unit was the unsuitability of conventional CPUs for the computing demands of machine learning. The TPU began life as a custom ASIC designed for massive 8-bit integer throughput at low power. Integer-only math limited it to inference computation, however. In machine learning, inference is the recognition stage: for an image recognition model, inference is what occurs when the model determines whether or not a picture resembles what it’s looking for.
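The reason an integer-only chip can serve inference at all is quantization: a trained model's floating-point weights are mapped onto 8-bit integers with a scale factor, traded for a small loss of precision. A minimal sketch of the general technique (this illustrates the concept, not Google's actual scheme):

```python
# Minimal sketch of 8-bit integer quantization, the technique that
# lets an inference-only chip avoid floating-point hardware.
# Hypothetical helper names; not Google's implementation.

def quantize(values, scale):
    """Map floats to int8 range by dividing by a scale and rounding."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def dequantize(ints, scale):
    """Approximately recover the original floats."""
    return [i * scale for i in ints]

weights = [0.52, -1.30, 0.07]
scale = 1.30 / 127           # map the largest magnitude onto 127
q = quantize(weights, scale)
print(q)  # [51, -127, 7]
```

Dequantizing `q` gives values within about 1% of the originals here, which is typically close enough for recognition tasks while letting the chip do all its arithmetic in int8.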

With the second generation of the TPU, Google added floating-point computation, which allowed the chip to be used for training as well. Training is the more computationally intensive process of building a model that can eventually recognize, reliably, what it needs to. Continuing the image recognition example, training would be the process of feeding numerous pictures to the model and teaching it to look for the specific similarities between them that indicate a correct picture. To create the Cloud TPU, four second-generation TPU chips, along with 64GB of high-bandwidth memory, are packed onto one server board. Google connects multiple Cloud TPUs together on a rack to form a TPU “Pod.”
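The board-level numbers line up with the per-chip figure of 45 TFLOPS cited in the reader comment below:

```python
# Aggregate figures for the Cloud TPU board described above, assuming
# 45 TFLOPS per second-generation TPU chip (per the comment below).
chips_per_board = 4
tflops_per_chip = 45
hbm_per_board_gb = 64

board_tflops = chips_per_board * tflops_per_chip
print(board_tflops)  # 180, matching the quoted per-Cloud-TPU figure
```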

  • bit_user
    A single Cloud TPU offers up to 180 teraflops of floating-point computation power.
    You quoted the wrong article. This (,35370.html) dives into considerably more depth, revealing that each TPU2 only delivers 45 TFLOPS. However, they can be arranged in quad-processor configurations, with each board providing 180 TFLOPS.

    Access to Cloud TPUs start at the price of $6.5 / Cloud TPU / hour
    I think this is about twice what you can find for a single Nvidia V100, which is also faster (110 TFLOPS/GPU).