Wave Computing Announces Early Access Program For Its Super-Fast Machine Learning Appliance

Wave Computing, a startup trying to tackle the growing and increasingly interesting market for specialized machine learning chips, announced an Early Access Program for its new compute appliance. The company said that the appliance can speed up the training of neural networks by up to 1,000x (presumably compared to CPUs).

Wave Computing Dataflow Architecture

Wave Computing wants to do away with the traditional CPU and GPU pairing for machine learning, where the CPU manages the data flows and the GPUs act as accelerators. That arrangement can create bottlenecks, which is why Wave has built a complete system in the form of a compute appliance, with potentially much better performance than CPU/GPU solutions.


As a single unified system, Wave’s compute appliance can better exploit the data and model parallelism present in deep learning models, such as convolutional and recurrent neural networks. Convolutional neural networks try to replicate how the visual cortex of an animal or human works, while recurrent neural networks can be useful for natural language processing because they can “remember” previous information.
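To make "data parallelism" concrete, here is a minimal sketch in plain Python. It is a toy illustration rather than anything specific to Wave's hardware or software: the one-parameter model, the function names, and the sample batch are all assumptions made for the example. Each hypothetical "worker" computes a gradient on its own shard of the batch, and the per-shard gradients are then averaged.

```python
# Toy illustration of data parallelism (not Wave's implementation).
# Model: y = w * x, loss: mean squared error.

def gradient(w, shard):
    """Gradient of the mean squared error with respect to w on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_gradient(w, batch, num_workers):
    """Split the batch into equal shards and average the per-shard gradients.

    Assumes len(batch) is divisible by num_workers, so every shard has the
    same size and the average matches the full-batch gradient.
    """
    shard_size = len(batch) // num_workers
    shards = [batch[i * shard_size:(i + 1) * shard_size]
              for i in range(num_workers)]
    return sum(gradient(w, shard) for shard in shards) / num_workers

# Hypothetical training data drawn from y = 2x.
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

full = gradient(0.5, batch)
sharded = data_parallel_gradient(0.5, batch, num_workers=2)
print(full, sharded)
```

With equal-sized shards, the averaged gradient matches the one computed over the whole batch, which is why the work can be spread across many processing units. Model parallelism, by contrast, splits the model itself across units and is not shown here.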

Wave’s compute appliance can deliver 2.9 PetaOps/second of performance via its 256,000 interconnected Processing Elements (PEs), 2TB of bulk memory, Hybrid Memory Cube (HMC) memory, and up to 32TB of storage.
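For a rough sense of scale, the quoted figures can be divided out to get an average throughput per Processing Element. This is back-of-the-envelope arithmetic only: it assumes "Peta" means 10^15, that 256,000 is an exact count, and that the headline figure is spread evenly across all PEs, none of which the announcement confirms.

```python
# Back-of-the-envelope arithmetic on the vendor's quoted figures.
# Assumptions: decimal prefixes, exactly 256,000 PEs, evenly divided throughput.
PETA = 10**15
GIGA = 10**9

total_ops_per_second = 2.9 * PETA   # quoted "2.9 PetaOps/second"
processing_elements = 256_000       # quoted "256,000 interconnected PEs"

ops_per_pe = total_ops_per_second / processing_elements
print(f"{ops_per_pe / GIGA:.1f} GigaOps/second per PE")
```

Under those assumptions, each PE would account for roughly 11 GigaOps per second of the headline number.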

The company said that this type of architecture is what enables high scalability and performance, which can make it useful to organizations that develop deep learning models with machine learning frameworks such as TensorFlow. This can include industries such as retail, social media, entertainment, gaming, financial services, and the automotive market.

Early Access Program

Wave’s Early Access Program will give a select number of data scientists the ability to test and work with the company’s compute appliances before they're more widely available for sale in Q4 2017.

According to Wave, one U.S. car manufacturer reported that its customers created 4.22 petabytes (PB) of data in 2016, and that each self-driving car should generate 2PB of data per year by 2020. Using this much data to train neural networks could require weeks, but Wave said its appliance could shorten that to a few days.

“This is an exciting time for the machine learning industry as big data and analytics offer insight-driven enterprises the ability to change how they do business,” said Derek Meyer, CEO of Wave Computing.

“Our new Wave compute appliance offers a breakthrough for companies wanting to speed their development and deployment of machine learning applications. By providing data scientists with early access to our dataflow solution, they will have the necessary performance to train neural networks in times never before imaginable,” he added.

Although Wave’s appliance currently supports only TensorFlow, other machine learning frameworks such as Microsoft’s Cognitive Toolkit (CNTK), MXNet, and more will be supported in the future.

1 comment
  • genz
    Funny enough, this reads like the APU theory AMD was jumping on (combine CPU/GPU to work together with low latency, just like their core) then the HBM theory AMD was jumping on (give everything massive throughput to the same memory space so that they don't have to swap memory from GPU space to CPU space) then the Infinity Fabric thing AMD was jumping on (connect everything with fat pipes in a web instead of a ring bus or other bottlenecked system).

    Just an observation. Cool case too.