Fujitsu Creates Highly Parallel Deep Learning Software For GPU Networks

Fujitsu announced that it has developed new software technology that can significantly increase the efficiency of deep learning on highly parallel GPU-based systems.

The Problem With Deep Learning On Multiple GPUs

Over the past few years, there has been an explosion of interest in deep learning as a better way to train machines to do certain tasks. Because of this, GPUs, which are well suited for processing many pieces of similar data simultaneously, have also become the centerpiece technology for deep learning development.

However, even with GPUs, it still takes too much time to train a new deep learning model on large amounts of data. One of the issues is that deep learning and other GPU-related operations don’t scale well across multiple GPUs.

The conventional way to accelerate deep learning beyond what a single GPU can do is to link multiple computers in parallel and share the data across them. However, as more machines are added, sharing data between them becomes progressively harder, and total performance starts to show diminishing returns.

Fujitsu’s Efficient Data Sharing Across Multiple Machines

Fujitsu Laboratories said it has created new software technology that can more efficiently share the data between computers and applied it to Caffe, a popular open source deep learning framework.

Fujitsu’s new technology can automatically control the priority order for data transmission. This technique is used to send the data that is needed for the next learning process ahead of time to all the machines. This way, some of the delay that existed in alternative solutions is removed, and the operations can be performed in a shorter amount of time.
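
To make the idea of prioritized transmission concrete, here is a minimal sketch in Python. It is not Fujitsu's implementation, and the function name `send_order` and its parameters are hypothetical. It assumes that backpropagation produces results for the last layer first, that the next learning step needs the first layer's data first, and that the network link is slower than the computation, so a backlog of unsent data builds up.

```python
import heapq

def send_order(num_layers, ticks_per_send=2):
    """Simulate the order in which per-layer data is transmitted.

    Assumptions (illustrative only, not from Fujitsu):
    - one layer's data becomes ready per tick, last layer first;
    - the link can send one layer every `ticks_per_send` ticks;
    - a lower layer index means the data is needed sooner next iteration.
    """
    pending = []                 # min-heap of layers that are ready to send
    order = []                   # layers in the order they were sent
    next_ready = num_layers - 1  # backpropagation finishes the last layer first
    tick = 0
    while len(order) < num_layers:
        if next_ready >= 0:
            heapq.heappush(pending, next_ready)
            next_ready -= 1
        # When the link is free, send the ready layer that is needed soonest.
        if tick % ticks_per_send == 0 and pending:
            order.append(heapq.heappop(pending))
        tick += 1
    return order

print(send_order(4))  # prints [3, 1, 0, 2]
```

A plain first-in, first-out queue would send the layers as 3, 2, 1, 0, leaving the first layer for last. In this toy model, the priority queue moves layer 0 ahead of layer 2, so the next step can begin before all of the data has arrived.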

Fujitsu also improved how its software deals with various data sizes, by automatically applying the optimal operational method and thus minimizing the total operation time.

Fujitsu Tests Its New Technology

The company then tested the technology on AlexNet, an image classification neural network. It achieved 14.7x the performance of a single GPU when using 16 GPUs, and 27x when using 64 GPUs.

No software is perfectly parallel today, which is why we’re still seeing “only” a 27x improvement when using 64 GPUs, instead of a 64x improvement. However, the performance of Fujitsu software still showed an improvement in learning speeds of 46% for 16 GPUs and 71% for 64 GPUs, compared to more conventional software.
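
The figures above can be turned into a rough parallel efficiency estimate with a few lines of Python. This is a back-of-the-envelope sketch: the helper names are hypothetical, and the implied baseline assumes the reported 46% and 71% gains are ratios of training throughput, which is an interpretation rather than something Fujitsu stated.

```python
# Figures reported in the article: GPU count -> speedup over a single GPU.
reported_speedup = {16: 14.7, 64: 27.0}
# Reported gain over conventional software, as a fraction (46% and 71%).
reported_gain = {16: 0.46, 64: 0.71}

def parallel_efficiency(speedup, gpus):
    """Fraction of ideal linear scaling that was actually achieved."""
    return speedup / gpus

def implied_baseline_speedup(speedup, gain):
    """Speedup the conventional software would have reached, assuming the
    gain is a throughput ratio (an assumption, not a published figure)."""
    return speedup / (1.0 + gain)

for gpus, speedup in reported_speedup.items():
    eff = parallel_efficiency(speedup, gpus)
    base = implied_baseline_speedup(speedup, reported_gain[gpus])
    print(f"{gpus} GPUs: efficiency {eff:.1%}, implied baseline {base:.1f}x")
```

Under those assumptions, the script reports about 91.9% efficiency at 16 GPUs and 42.2% at 64 GPUs, with implied conventional speedups of roughly 10.1x and 15.8x.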

The more efficient software can help academics, governments and other companies significantly shorten the time it takes them to train a deep learning algorithm for various research purposes and product development.

Fujitsu aims to start commercializing this technology as part of its AI technology, Human Centric AI Zinrai, by the end of 2016.

This thread is closed for comments
  • cats_Paw
    But learning what?
  • virtualban
    Learning the fractal base of reality, and migrating over to the mathematical universe, like Her.
    Maybe into No Man's Sky :D
  • anbello262
    I'm really interested in hearing a bit more of your logic/beliefs, virtualban. Have seen you in several threads, and don't really like what you say, but that might change if you explain yourself a bit more.
    Yours could end up being some very interesting words, so maybe you could elaborate a bit more? I know a bit about what you say about fractals, now I'm interested in 'Her'.

    Also, about deep learning, I'm really looking forward to an AI program that can reliably separate different tracks/instruments in music/audio. The ultimate karaoke program, let's say. A program that can extract the voice (even better if it can also understand/subtitle the lyrics), and give you 2 separate tracks (the instrumental and the 'a capella'), without all the artifacts and distortion that current 'normal algorithm' programs cause.