Intel Unveils Movidius Myriad X Vision Processing Unit

Movidius, now an Intel company, announced the Myriad X VPU, which the company claims is the world's first Vision Processing Unit (VPU) with a dedicated Neural Compute Engine.

Movidius aims to infuse AI capabilities into everyday devices through several of its initiatives. Deep learning breaks down into two steps. Training consists of feeding data into a machine learning algorithm so it can teach itself to accomplish a task, such as identifying images or words, or analyzing video. This intensive process often requires hefty data center resources that can draw on many forms of compute, such as GPUs, FPGAs, and ASICs. Movidius provides its Fathom Neural Compute Stick to bring limited deep learning capabilities to embedded devices.

Inference, the second step, consists of using the trained model to carry out tasks based on what it has learned, such as identifying objects. The new third-generation Myriad X processor aims to make inference more accessible to edge devices like drones, cameras, robots, and VR and AR devices, among others. Combining vision processing with AI capabilities should enable a vastly expanded set of applications.

The diminutive 16nm VPU (Vision Processing Unit) SoC includes vision accelerators, a Neural Compute Engine, imaging accelerators, and 16 SHAVE vector processors paired with a CPU in one heterogeneous package. The combination of units provides a total of up to 4 TOPS (Trillion Operations Per Second) of performance within a slim 1.5W power envelope.
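As a rough sanity check on those headline figures, the efficiency arithmetic can be sketched in a few lines of Python. The function names are illustrative and not part of any Movidius tool, and the calculation assumes the quoted 4 TOPS peak and the 1.5W envelope apply at the same time, which the announcement does not confirm.

```python
# Back-of-the-envelope efficiency from the quoted peak figures.
# Assumption: the 4 TOPS peak and the 1.5 W envelope hold simultaneously.

def tops_per_watt(peak_tops: float, watts: float) -> float:
    """Peak trillions of operations per second delivered per watt."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return peak_tops / watts


def picojoules_per_op(peak_tops: float, watts: float) -> float:
    """Energy per operation in picojoules.

    watts / (tops * 1e12 ops/s) gives joules per op; multiplying by
    1e12 to convert to picojoules cancels the factor, leaving watts / tops.
    """
    if peak_tops <= 0:
        raise ValueError("throughput must be positive")
    return watts / peak_tops


print(f"{tops_per_watt(4.0, 1.5):.2f} TOPS/W")      # 2.67 TOPS/W
print(f"{picojoules_per_op(4.0, 1.5):.3f} pJ/op")   # 0.375 pJ/op
```

These are peak numbers derived from vendor claims, so sustained efficiency on a real network would likely be lower.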

Movidius' Neural Compute Engine is a fixed-function hardware block that accelerates Deep Neural Network (DNN) inference to more than 1 TOPS. Movidius claims the engine provides industry-leading performance per watt, which is critical for the low-power design. Memory throughput can be a gating factor, so Movidius employs 2.5MB of on-chip memory connected to the Intelligent Memory Fabric to provide up to 450GB/s of bandwidth. The Myriad 2 featured 2MB of on-chip memory, but we aren't privy to its memory bandwidth specifications. In either case, moving data between components costs power, so additional local memory capacity helps keep the chip within its low power envelope.
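One rough way to see why bandwidth can become the gating factor is to divide the quoted bandwidth by the quoted operation rate. The sketch below is only an illustration: it assumes both figures are peak values, that GB means 10^9 bytes, and the helper name is hypothetical.

```python
# Bytes of on-chip bandwidth available per operation at peak rates.
# Assumptions: 450 GB/s and the TOPS figures are peaks; 1 GB = 1e9 bytes.

def bytes_per_op(bandwidth_gb_s: float, tops: float) -> float:
    """Peak memory bytes that can be moved for each operation performed."""
    if tops <= 0:
        raise ValueError("throughput must be positive")
    return (bandwidth_gb_s * 1e9) / (tops * 1e12)


# Neural Compute Engine alone (about 1 TOPS) versus the whole chip (4 TOPS).
print(f"{bytes_per_op(450, 1.0):.4f} bytes/op")   # 0.4500 bytes/op
print(f"{bytes_per_op(450, 4.0):.4f} bytes/op")   # 0.1125 bytes/op
```

Under these assumptions each operation gets well under one byte of fresh data at peak, which suggests that operands have to be reused heavily from local storage for the compute units to stay busy.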

The Myriad X is available in two chip packages that provide different memory accommodations. The MA2085 comes without on-package memory but exposes an interface for external memory. The MA2485 supports 4Gbit of in-package LPDDR4 memory, which is a notable step up from the LPDDR3 supported on the previous-generation model.

Compared to its predecessor, Movidius claims the improved memory subsystem powers up to 10X the peak floating-point computational throughput when the processor is running multiple neural networks.

Twenty Enhanced Vision Accelerators offload some processes, such as optical flow and stereo depth, from the primary compute engine. The stereo depth block can simultaneously process dual 720p camera inputs at up to 180Hz, or six 720p inputs at 60Hz.
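The two stereo depth modes work out to the same aggregate pixel rate, which a quick calculation makes visible. This sketch assumes 720p means 1280x720 pixels; the function name is illustrative.

```python
# Aggregate pixel rate for the two quoted stereo depth modes.
# Assumption: "720p" means 1280 x 720 pixels per frame.

WIDTH, HEIGHT = 1280, 720


def pixel_rate(streams: int, hz: int, width: int = WIDTH, height: int = HEIGHT) -> int:
    """Total pixels per second across all camera streams."""
    return streams * width * height * hz


dual_180 = pixel_rate(streams=2, hz=180)
six_60 = pixel_rate(streams=6, hz=60)

print(f"{dual_180:,} pixels/s")   # 331,776,000 pixels/s
print(f"{six_60:,} pixels/s")     # 331,776,000 pixels/s
print(dual_180 == six_60)         # True
```

Both configurations come to roughly 332 million pixels per second, which is consistent with a single throughput ceiling for the block that can be split across more cameras at a lower frame rate.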

The Myriad 2 featured 12 SHAVE (Streaming Hybrid Architecture Vector Engine) cores optimized for computer vision workloads; the Myriad X increases that to 16. These programmable 128-bit VLIW vector processors complement the other units to boost image signal processing performance. The processor supports up to eight HD camera inputs and can process up to 700 million pixels per second.
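To put the 700 million pixels per second figure in context, the sketch below estimates the frame rate each camera could sustain if eight inputs shared that budget evenly. The announcement does not say whether "HD" means 720p or 1080p, so both are shown, and the even split and the function name are assumptions made for illustration.

```python
# Per-camera frame rate if the quoted pixel budget is shared evenly.
# Assumptions: pixel throughput is the only limit, and "HD" is either
# 1280x720 or 1920x1080 (the source does not specify).

PIXEL_BUDGET = 700_000_000  # pixels per second, as quoted


def max_fps(cameras: int, width: int, height: int, budget: int = PIXEL_BUDGET) -> float:
    """Upper bound on frames per second per camera under an even split."""
    if cameras <= 0:
        raise ValueError("need at least one camera")
    return budget / (cameras * width * height)


print(f"8 x 720p:  {max_fps(8, 1280, 720):.1f} fps")    # 8 x 720p:  94.9 fps
print(f"8 x 1080p: {max_fps(8, 1920, 1080):.1f} fps")   # 8 x 1080p: 42.2 fps
```

Either way, the budget leaves room for all eight sensors to run at typical video frame rates, at least on paper.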

The SoC employs 16 MIPI lanes that support up to eight HD RGB sensors. Hardware encoders support 4K resolutions at 30Hz (H.264/H.265) and 60Hz (M/JPEG). Other connectivity options include the ubiquitous PCIe 3.0 and USB 3.1 interfaces. The PCIe interface allows vendors to incorporate several VPUs in a single device.

Movidius provides the Myriad Development Kit (MDK) for programming. It includes the requisite dev tools, frameworks, and APIs.

The Myriad X joins Intel's growing arsenal of AI-centric solutions, such as Altera FPGAs, Nervana ASICs, and Xeon Phi products. Movidius hasn't shared pricing information, but we'll update as more information becomes available.

  • derekullo
    How's a Terminator going to terminate without his VPU?
  • jasonelmore
    It's interesting Intel is using TSMC to make these chips
  • Tiya__
    It would be interesting to see how many of these cards could be upgraded
  • Tiya__
    It's interesting Intel is using TSMC to make these chips, this is a qa test
  • bit_user
    20115116 said:
    It's interesting Intel is using TSMC to make these chips
    Most likely due to contracts and work that was underway prior to the acquisition. It takes years to build chips.

    My knowledge is a bit dated, but typically when you switch foundries, you need to switch ASIC libraries and potentially even toolchains.
  • bit_user
    20116512 said:
    It's interesting Intel is using TSMC to make these chips, this is a qa test
    I appreciate if the site is trying to QA their forum software, but please don't just copy and paste bits from other comments. Better to post some Lorem ipsum (https://en.wikipedia.org/wiki/Lorem_ipsum#Example_text), "Testing testing 123", "This is a test post", etc.
  • WINTERLORD
    Won't be long and an ASIC for Ethereum will be available with this tech
  • velocityg4
    Neural computing, Deep Learning, AI, Edge Devices, Vision Processing

    Gee Willikers! Can't they cram any more trendy tech buzzwords in there?
  • CAlbertson
    I notice the article says "4 TOPS" and not "4 FLOPS", so we have to assume these are integer operations. Even worse, I assume they are 8-bit integers. This will be OK for deployment of retrained networks to consumer devices like cars and maybe even security cameras, but can you train with only 8-bit integers?

    Also unless they integrate this with Tensorflow or another popular framework I don't see people wanting to code to bare metal.

    My advice is to hold onto your Nvidia Titan GPU cards for now
  • bit_user
    20140588 said:
    can you train with only 8 bit integers?
    It's a 1.5 W part. So, of course it's not designed to replace your Titan Xp for offline training.

    20140588 said:
    Also unless they integrate this with Tensorflow or another popular framework I don't see people wanting to code to bare metal.
    From https://www.movidius.com/myriadx :
    Rapidly port and deploy neural networks in Caffe and Tensorflow formats
    Although, I wish they natively supported Caffe, rather than just supporting its models.