Intel Partners With Baidu on Neural Network Processor for Training
Intel announced Tuesday at the Baidu Create AI developer conference that it's working with Baidu on the development of Intel's Nervana Neural Network Processor for Training, called NNP-T.
Those keeping track will notice a slight name change: the product, codenamed Spring Crest, was announced as the NNP-L 1000 in 2018.
The collaboration involves both hardware and software. On the software side, Naveen Rao, CVP of Intel's AI Products Group, noted that Baidu’s deep learning framework, PaddlePaddle, was the first to integrate Cascade Lake’s DL Boost, Intel’s new instructions to double or triple the performance of FP16 or INT8 AVX-512 vector code.
Intel hopes close collaboration with Baidu on its deep learning training accelerator will ensure the design remains in lock-step with customer demands.
When it launches later this year, NNP-T will be the first dedicated accelerator built from the ground up for training neural networks, at least from one of the big vendors. NNP-T is optimized for high-bandwidth memory, high utilization and distributed workloads. Built on a 16nm process, it succeeds the Lake Crest development vehicle, which the company touted as on par with Nvidia's Volta V100, and Intel claims 3-4 times Lake Crest's performance.
Intel also noted that Baidu uses Intel's Optane DC Persistent Memory and leverages Intel’s Software Guard Extensions (SGX) for a memory-safe Function-as-a-Service (FaaS) computing framework, MesaTEE, for safety-critical applications.
Intel recently also talked about its NNP-I M.2 accelerator for deep learning inference with Sunny Cove cores.
JayNor: "Cascade Lake’s DL Boost, Intel’s new instructions to double or triple the performance of FP16 or INT8 AVX-512 vector code."
DLBoost doesn't do anything for fp16. The Cooper Lake chip is supposed to be the one that adds the bfloat16 avx512 vector operations.
The DLBoost/VNNI AVX512 additions are all fused multiply-add for int8 or int16. The DLBoost intrinsics are all described here, pretty clearly.
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=1651,2196,2195,2197,2204,2195,2196,2197,2204,2205,2206,2213,2214,2215,2222,2223&avx512techs=AVX512_VNNI
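To make the int8 fused multiply-add concrete, here is a rough Python sketch of what the VNNI byte dot-product instruction (VPDPBUSD, exposed as the `_mm512_dpbusd_epi32` intrinsic) computes per 32-bit lane: four unsigned 8-bit values are multiplied by four signed 8-bit values, and the sum of the products is added to a 32-bit accumulator. The helper names `to_int32`, `dpbusd_lane` and `dpbusd` are made up for this illustration, and the code only emulates the arithmetic of the non-saturating form; it says nothing about actual performance.

```python
def to_int32(x):
    """Wrap a Python int to a signed 32-bit two's-complement value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def dpbusd_lane(acc, a_bytes, b_bytes):
    """Emulate one 32-bit lane (hypothetical helper, not an Intel API).

    a_bytes: four unsigned 8-bit values (0..255)
    b_bytes: four signed 8-bit values (-128..127)
    Returns acc + sum(a[i] * b[i]), wrapped to 32 bits (no saturation).
    """
    assert len(a_bytes) == len(b_bytes) == 4
    assert all(0 <= a <= 255 for a in a_bytes)
    assert all(-128 <= b <= 127 for b in b_bytes)
    total = acc + sum(a * b for a, b in zip(a_bytes, b_bytes))
    return to_int32(total)


def dpbusd(acc, a, b):
    """Emulate the 512-bit form: 16 lanes, 64 bytes per input operand."""
    assert len(acc) == 16 and len(a) == len(b) == 64
    return [
        dpbusd_lane(acc[i], a[4 * i:4 * i + 4], b[4 * i:4 * i + 4])
        for i in range(16)
    ]


if __name__ == "__main__":
    # One lane: 10 + (1*5 + 2*-6 + 3*7 + 4*-8) = 10 - 18 = -8
    print(dpbusd_lane(10, [1, 2, 3, 4], [5, -6, 7, -8]))
```

The point of the sketch is that a single instruction folds the multiply, the horizontal add of four products, and the accumulate into one step, which is where the speedup for int8 inference code comes from. Nothing in it touches floating-point data.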