HighPoint's adapter enables GPUDirect Storage: up to 64 GB/s from drive to GPU, bypassing the CPU

(Image credit: HighPoint)

HighPoint on Thursday introduced its Rocket 7638D, a PCIe 5.0 switch card designed to enable Nvidia's GPUDirect interconnect between AI GPUs and NVMe storage devices. The card is meant to speed up AI training and inference workloads when paired with software that fully supports GPUDirect.

The latest GPUs from Nvidia (starting with the A100) support GPUDirect technologies that enable direct data transfers between GPUs and other devices, such as SSDs or network interfaces, bypassing the CPU and system memory to increase performance and free CPU resources for other workloads. However, GPUDirect requires support from both the GPU and a PCIe switch with peer-to-peer (P2P) DMA capability. Not all PCIe Gen5 switches support this feature, which is where switch cards like HighPoint's come into play.
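
For context, Nvidia exposes the storage side of GPUDirect through its cuFile (GPUDirect Storage) library. As a minimal sketch, assuming a Linux box with the CUDA toolkit's cuFile library installed and a GDS-capable driver stack, an application can probe whether the direct path is available at all:

```c
// Minimal probe: can the cuFile (GPUDirect Storage) driver be opened?
// Link against the CUDA toolkit's cufile library (e.g. -lcufile); paths vary per install.
#include <stdio.h>
#include <cufile.h>

int main(void) {
    CUfileError_t status = cuFileDriverOpen();   // attach to the GDS driver
    if (status.err != CU_FILE_SUCCESS) {
        fprintf(stderr, "GPUDirect Storage unavailable (err=%d)\n", (int)status.err);
        return 1;
    }
    puts("GPUDirect Storage driver opened; the P2P DMA path should be usable.");
    cuFileDriverClose();
    return 0;
}
```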

(Image credit: HighPoint)

The Rocket 7638D adapter enables GPUDirect Storage workflows that bypass the host CPU and system RAM entirely and provide predictable bandwidth (up to 64 GB/s) and latency when paired with a compatible software stack, including the operating system, GPU drivers, and filesystem. The card (or rather the systems it enables) is particularly useful in scenarios involving large-scale training datasets that occupy plenty of storage.
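
In practice, a GPUDirect Storage read looks roughly like the hedged sketch below: it registers a file and a GPU buffer with cuFile, then DMAs data from the NVMe drive straight into GPU memory without staging it in host RAM. The dataset path and transfer size are purely illustrative, and error handling is trimmed for brevity:

```c
// Hedged sketch: read a file directly into GPU memory with cuFile (GPUDirect Storage).
#define _GNU_SOURCE                 // for O_DIRECT on Linux
#include <fcntl.h>
#include <unistd.h>
#include <cuda_runtime.h>
#include <cufile.h>

int main(void) {
    const size_t size = 1 << 20;    // 1 MiB transfer, illustrative
    // Hypothetical dataset path; O_DIRECT is required for the direct DMA path.
    int fd = open("/mnt/nvme/train.bin", O_RDONLY | O_DIRECT);

    cuFileDriverOpen();                         // attach to the GDS driver

    CUfileDescr_t descr = {0};
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;
    CUfileHandle_t handle;
    cuFileHandleRegister(&handle, &descr);      // register the file with cuFile

    void *devPtr = NULL;
    cudaMalloc(&devPtr, size);                  // destination buffer in GPU memory
    cuFileBufRegister(devPtr, size, 0);         // pin the buffer for DMA

    // Data moves from the SSD into GPU memory via P2P DMA; host RAM is never staged.
    ssize_t n = cuFileRead(handle, devPtr, size, /*file_offset=*/0, /*dev_offset=*/0);

    cuFileBufDeregister(devPtr);
    cudaFree(devPtr);
    cuFileHandleDeregister(handle);
    cuFileDriverClose();
    close(fd);
    return n < 0;
}
```

The heavy lifting here sits in the driver and filesystem stack; what a card like the Rocket 7638D contributes is the P2P-capable PCIe switch fabric that lets such transfers run at full Gen5 speed.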

Since the Broadcom PEX 89048 switch chip at the heart of the card contains an Arm-based CPU, it is fully self-managed and therefore compatible with both Arm and x86 platforms. The adapter works out of the box with all major operating systems, with no special drivers or additional software required.

The Rocket 7638D includes field-service features such as vital product data (VPD) tracking for matching hardware and firmware revisions, plus a utility that monitors health status and PCIe link performance. These tools simplify troubleshooting and replacement, especially in multi-node or hyperscale installations where hardware tracking matters.

HighPoint did not disclose pricing for its Rocket 7638D switch card; it will probably depend on volumes and other factors.


Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.

  • abufrejoval
    To my knowledge, any GPU can bypass the CPU for storage or network data transfers if it talks PCIe and asks the OS nicely to set up address mappings between the two sides: that administrative part requires some CPU/OS support, but afterwards the devices can just fire away, while bus arbitration keeps any one party from monopolising the bandwidth.

    And again, for all I know, CUDA has supported these facilities pretty much since it became popular for HPC, because nobody there wants to bother with CPU overheads; other GPU software stacks might (or should) do the same for the same reason.

    Reaching back into the older crevasses of my mind, I believe the IBM XGA adapter should have been able to do the same, since it supported bus-master operation. And once GPU and network/storage data are both memory-mapped, it only takes software to make things happen.

    And that software support, which is really just about delegating some of the (CPU-based) OS authority over the PCI(e) bus to a GPU (or any other xPU device that might want it), isn't inherently hardware-dependent (the article mentions the A100); it is a matter of driver and OS support to negotiate and set up that delegation. AFAIK this is rather old, but it mostly targeted network/fabric use for HPC workloads, i.e. MPI transfers over InfiniBand; local storage wasn't popular in HPC for a long time, because it was typically more bother than help until NV-DIMMs or really fast flash storage came along.

    In short, this isn't a HighPoint feature or a result of their work, even if HighPoint might want to create that impression. These are just basic PCIe, OS, and GPU/CUDA facilities that HighPoint supports just like any other PCIe storage vendor would. It's like a window vendor claiming to also support "extra clean air".