Huawei claims new software can 'create an analogue AI chip 1000 times faster than Nvidia’s chips' — open source Flex:ai software designed to boost AI-chip utilization
The platform promises to pool GPUs and NPUs and raise average utilization by about 30%.
Huawei has introduced Flex:ai, an open-source orchestration tool designed to raise the utilization rate of AI chips in large-scale compute clusters. Announced on Friday, November 21, the platform builds on Kubernetes and will be released through Huawei’s ModelEngine developer community. It arrives amid continued U.S. export restrictions on high-end GPU hardware and reflects a growing shift inside China toward software-side efficiency gains as a stopgap for constrained silicon supply.
Beyond the claim that it can help China “...create an analogue AI chip 1000 times faster than Nvidia’s chips,” Huawei says Flex:ai can raise average utilization by around 30%. It reportedly does this by slicing individual GPU or NPU cards into multiple virtual compute instances and orchestrating workloads across heterogeneous hardware types.
Smaller tasks that might otherwise underuse a full accelerator are stacked alongside one another, while larger models that exceed the capacity of a single device can span multiple cards. The tool includes a smart scheduler, dubbed Hi Scheduler, that redistributes idle resources across nodes in real time, automatically reassigning compute to wherever AI workloads are queued.
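Huawei has not yet released the code, but the behavior described above — carving cards into fractional slices and packing queued jobs onto whatever capacity is idle — resembles first-fit bin packing. A minimal sketch of that idea (all names and the 0.25/0.5 slice granularity are hypothetical, not Flex:ai's actual design):

```python
from dataclasses import dataclass, field

@dataclass
class Card:
    """A physical accelerator carved into fractional virtual slices."""
    name: str
    capacity: float = 1.0           # a whole card is 1.0
    used: float = 0.0
    jobs: list = field(default_factory=list)

    def free(self) -> float:
        return self.capacity - self.used

def schedule(jobs, cards):
    """First-fit packing of fractional jobs onto cards.

    `jobs` is a list of (job_name, fraction) pairs, where fraction is
    the share of one card the job needs (e.g. 0.25 = a quarter-card
    slice). Returns the names of jobs that could not be placed and
    would stay queued until capacity frees up.
    """
    pending = []
    for name, frac in jobs:
        for card in cards:
            if card.free() >= frac:
                card.used += frac
                card.jobs.append(name)
                break
        else:
            pending.append(name)
    return pending

cards = [Card("npu-0"), Card("npu-1")]
jobs = [("finetune-a", 0.5), ("infer-b", 0.25), ("infer-c", 0.25),
        ("train-d", 0.75), ("infer-e", 0.5)]
waiting = schedule(jobs, cards)
```

In this toy run, the three small jobs stack onto `npu-0`, the 0.75-card job lands on `npu-1`, and the final job waits — the stacking behavior the article attributes to the tool. A production scheduler like Hi Scheduler would additionally rebalance placements as jobs finish, which this static sketch omits.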
Flex:ai’s architecture builds on existing open-source Kubernetes foundations but extends them in ways that are still uncommon across open deployments. Kubernetes already supports device plugins to expose accelerators, and schedulers such as Volcano, or frameworks like Ray, can perform fractional allocation and gang scheduling. Flex:ai appears to unify these capabilities at a higher layer while integrating support for Ascend NPUs alongside standard GPU hardware.
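For context, stock Kubernetes device plugins hand out accelerators at whole-card granularity only — a pod requests an integer number of devices, as in the (hypothetical image name aside) standard manifest below. Fractional slicing of the kind Flex:ai describes requires an extension scheduler or a vendor virtualization layer on top:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: training-job
spec:
  containers:
    - name: worker
      image: registry.example.com/trainer:latest   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1   # device plugins allocate whole cards; no 0.25 here
```

Extension resource names for fractional shares vary by plugin and vendor, which is part of why a unifying layer across heterogeneous GPU and NPU hardware is the notable claim here.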
The launch resembles functionality offered by Run:ai, an orchestration platform acquired by Nvidia in 2024, which enables multi-tenant scheduling and workload pre-emption across large GPU clusters. Huawei’s version, at least on paper, makes similar claims but does so with a focus on open-source deployment and cross-accelerator compatibility. That may give it broader relevance in clusters built around Chinese silicon, particularly those using Ascend chips.
The open-source code has not yet been released, and Huawei has not published documentation or benchmarks. When it does become available, key questions will include the granularity of slicing, how Flex:ai interacts with standard Kubernetes schedulers, and, crucially, whether it supports widely used GPU types via standard plugins. The company has said that researchers from Shanghai Jiao Tong University, Xi’an Jiaotong University, and Xiamen University contributed to the tool’s development.

Luke James is a freelance writer and journalist. Although his background is in law, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory.