The Rise of China GPU Makers: AI and Tech Sovereignty Drive New GPU Entrants

(Image credit: Moore Threads)

The number of GPU startups in China is extraordinary as the country tries to gain AI prowess as well as semiconductor sovereignty, according to a new report from Jon Peddie Research. More broadly, the number of GPU makers worldwide has grown in recent years as demand for artificial intelligence (AI), high-performance computing (HPC), and graphics processing increased at an unprecedented rate. When it comes to discrete graphics for PCs, AMD and Nvidia maintain the lead, whereas Intel is trying to catch up.

18 GPU Developers

Dozens of companies developed graphics cards and discrete graphics processors in the 1980s and 1990s, but cut-throat competition for the highest performance in 3D games drove the vast majority of them out of business. By 2010, only AMD and Nvidia offered competitive standalone GPUs for gaming and compute, whereas the rest focused either on integrated GPUs or GPU IP.

In the mid-2010s, the number of China-based PC GPU developers began to increase rapidly, fueled by the country's push for tech self-sufficiency as well as the rise of AI and HPC as high-tech megatrends.

In total, there are 18 companies developing and producing GPUs, according to Jon Peddie Research. Two of them develop SoC-bound GPUs primarily with smartphones and notebooks in mind, six are GPU IP providers, and 11 develop GPUs for PCs and datacenters, including AMD, Intel, and Nvidia, which design the graphics cards that end up on our list of the best graphics cards.

In fact, if we added other China-based companies like Biren Technology and Tianshu Zhixin to the list, there would be even more GPU designers. However, Biren and Tianshu Zhixin are solely focused on AI and HPC for now, so JPR does not consider them GPU developers. 

PC             DC               IP                        SoC
AMD            Biren            Arm                       Apple
Bolt           Tianshu Zhixin   DMP                       Qualcomm
Innosilicon                     Imagination Technologies
Intel                           Think Silicon
Jingjia                         Verisilicon
MetaX                           Xi-Silicon
Moore Threads
Nvidia
SiArt
Xiangdixian
Zhaoxin

China Wants GPUs

As the world's second-largest economy, China inevitably competes with the U.S. and other developed countries in pretty much everything, including technology. China has done a lot to lure engineers from around the world and to make it worthwhile to establish chip design startups in the country. In fact, hundreds of new IC design houses emerge in China every year, developing everything from tiny sensors to complicated communication chips and thereby reducing the country's dependence on Western suppliers.

(Image credit: Moore Threads)

But to really jump on the AI and HPC bandwagon, China needs CPUs, GPUs, and special-purpose accelerators. When it comes to computing, Chinese companies have no realistic chance of overtaking the long-time CPU and GPU market leaders any time soon. Yet, it is arguably easier, and perhaps more fruitful, to develop and produce a decent GPU than to try to build a competitive CPU.

"AI training was the big motivator [for Chinese GPU companies], and avoidance of Nvidia's high prices, and (maybe mostly) China's desire for self-sufficiency," said Jon Peddie, the head of JPR. 

GPUs are inherently parallel: they contain loads of compute units, some of which can serve as redundancy, which makes it easier to get a GPU up and running (assuming per-transistor costs are relatively low and overall yields are decent). Also, because GPU workloads are fundamentally parallel, it is easier to scale them out across more chips. Keeping in mind that China-based SMIC does not offer production nodes as advanced as TSMC's, this way of scaling performance looks good enough. In fact, even if Chinese GPU developers lose access to TSMC's advanced nodes (N7 and below), at least some of them could still produce simpler GPU designs at SMIC and address the AI/HPC and/or gaming/entertainment markets.
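
To make that redundancy point concrete, here is a minimal sketch (our illustration, not code from any of the companies mentioned): a grid-stride SAXPY kernel in CUDA computes the same result regardless of how many compute units the silicon actually exposes, which is precisely the property that lets a vendor fuse off defective units and still ship the die.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Grid-stride SAXPY: each thread handles independent elements, so the
// result is identical whether the GPU exposes a handful of compute units
// (SMs) or hundreds -- a die with a few fused-off units just runs slower.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x)
        y[i] = a * x[i] + y[i];
}

int main()
{
    const int n = 1 << 20;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    // The launch geometry is decoupled from both the data size and the
    // physical SM count; the grid-stride loop absorbs the difference.
    saxpy<<<256, 256>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %.1f (expected 4.0)\n", y[0]);
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

This is also why harvested dies with some units disabled can be sold as lower-tier parts: well-written GPU code never assumes a particular number of compute units.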

From China's perspective, AI- and HPC-capable GPUs are arguably even more important than CPUs, since AI and HPC enable all-new applications, such as autonomous vehicles and smart cities, as well as advanced conventional arms. The U.S. government, of course, restricts exports of supercomputer-bound CPUs and GPUs to China in a bid to slow down or even constrain the development of advanced weapons of mass destruction. But a fairly sophisticated AI-capable GPU can power an autonomous killer drone, and drone swarms represent a formidable force, for instance.

GPU Microarchitecture Is Relatively Easy, Hardware Design Is Expensive

Meanwhile, it should be noted that while there are a bunch of GPU developers, only two can actually build competitive discrete GPUs for PCs. That is perhaps because it is relatively easy to develop a GPU architecture, but it is truly hard to properly implement it and to design proper drivers.

CPU and GPU microarchitectures sit at the intersection of science and art. They are sets of sophisticated algorithms that a rather small group of engineers can create, but they might take years to develop, says Peddie.

"[Microarchitectures] get done on napkins and white boards," said Peddie. "[As for costs] if it is just the architects themselves, that [team] can be as low as one person to maybe three – four. [But] architecture of any type, buildings, rocket ships, networks or processors is a complicated chess game. Trying to anticipate where the manufacturing process and standards will be five years away, where the cost-performance tradeoffs are, what features to add and what to drop or ignore is very tricky and time-consuming work. […] The architects spend a lot of time in their head running what-if scenarios — what if we made the cache 25% bigger, what if we had 6,000 FPUs, should we do a PCIe 5.0 I/O will it be out in time." 

(Image credit: Nvidia)

Since microarchitectures can take years to develop and require talented designers, in a world where time to market is everything, many companies license an off-the-shelf microarchitecture or even silicon-proven GPU IP from companies like Arm or Imagination Technologies. For example, Innosilicon, a contract developer of chips and physical IP, licenses GPU microarchitecture IP from Imagination for its Fantasy GPUs. Another China-based GPU developer uses a PowerVR architecture from Imagination as well. Meanwhile, Zhaoxin uses a repeatedly reworked GPU microarchitecture it acquired from Via Technologies, which in turn inherited it from S3 Graphics.

The cost of developing a microarchitecture varies, but it is relatively low compared to the cost of physically implementing a modern high-end GPU.

For years, Apple and Intel, both companies with plenty of engineering talent, relied on Imagination for their GPU designs (Apple still does to a certain extent). MediaTek and other smaller SoC suppliers rely on Arm. Qualcomm used ATI/AMD technology for an extended period, and Samsung adopted AMD IP after several years of trying to design its own graphics engine.

Two of the new Chinese companies have hired ex-AMD and ex-Nvidia architects to start their GPU businesses, and another two use Imagination. Learning the skills of being an architect, what to worry about, and how to find a fix is a very time-consuming process.

"If you can go to a company that already has a design and have been designing for a long time, you can save a boatload of time and money – and time to market is everything," said the head of Jon Peddie Research. "There are just so many gotchas. Not every GPU designed by AMD or Nvidia has been a winner. [But] a good design lasts a couple of generations with tweaks." 

Hardware implementation and software development are prohibitively expensive on new production nodes. International Business Times estimates that the design costs for a fairly complex device made using 5nm-class technology exceed $540 million. These costs will triple at 3nm, implying something north of $1.6 billion.

"If you include layout and floor plan, simulation, verification, and drivers then the [GPU developer] costs and time skyrocket," explained Peddie. " The hardware design and layout is pretty straight forward: get one trace wrong and you can spend months tracking it down." 

There are just a few companies in the world that can develop a chip with the complexity of modern gaming or compute GPUs from AMD and Nvidia (46 billion to 80 billion transistors), yet China-based Biren could do something similar with its BR104 and BR100 devices (we speculate that the BR104 packs some 38.5 billion transistors).

Thoughts 

Despite the prohibitive costs, eight of the 11 PC/datacenter GPU designers are from China, which speaks for itself. Perhaps we won't see a competitive discrete gaming GPU from anyone except the huge American companies in the near future. That's partly because it's hard and time-consuming to develop a GPU, and to a large degree because the hardware implementation of such high-complexity GPUs is prohibitively expensive. Whether or not China can field competitive entrants remains to be seen, but any failure won't stem from a lack of trying.

Anton Shilov
Freelance News Writer

Anton Shilov is a Freelance News Writer at Tom’s Hardware US. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.

  • bit_user
    Thanks for the table, but it would've been interesting to see which GPU makers are using which IP. As mentioned in the article, at least a couple of the Chinese companies are using Imagination's. Another useful dimension that could've been added was to indicate the location of each. For instance, Think Silicon is based in Greece. Think Silicon was bought by Synopsys? And Verisilicon owns the IP formerly known as Vivante? I was surprised to see how quickly Vivante came onto the scene, about a decade ago, and then it just seemed almost to go dormant. It seemed weird that more wasn't invested in continuing to build it up.

    Regarding the section about applications, people should read "smart cities" mostly as "mass surveillance". The term means potentially more than that, but surveillance is a foundational element of smart cities, and AI-related technologies are key to that.

    while there are a bunch of GPU developers, only two can actually build competitive discrete GPUs for PCs. That is perhaps because it is relatively easy to develop a GPU architecture, but it is truly hard to properly implement it and to design proper drivers.
    I disagree. I think it's actually difficult to build competitive GPU hardware, because you need to do scaling really well, you need really good efficiency, and you need the right kinds and amounts of hardware-assist, to ensure the architecture isn't bottlenecked. Not to trivialize "drivers" (which encompasses a lot more than what people normally think of as a device driver), but the hardware aspect of GPUs isn't nearly as straightforward as it was a couple of decades ago. The paragraphs following that statement seem to support me, on this point.

    Anyway, thanks for the coverage. I look forward to seeing how this race shakes out. We should probably expect to see most of these new entrants specialize or die, with the field thinning out quickly, over the next few years.

    P.S. I hope most of these new upstarts embrace OpenCL, for compute. That'd be a nice way to stick it to all the legacy players who've de-prioritized support for the standard.
  • thisisaname
    bit_user said:
    P.S. I hope most of these new upstarts embrace OpenCL, for compute. That'd be a nice way to stick it to all the legacy players who've de-prioritized support for the standard.

    I like standards but they do have to be kept up to date, else they become a burden rather than help.
  • jp7189
    I remember when GPUs got stuck on 28nm for a few generations. They figured out how to do some great things at 28nm... not that those designs would be top of the heap today, but a 28nm wafer scale design (a la Cerebras) should be attainable inside China and quite potent. That of course assumes you have the budget of an entire country behind you.
  • bit_user
    jp7189 said:
    I remember when GPUs got stuck on 28nm for a few generations. They figured out how to do some great things at 28nm...
    True, though mostly by making them huge.

    Here are the 3 generations of AMD flagship GPUs that were made on 28 nm:
    Launch (YYYY-MM)  Main Product  Codename  Architecture  Area (mm^2)  Shaders  Core Clock (MHz)  fp32 GFLOPS  Power (W)
    2012-01           HD 7970       Tahiti    GCN 1         352          2048     1050              4301         250
    2013-11           R9 290X       Hawaii    GCN 2         438          2816     1000              5632         320
    2015-06           R9 Fury X     Fiji      GCN 3         596          4096     1050              8602         275

    By using HBM, Fury was able to achieve some power savings. However, I also seem to recall the nominal 275 W was largely a fiction. I hadn't previously noticed how they stayed at ~1 GHz, but it makes sense.

    Here are the 3 generations of Nvidia flagship GPUs that were made on 28 nm:
    Launch (YYYY-MM)  Main Product  Codename  Architecture  Area (mm^2)  Shaders  Core Clock (MHz)  fp32 GFLOPS  Power (W)
    2012-03           GTX 680       GK104     Kepler        294          1536     1058              3090         195
    2013-02           GTX 780 Ti    GK110     Kepler        561          2880     980               5121         230
    2015-03           GTX 980 Ti    GM200     Maxwell       601          3072     1075              6605         250

    Note that Maxwell introduced tiled rendering. That was largely responsible for its roughly doubling of game performance over the GTX 780 Ti, even though the raw compute didn't increase much. It enabled the GTX 980 Ti to hold its own against Fury X, which was the much stronger GPU on paper.

    jp7189 said:
    a 28nm wafer scale design (a la Cerebras) should be attainable inside China and quite potent.
    It would tie up quite a lot of production capacity.