Microsoft introduces its newest in-house AI chip: Maia 200 is faster than rival hyperscalers' bespoke Nvidia competitors, built on TSMC 3nm with 216GB of HBM3e

Microsoft's new Maia 200 AI data center GPU
(Image credit: Microsoft)

Microsoft has introduced its newest AI accelerator, the Microsoft Azure Maia 200. The new in-house chip is the next generation of Microsoft's Maia GPU line, a server chip designed for AI inference, with ludicrous speeds and feeds meant to outperform the custom offerings from hyperscaler competitors Amazon and Google.

Microsoft labels the Maia 200 its "most efficient inference system" ever deployed, with its press releases splitting time between praising the chip's big performance numbers and paying lip service to environmentalism. Microsoft claims the Maia 200 delivers 30% more performance per dollar than the first-gen Maia 100, an impressive feat considering the new chip also advertises a 50% higher TDP than its predecessor.

Maia 200 vs Amazon Trainium3 vs Nvidia Blackwell B300 Ultra

| Spec | Azure Maia 200 | AWS Trainium3 | Nvidia Blackwell B300 Ultra |
| --- | --- | --- | --- |
| Process technology | N3P | N3P | 4NP |
| FP4 petaFLOPS | 10.14 | 2.517 | 15 |
| FP8 petaFLOPS | 5.072 | 2.517 | 5 |
| BF16 petaFLOPS | 1.268 | 0.671 | 2.5 |
| HBM memory size | 216 GB HBM3e | 144 GB HBM3e | 288 GB HBM3e |
| HBM memory bandwidth | 7 TB/s | 4.9 TB/s | 8 TB/s |
| TDP | 750 W | ??? | 1400 W |
| Bi-directional bandwidth | 2.8 TB/s | 2.56 TB/s | 1.8 TB/s |

As the table shows, the Maia 200 holds a clear lead in raw compute over Amazon's in-house competition, and makes for an interesting comparison against Nvidia's top-dog GPU. Of course, treating the two as direct competitors is a fool's errand: no outside customers can purchase the Maia 200 directly, the Blackwell B300 Ultra is tuned for much higher-powered use cases than the Microsoft chip, and Nvidia's software stack puts it miles ahead of any contemporary.

However, the Maia 200 does beat the B300 on efficiency, a big win at a time when public concern over AI's environmental footprint is steadily mounting. The Maia 200 operates at nearly half the B300's TDP (750W vs 1400W), and if it's anything like the Maia 100, it will run below its theoretical maximum; the Maia 100 was designed as a 700W chip, but Microsoft claims it was limited to 500W in operation.
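
A quick back-of-envelope check using the peak FP4 numbers and rated TDPs from the table above (theoretical peaks, not measured efficiency) illustrates that gap in throughput per watt:

```python
# Back-of-envelope FP4 efficiency comparison using the table's peak figures.
# These are theoretical maximums; real-world efficiency depends on workload,
# utilization, and whether either chip actually runs at its rated TDP.
chips = {
    "Azure Maia 200":    {"fp4_pflops": 10.14, "tdp_w": 750},
    "Nvidia B300 Ultra": {"fp4_pflops": 15.0,  "tdp_w": 1400},
}

for name, spec in chips.items():
    # Convert petaFLOPS to teraFLOPS, then divide by power draw.
    tflops_per_watt = spec["fp4_pflops"] * 1000 / spec["tdp_w"]
    print(f"{name}: {tflops_per_watt:.1f} peak FP4 TFLOPS per watt")

# Azure Maia 200:    13.5 peak FP4 TFLOPS per watt
# Nvidia B300 Ultra: 10.7 peak FP4 TFLOPS per watt
```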

Maia 200 is tuned for FP4 and FP8 performance, targeting customers running inference on AI models hungry for low-precision throughput rather than more complex operations. Much of Microsoft's R&D budget for the chip appears to have gone into its memory hierarchy, built around 272MB of high-efficiency on-chip SRAM partitioned into "multi‑tier Cluster‑level SRAM (CSRAM) and Tile‑level SRAM (TSRAM)", with the goal of boosting operating efficiency by spreading workloads intelligently and evenly across all of the HBM and SRAM dies.
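
Microsoft hasn't published how software actually allocates data across those tiers, but the general idea of steering each buffer to the fastest tier it fits in can be sketched as below. This is purely illustrative: the place_tensor helper is hypothetical, and the individual CSRAM/TSRAM capacities are assumptions; only the 272MB total SRAM and 216GB HBM3e figures come from the numbers above.

```python
# Illustrative sketch of tiered memory placement; NOT Microsoft's actual scheme.
# Only the 272MB total SRAM and 216GB HBM3e totals are disclosed; the per-tier
# split below is an assumption chosen so the two SRAM tiers sum to 272MB.
TIERS = [
    ("TSRAM", 16 * 2**20),    # tile-level SRAM slice (assumed size)
    ("CSRAM", 256 * 2**20),   # cluster-level SRAM (assumed size)
    ("HBM3e", 216 * 2**30),   # on-package HBM3e, per the spec table
]

def place_tensor(size_bytes: int) -> str:
    """Return the fastest (smallest) tier that can hold a buffer of this size."""
    for tier_name, capacity in TIERS:
        if size_bytes <= capacity:
            return tier_name
    raise ValueError("Buffer exceeds on-package memory; shard it across chips")

# Example: small activation tiles stay in TSRAM, larger working sets land in
# CSRAM, and full weight sets spill out to HBM.
print(place_tensor(8 * 2**20))     # -> TSRAM
print(place_tensor(128 * 2**20))   # -> CSRAM
print(place_tensor(40 * 2**30))    # -> HBM3e
```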

It's difficult to measure Maia 200's improvements over its predecessor, as Microsoft's official spec sheets for the two chips share almost no common measurements. All we can say this early is that the Maia 200 will run hotter than the Maia 100 did, and that it is apparently 30% better on a performance-per-dollar basis.

Maia 200 has already been deployed in Microsoft's US Central Azure data center, with future deployments announced for US West 3 in Phoenix, AZ, and more to come as Microsoft receives additional chips. The chip will be part of Microsoft's heterogeneous fleet, operating in tandem with other AI accelerators.

Maia 200, originally codenamed Braga, made waves for its heavily delayed development and release. The chip was originally slated for a 2025 release and deployment, potentially even beating the B300 out of the gate, but that was not to be. Microsoft's next accelerator hasn't been confirmed, but per October reports it will likely be fabricated on Intel Foundry's 18A process.

Microsoft's efficiency-first messaging around the Maia 200 follows its recent trend of stressing the corporation's supposed concern for communities near its data centers, going to great lengths to quiet the backlash to the AI boom. Microsoft CEO Satya Nadella recently spoke at the World Economic Forum, arguing that if companies cannot help the public see the supposed perks of AI development and data center buildout, they risk losing "social permission" and creating a dreaded AI bubble.


Sunny Grimm
Contributing Writer
