Radeon Instinct MI100 Arcturus Early Specifications Show Impressive 200W TBP

Radeon Instinct Accelerator (Image credit: AMD)

It's been a while since we've seen any leaks regarding Arcturus, AMD's rumored upcoming professional accelerator. Respected hardware leaker @KOMACHI_ENSAKA has shared what appear to be early specifications for the Radeon Instinct MI100, which is reportedly based on the Arcturus silicon.

Although it's not confirmed, Arcturus is believed to be a derivative of AMD's Vega microarchitecture. Like the other Radeon Instinct accelerators, Arcturus will likely be fabricated on TSMC's 7nm FinFET process. However, it remains to be seen whether that will be the 7nm or 7nm+ node.

An early prototype of the Radeon Instinct MI100 suggests that the accelerator utilizes a variant (D34303) of the Arcturus XL die. It seemingly runs with a 1,090 MHz base clock and a 1,333 MHz boost clock. There is no mention of the number of Stream Processors (SPs) on the Radeon Instinct MI100, but the Arcturus silicon is rumored to carry up to 128 Compute Units (CUs), which, at GCN's 64 SPs per CU, would equal a whopping 8,192 SPs.

It's not carved in stone, but the model name usually holds some clue to the accelerator's performance numbers. The Radeon Instinct MI60 and MI50 accelerators offer up to 58.9 TOPS and 53 TOPS of peak INT8 performance, respectively (integer throughput is measured in TOPS rather than TFLOPS). Therefore, it's reasonable to assume that the Radeon Instinct MI100's INT8 performance should scale up to the 100 TOPS mark.
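As a rough sanity check, here's a back-of-the-envelope sketch of how those peak numbers fall out of the CU count and clock speed. It assumes GCN's 64 SPs per CU, an FMA counted as two operations per SP per clock, and the FP16/INT8 ratios implied by the MI60's official figures; the MI100 inputs are the rumored 128 CUs and the leaked 1,333 MHz boost clock, neither of which is confirmed.

```python
# Back-of-the-envelope peak-rate estimates for a GCN-style accelerator.
# Assumptions: 64 SPs per CU, an FMA counted as 2 FLOPS per SP per clock,
# FP16 at 2x and INT8 at 4x the FP32 rate (the ratios the MI60's official
# numbers imply). The MI100 inputs are rumored/leaked, not confirmed.

def peak_rates(cus, boost_ghz, sps_per_cu=64):
    sps = cus * sps_per_cu
    fp32_tflops = sps * 2 * boost_ghz / 1000  # GFLOPS -> TFLOPS
    return (sps, round(fp32_tflops, 1),
            round(fp32_tflops * 2, 1),   # FP16, TFLOPS
            round(fp32_tflops * 4, 1))   # INT8, TOPS

# Sanity check against the MI60 (64 CUs, 1,800 MHz boost clock):
print(peak_rates(64, 1.8))     # (4096, 14.7, 29.5, 59.0) -- in line with AMD's 14.7/29.5/58.9

# Rumored full Arcturus die at the leaked 1,333 MHz boost clock:
print(peak_rates(128, 1.333))  # (8192, 21.8, 43.7, 87.4)
```

Taken at face value, the leaked boost clock on a full 128-CU die works out closer to 87 TOPS than 100, though early-sample clocks are typically conservative, so the final product could still close that gap.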

|  | Instinct MI100* | Instinct MI60 | Instinct MI50 (32GB) | Instinct MI50 (16GB) |
| --- | --- | --- | --- | --- |
| Architecture (GPU) | Arcturus (Arcturus XL) | GCN 5.1 (Vega 20) | GCN 5.1 (Vega 20) | GCN 5.1 (Vega 20) |
| Compute Units | ? | 64 | 60 | 60 |
| Stream Processors | ? | 4,096 | 3,840 | 3,840 |
| Peak Half Precision (FP16) Performance | ? | 29.5 TFLOPS | 26.5 TFLOPS | 26.5 TFLOPS |
| Peak Single Precision (FP32) Performance | ? | 14.7 TFLOPS | 13.3 TFLOPS | 13.3 TFLOPS |
| Peak Double Precision (FP64) Performance | ? | 7.4 TFLOPS | 6.6 TFLOPS | 6.6 TFLOPS |
| Peak INT8 Performance | ? | 58.9 TOPS | 53 TOPS | 53 TOPS |
| Memory Size | 32GB | 32GB | 32GB | 16GB |
| Memory Type | HBM2 | HBM2 | HBM2 | HBM2 |
| Memory Clock | 1 GHz - 1.2 GHz | 1 GHz | 1 GHz | 1 GHz |
| Memory Interface | ? | 4,096-bit | 4,096-bit | 4,096-bit |
| Memory Bandwidth | ? | 1,024 GBps | 1,024 GBps | 1,024 GBps |
| Total Board Power | 200W | 300W | 300W | 300W |

*Specifications are unconfirmed.

The Radeon Instinct MI100 will reportedly show up with 32GB of HBM2 memory that could operate at 1 GHz or 1.2 GHz. The Radeon Instinct MI60 and MI50 run their memory at 1 GHz across a 4,096-bit memory interface, providing up to 1,024 GBps of memory bandwidth.

If the Radeon Instinct MI100 retains the 4,096-bit memory bus and 1 GHz memory clock, it would deliver the same memory bandwidth as the Radeon Instinct MI60 and MI50. However, if the memory is clocked at 1.2 GHz, the Radeon Instinct MI100 could supply up to 1,229 GBps of memory bandwidth.
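The arithmetic behind those figures is straightforward; here's a minimal sketch, assuming HBM2's double data rate over the same 4,096-bit bus:

```python
# Peak bandwidth = (bus width in bytes) x (2 transfers per clock, DDR) x clock.
def hbm2_bandwidth_gbps(bus_bits, clock_ghz):
    return bus_bits / 8 * 2 * clock_ghz

print(hbm2_bandwidth_gbps(4096, 1.0))  # 1024.0 -- the MI60/MI50 figure
print(hbm2_bandwidth_gbps(4096, 1.2))  # 1228.8 -- i.e. the ~1,229 GBps above
```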

The leaker highlights that the test board for the Radeon Instinct MI100 is rated for 200W; however, the final product could vary. Assuming that AMD maintains this value, the Radeon Instinct MI100 would be a very efficient performance monster, considering that the existing Radeon Instinct MI60 and MI50 conform to a 300W TBP (Total Board Power). A 100W reduction almost sounds too good to be true, but we're crossing our fingers that AMD can pull it off.
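If those numbers hold (a big if), the efficiency gain is easy to quantify. This sketch reuses the speculative 21.8 TFLOPS FP32 estimate from the calculation above, so treat the result as illustrative only:

```python
# Speculative FP32 efficiency comparison (TFLOPS per watt).
mi60_official = 14.7 / 300   # ~0.049 TFLOPS/W at a 300W TBP
mi100_rumored = 21.8 / 200   # ~0.109 TFLOPS/W, using the rumored specs and 200W
print(round(mi100_rumored / mi60_official, 1))  # ~2.2x -- if the leak holds up
```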

AMD reorganized its Radeon Instinct accelerator product stack a few months ago. The chipmaker relegated the Radeon Instinct MI60, which was the previous flagship, to availability on request only. The Radeon Instinct MI50 (32GB) has since taken the Radeon Instinct MI60's place on the throne. However, AMD will, in all likelihood, pass the flagship mantle down to the Radeon Instinct MI100 once the Arcturus-powered accelerator debuts.

Zhiye Liu
RAM Reviewer and News Editor

Zhiye Liu is a Freelance News Writer at Tom’s Hardware US. Although he loves everything that’s hardware, he has a soft spot for CPUs, GPUs, and RAM.

  • bit_user
    Phoronix has long reported that open source driver patches indicate Arcturus will lack any 3D graphics hardware engines. This is to be a pure compute-accelerator die.

    https://www.phoronix.com/scan.php?page=news_item&px=AMD-Arcturus-Linux-5.5
    The Radeon Instinct MI60 and MI50 accelerators offer up to 58.9 TFLOPs and 53 TFLOPs of peak INT8 performance
    Two issues, here. Both nit picks, but it's what I do.
    The 'S' in TFLOPS should be capitalized, because it's part of the unit: Trillion Floating Point Operations Per Second.
    When describing integer performance, you omit the "Floating Point" part, leaving just TOPS. I noticed the table repeats these same errors. TFLOPs should be either TFLOPS or TOPS, depending on the row.
    Reply
  • alextheblue
    bit_user said:
    Phoronix has long reported that open source driver patches indicate Arcturus will lack any 3D graphics hardware engines. This is to be a pure compute-accelerator die.
    It could be built on Vega or Vega-derived CUs (as rumored), and they disable anything unnecessary. Only time will tell. Not sure how much I buy any of these rumored specs though.
    Reply
  • bit_user
    alextheblue said:
    It could be built on Vega or Vega-derived CUs (as rumored), and they disable anything unnecessary. Only time will tell. Not sure how much I buy any of these rumored specs though.
    Lisa Su has acknowledged that AMD will be pursuing a bifurcated strategy of HPC and consumer products. We've already seen the beginnings of this, with Vega 20, however it makes sense that they could go further.

    I'm not really sure what would be gained by disabling "anything unnecessary", as they already do clock gating that dynamically powers down parts of the chip that are idle. I've heard estimates that graphics hardware blocks consume up to 25% of their die space, which would only increase with things like ray tracing and some of their other recent additions (DSBR, mesh shaders, etc.). So, the incentive is there to reclaim that for general-purpose compute hardware.
    Reply
  • alextheblue
    bit_user said:
    Lisa Su has acknowledged that AMD will be pursuing a bifurcated strategy of HPC and consumer products. We've already seen the beginnings of this, with Vega 20, however it makes sense that they could go further.
    Yeah that's been the case for them internally for a while now I suspect, given RDNA's focus. It remains to be seen how many resources they will throw at each design.
    bit_user said:
    I'm not really sure what would be gained by disabling "anything unnecessary", as they already do clock gating that dynamically powers down parts of the chip that are idle. I've heard estimates that graphics hardware blocks consume up to 25% of their die space, which would only increase with things like ray tracing and some of their other recent additions (DSBR, mesh shaders, etc.). So, the incentive is there to reclaim that for general-purpose compute hardware.
    I'm not sure if there's anything else in the chip related to that which they could gate, that they aren't already. I mostly meant if they are using existing designs, and the graphics hardware is present, they aren't exposing it in the drivers. I know there's incentive to get rid of the superfluous blocks, but I don't know if we're going to see a redesign like that so soon. Especially since that piece of silicon couldn't also be used in professional graphics cards. What would they use for those? RDNA? Older Vega? Or would they end up with three designs? Who knows at this stage. :p
    Reply
  • bit_user
    alextheblue said:
    Especially since that piece of silicon couldn't also be used in professional graphics cards. What would they use for those? RDNA? Older Vega?
    Nvidia and AMD both offer workstation cards that mirror their consumer range, even reusing the same consumer chips, but with a few professional features enabled.

    alextheblue said:
    Or would they end up with three designs? Who knows at this stage. :p
    Nvidia's P100 and V100 are good examples, here. I suspect neither saw much use as actual graphics cards. They were simply too expensive and didn't offer enough performance advantage vs. the top-end consumer GPUs.

    I think the Titan V was mainly sold as a lower-cost deep learning accelerator. I'm betting most people who bought them weren't using them for gaming or other graphics tasks.
    Reply
  • alextheblue
    bit_user said:
    Nvidia and AMD both offer workstation cards that mirror their consumer range, even reusing the same consumer chips, but with a few professional features enabled.
    I know. I was specifically referring to a hypothetical graphics-less piece of silicon. They wouldn't be able to use that silicon in a pro card, which puts their future workstation cards in an interesting position. Would they be using GCN, or some variant of RDNA? If GCN, a new generation, or a rehash?
    bit_user said:
    Nvidia's P100 and V100 are good examples, here. I suspect neither saw much use as actual graphics cards. They were simply too expensive and didn't offer enough performance advantage vs. the top-end consumer GPUs.

    I think the Titan V was mainly sold as a lower-cost deep learning accelerator. I'm betting most people who bought them weren't using them for gaming or other graphics tasks.
    Those both had Quadro models obviously, but I'm not sure how big the market is for that level of graphical capability. I wouldn't have suggested anyone game on these, nor do I suspect many people bought a Titan V for gaming. My point is ditching the graphics would limit the market for that particular design, and having additional layouts costs them precious resources. Not sure if it's worth it. Guess we'll find out.
    Reply
  • bit_user
    alextheblue said:
    I know. I was specifically referring to a hypothetical graphics-less piece of silicon. They wouldn't be able to use that silicon in a pro card,
    Right, which is how we ended up with the example of Quadro P100 and V100.

    alextheblue said:
    Those both had Quadro models obviously, but I'm not sure how big the market is for that level of graphical capability. I wouldn't have suggested anyone game on these, nor do I suspect many people bought a Titan V for gaming.
    Yeah, and I'm trying to say that the graphics performance for even professional graphics is nearly as lousy for the $. I just don't believe the majority of these get purchased for graphics. Aside from V100 being used to prototype their interactive ray tracing, I'm betting most Quadro P100 and V100 cards are purchased for deep learning or GPU compute. You can't justify their price any other way.

    alextheblue said:
    My point is ditching the graphics would limit the market for that particular design, and having additional layouts costs them precious resources. Not sure if it's worth it. Guess we'll find out.
    If almost nobody is buying them for graphics tasks, then the size of the market you're foreclosing is almost zero.

    Besides, consider this: AMD could still put the display driver and video codec engine (which you need to be competitive at using it for video processing). Even if they drop the rest of the hardware assist, they could still emulate all of that stuff in software. So, one could still use it as a graphics card. I doubt they'll go to all of that trouble, but it's possible.
    Reply