Linus Torvalds Wishes Intel's AVX-512 A Painful Death

According to a mailing list post spotted by Phoronix, Linux creator Linus Torvalds has shared his strong views on the AVX-512 instruction set. The discussion arose as a result of recent news that Intel's upcoming Alder Lake processors reportedly lack support for AVX-512.

Torvalds' advice to Intel is to focus on things that matter instead of wasting resources on new instruction sets, like AVX-512, that he feels aren't beneficial outside the high-performance computing (HPC) market.

AVX-512 support debuted in Intel's Xeon Phi x200 (codename Knights Landing) processor in 2016. The instruction set later made its way into the chipmaker's other offerings, such as Skylake-SP, Skylake-X, Cannon Lake and Cascade Lake. Currently, both Intel's Cooper Lake and Ice Lake processors support certain AVX-512 subsets. While Alder Lake seemingly lacks AVX-512, the chipmaker has confirmed that Tiger Lake will support the instruction set.

We've included a copy of Linus Torvalds' opinion on AVX-512 below:

I hope AVX512 dies a painful death, and that Intel starts fixing real problems instead of trying to create magic instructions to then create benchmarks that they can look good on.

I hope Intel gets back to basics: gets their process working again, and concentrate more on regular code that isn't HPC or some other pointless special case.

I've said this before, and I'll say it again: in the heyday of x86, when Intel was laughing all the way to the bank and killing all their competition, absolutely everybody else did better than Intel on FP loads. Intel's FP performance sucked (relatively speaking), and it matter not one iota.

Because absolutely nobody cares outside of benchmarks.

The same is largely true of AVX512 now - and in the future. Yes, you can find things that care. No, those things don't sell machines in the big picture.

And AVX512 has real downsides. I'd much rather see that transistor budget used on other things that are much more relevant. Even if it's still FP math (in the GPU, rather than AVX512). Or just give me more cores (with good single-thread performance, but without the garbage like AVX512) like AMD did.

I want my power limits to be reached with regular integer code, not with some AVX512 power virus that takes away top frequency (because people ended up using it for memcpy!) and takes away cores (because those useless garbage units take up space).

Yes, yes, I'm biased. I absolutely destest FP benchmarks, and I realize other people care deeply. I just think AVX512 is exactly the wrong thing to do. It's a pet peeve of mine. It's a prime example of something Intel has done wrong, partly by just increasing the fragmentation of the market.

Stop with the special-case garbage, and make all the core common stuff that everybody cares about run as well as you humanly can. Then do a FPU that is barely good enough on the side, and people will be happy. AVX2 is much more than enough.

Yeah, I'm grumpy.


Torvalds, who was once an Intel user, recently saw the light and crossed over to the Red Team. His Ryzen Threadripper 3970X accelerated his workloads threefold.

  • PCWarrior
    Says a person who also doesn’t care about GPU acceleration. He considers these kinds of things “only relevant to HPC”. As if the HPC market is small and not where all the money is atm. And who writes code for HPC, Mr Linus? Isn’t it the same people who also use more consumer/prosumer-oriented hardware at home or at the office? Don’t they code first on such hardware and then run it on the big supercomputer? Linus seems to be either stuck in the past or he is simply too narrow-minded these days and only cares about what he is personally involved with.
  • jkflipflop98
    This guy is a moron.

    Maybe Linus should get back to dallying about with his crappy operating system he literally can't give away for free and leave chip architecture to the real engineers.
  • InvalidError
    PCWarrior said:
    Don’t they code first on such hardware and then run it on the big supercomputer? Linus seems to be either stuck in the past or he is simply too narrow-minded these days and only cares about what he is personally involved with.
    Well, Linus was mostly a kernel developer and kernel stuff has little to no use for FP and high-volume math in general. With the amount of AI and AI-derived stuff making its way to the consumer market, having a decent amount of low-latency FP8/12/16 performance may become very much necessary even for normal people a few more years from now.
  • jwgrace1
    As a dabbler in x86 assembly code, I would love to have access to AVX512 instructions that can clear large memory blocks quickly, that can do exclusive ors, and a few other similar 512-bit-wide instructions efficiently. But I do not use other parts of the instruction set at this time. I appreciate Linus Torvalds' comments in at least one way. These instructions do take more energy, slowing down clock speeds etc., but I expect that over time these problems would get resolved. I do wish the 512-bit registers could be used just like the 64-bit registers, so that I could get single-instruction 128 x 128 bit or 256 x 256 bit or 512 x 512 bit multiplications, or any other of the integer arithmetic instructions. (I have an interest in complexity theory where such arithmetic would be useful.) In fact my dream is to have a CPU that we would call x86-512 on the internal side and have its minimal I/O be 64 bits as at present but expandable to a 128-bit data bus, then a 256-bit data bus, then a 512-bit data bus as technology can efficiently deal with those requirements. Oh well, perhaps never to be, but it is fun to dream of.