AVX-512 Patch Brings 30% Performance Uplift to PlayStation 3 Emulator

Whatcookie, a software developer behind RPCS3, a multi-platform open-source Sony PlayStation 3 emulator, has released a patch that uses AVX-512 instructions and brings a 30% performance improvement to the emulator. So far, AVX-512 instructions have not made much sense for games. But in the case of a PS3 emulator, the large register file of AVX-512-enabled hardware, data-level parallelism, and the LLVM compiler can do wonders.

But before jumping into how AVX-512 instructions make sense for RPCS3, something that Whatcookie explained in his detailed blog post, let's take a short dive into the recent history of computing.

“AVX-512 also adds new mask registers which can be optionally used with EVEX encoded instructions,” wrote Whatcookie. “There are new comparison instructions which generate a mask in the mask registers as the result of a comparison between vectors. When a mask register is used as an operand all of the elements not selected by the mask will either be zeroed or leave the existing value in the destination register untouched. There are 8 mask registers, through k0 - k7, however only k1 - k7 can be used to mask things out, as k0 implicitly behaves as if all elements are selected.”
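
The masking behavior Whatcookie describes can be modeled in a few lines of scalar Python. This is only an illustrative sketch, not RPCS3 code and not real SIMD: the names `compare_gt`, `masked_add`, and `ALL_LANES` are hypothetical, lanes are plain Python integers (32-bit wraparound is not modeled), and a mask is represented as an integer with one bit per lane.

```python
# Scalar model of AVX-512 write-masking; illustrative only, not real SIMD code.
LANES = 16  # a 512-bit register holds sixteen 32-bit elements


def compare_gt(a, b):
    """Model a vector compare that writes one result bit per lane into a mask."""
    mask = 0
    for i in range(LANES):
        if a[i] > b[i]:
            mask |= 1 << i
    return mask


def masked_add(dst, a, b, mask, zeroing=False):
    """Model a masked add.

    Lanes selected by the mask receive a[i] + b[i]. Unselected lanes are
    either zeroed (zeroing=True) or keep the old value from dst.
    """
    out = []
    for i in range(LANES):
        if (mask >> i) & 1:
            out.append(a[i] + b[i])
        elif zeroing:
            out.append(0)
        else:
            out.append(dst[i])
    return out


# Assumption: a mask with every bit set stands in for "all elements selected".
ALL_LANES = (1 << LANES) - 1

a = list(range(LANES))   # 0, 1, ..., 15
b = [7] * LANES
dst = [-1] * LANES       # previous contents of the destination register

k1 = compare_gt(a, b)                             # selects lanes 8..15
merged = masked_add(dst, a, b, k1)                # unselected lanes keep -1
zeroed = masked_add(dst, a, b, k1, zeroing=True)  # unselected lanes become 0
unmasked = masked_add(dst, a, b, ALL_LANES)       # every lane is written
```

Here the merge form leaves the eight unselected lanes of `dst` untouched, while the zeroing form clears them. Passing `ALL_LANES` mimics an unmasked operation, which is the behavior the quote attributes to k0.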

Anton Shilov
Contributing Writer

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.

  • hotaru.hino
    I'd argue the only reason why Sony put the Cell in the PS3, outside of maybe hoping the FLOPS war would be the new bit-wars for uninformed consumers, was because Sony dumped billions into making a CPU and needed a product to show for it. Otherwise the only other market the Cell made sense in was the HPC market. And they did have a supercomputer built out of it, but that was about it.

    The PS4 using an x86 processor was out of developer feedback and well, making a brand new CPU platform in the 2010s didn't make sense anymore.
  • setx
    As it turns out, the LVVM compiler automatically chooses the best possible code path, which in case of AVX-512-enabled hardware means an appropriate code path. For obvious reasons (we are talking about emulation here at the end of the day) it is not exactly ideal, not all mask registers can be used, for example.
    This makes no sense at all.

    Author, please don't try to look smart and re-phrase things you don't understand at all. Use proper quotes.
  • user7007
    setx said:
    This makes no sense at all.

    Author, please don't try to look smart and re-phrase things you don't understand at all. Use proper quotes.

    I thought the same thing. The original post by the dev team did a good job explaining things if you have a basic understanding of assembly language and emulation.

    They also explained why the 30% even matters when it already runs fast: basically for laptops and handhelds that are clocked much lower and have battery life concerns. The upcoming Zen 4 is likely to be the best choice given that the new Intel CPUs don't support AVX-512, but I don't think we know yet whether AMD's upcoming implementation brings the same performance benefits as Intel's current implementation.
  • setx
    user7007 said:
    The upcoming Zen 4 is likely to be the best choice given that the new Intel CPUs don't support AVX-512, but I don't think we know yet whether AMD's upcoming implementation brings the same performance benefits as Intel's current implementation.
    It should be the same benefits, as the article talks about how the new instructions are useful even when run at 256-bit width. And on Zen 4 the main unknown is how 512-bit width is done.
  • user7007
    setx said:
    It should be the same benefits, as the article talks about how the new instructions are useful even when run at 256-bit width. And on Zen 4 the main unknown is how 512-bit width is done.

    I was just thinking some instructions could be higher or lower latency in AMD's version and maybe it ends up 25% faster or 35% faster or whatever.
  • blppt
    "We are already above 120fps"

    In what game? Last I tried RDR on this emulator on a 9900KS and 2080 Ti, it wasn't anywhere near 60fps, never mind 120fps.
  • segio526
    hotaru.hino said:
    I'd argue the only reason why Sony put the Cell in the PS3, outside of maybe hoping the FLOPS war would be the new bit-wars for uninformed consumers, was because Sony dumped billions into making a CPU and needed a product to show for it. Otherwise the only other market the Cell made sense in was the HPC market. And they did have a supercomputer built out of it, but that was about it.

    The PS4 using an x86 processor was out of developer feedback and well, making a brand new CPU platform in the 2010s didn't make sense anymore.
    If I remember correctly, you are 100% right. Cell was planned to be used in Sony TVs, Blu-ray players, and even high-end cameras. At one time they even envisioned Cell as both the CPU and GPU for the PS3. Other than the PS3, I think the only commercial product to use one was an AVC encoder accelerator add-on card from Toshiba, albeit with a cut-down, 4-core chip.
  • vlbastos
    The only question here is: HOW do you enable AVX-512 in Intel cores (yeah, I know, disable E-cores yadda yadda... HOW?).
    And another interesting question is: how do you know AVX-512 is even being used?
    I'd appreciate some answers for my i7-11800H, which is supposed to have AVX-512.