AMD Threadripper 3990X Scores Another Win: We Test New SPECworkstation 3 Update

The SPECwpc subcommittee has announced a new version of the SPECworkstation 3 benchmark for workstations to accommodate processors with more than 64 threads. That move certainly benefits the only workstation-class x86 processor with more than 64 threads in a single socket: AMD's monstrous 64-core 128-thread Threadripper 3990X. We decided to put the new release to the test to see the full threaded might of the 3990X in action, and the results are impressive.

Upsetting the semiconductor industry is hard, particularly when you're fighting an entrenched and much-larger rival, and sometimes things get broken when you're redefining an industry. In AMD's case, those broken things consist of operating systems and applications that weren't tuned to extract the full performance of its fledgling first-gen Zen architecture, let alone the core-heavy designs of Zen 2. The 64-core 128-thread Threadripper 3990X is the best example because it more than doubles the number of cores available with Intel's halo workstation parts and sets an entirely new bar for single-socket systems. 

The 3990X exposes just how unprepared applications and software are for a 128-thread beast: Windows 10 divides logical processors into 'processor groups' of up to 64 threads apiece, so applications and benchmarks that aren't written to span multiple groups can't use the additional threads.
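To make the arithmetic concrete, here is a minimal Python sketch of that grouping. It is a simplified model rather than Windows' actual scheduling logic: it assumes logical processors are packed into consecutive groups of at most 64, and the names split_into_groups and usable_fraction are hypothetical helpers invented for this illustration.

```python
MAX_GROUP_SIZE = 64  # a Windows processor group holds at most 64 logical processors


def split_into_groups(logical_processors, group_size=MAX_GROUP_SIZE):
    """Partition logical processor IDs into consecutive groups of at most group_size."""
    ids = list(range(logical_processors))
    return [ids[i:i + group_size] for i in range(0, len(ids), group_size)]


def usable_fraction(logical_processors, groups_spanned):
    """Fraction of all threads reachable by an application that spans only some groups."""
    groups = split_into_groups(logical_processors)
    reachable = sum(len(group) for group in groups[:groups_spanned])
    return reachable / logical_processors


groups = split_into_groups(128)  # the 3990X exposes 128 threads
print(len(groups), [len(group) for group in groups])  # 2 [64, 64]
print(usable_fraction(128, groups_spanned=1))  # 0.5, application confined to one group
print(usable_fraction(128, groups_spanned=2))  # 1.0, application spanning both groups
```

In this model, an application confined to a single group reaches only 64 of the 3990X's 128 threads, or half of them.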

Aside from the obvious performance loss in unoptimized applications, the problem has the knock-on effect of skewing the industry benchmarks that are used to quantify performance for OEMs, ODMs, and end customers. Results for these benchmarks, like the ones developed and maintained by the non-profit Standard Performance Evaluation Corporation (SPEC), come from companies that buy the benchmark ($5,000 for the tests in question) and then submit their results to SPEC. The committee charges a $500 to $1,000 fee to verify each result and publish it in its database.

These benchmarks are verified and maintained by an official group composed of industry leaders, so the test results help guide the development dollars and purchasing decisions of OxMs like Dell, HP, and Lenovo (among others) and corporate customers alike. As a result, positioning a product in the best and most accurate light possible is critical for sales to the professional organizations that make up the target market for the Threadripper 3990X.

Press and end users, like yourself, who "aren't affiliated with a for-profit entity that sells computers in the commercial marketplace" can use the benchmarks for free, so you can download the tests and run them yourself today.

We use the SPECworkstation 3 suite in our own tests of workstation-class processors (though we don't submit the results to the official body), and we noted in our recent review that many of the SPEC benchmarks we typically use didn't scale correctly across processor groups. In fact, most of the benchmarks delivered only half of the potential performance, and some results were unusable. That could present an issue for AMD with some professional customers because the benchmarks aren't representative of the 3990X's true performance and potential.

However, the alterations to the existing SPECworkstation 3 benchmark allow the tests to scale correctly across different processor groups (and multi-socket systems) as of the new version V3.0.4, which has an enhanced version of the multi-threading code. Results from the update are comparable to older 3.0 versions of the benchmark, which is important to ensure the new tests can be published to the existing database. It's also great because we can plug-and-play the new benchmark for some direct comparisons, while also providing updated performance results. 
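For readers curious what group-aware code has to look at first, the sketch below queries the processor-group layout through the Win32 calls GetActiveProcessorGroupCount and GetActiveProcessorCount via Python's ctypes. This is not SPEC's code, only an illustration: the function name processor_group_summary is hypothetical, and treating non-Windows systems as a single group is an assumption made so the sketch runs anywhere.

```python
import os
import sys

ALL_PROCESSOR_GROUPS = 0xFFFF  # Win32 constant meaning "every processor group"


def processor_group_summary():
    """Return (group_count, total_logical_processors) for this machine."""
    if sys.platform == "win32":
        import ctypes
        from ctypes import wintypes

        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        # Declare the return and argument types so ctypes marshals them correctly.
        kernel32.GetActiveProcessorGroupCount.restype = wintypes.WORD
        kernel32.GetActiveProcessorGroupCount.argtypes = []
        kernel32.GetActiveProcessorCount.restype = wintypes.DWORD
        kernel32.GetActiveProcessorCount.argtypes = [wintypes.WORD]
        group_count = kernel32.GetActiveProcessorGroupCount()
        total = kernel32.GetActiveProcessorCount(ALL_PROCESSOR_GROUPS)
        return group_count, total
    # Assumption: other operating systems are reported as a single group.
    return 1, os.cpu_count() or 1


group_count, total = processor_group_summary()
print(f"{group_count} processor group(s), {total} logical processors")
```

A program that sees more than one group has to distribute its worker threads across the groups explicitly if it wants to use every logical processor, which is the kind of scaling the updated benchmark is described as handling.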

The SPECworkstation 3 update is available today as either a complete new download or a patch.

After debugging why we couldn't extract the utmost performance of the 3990X in our first round of SPECworkstation 3 tests, it's truly a wonderful sight to unleash the beast and finally see all 128 of those threads utilized in the subtests that can now leverage them. Let's see what that looks like on the following page.

Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • atmapuri
    The benchmark is now optimized for the CPU, but the actual software not yet? What is that good for? And they even want to do more code which runs well for benchmark only. Maybe time for a new "dreamer" category of articles?
  • PaulAlcorn
    atmapuri said:
    The benchmark is now optimized for the CPU, but the actual software not yet? What is that good for? And they even want to do more code which runs well for benchmark only. Maybe time for a new "dreamer" category of articles?

    The majority of the benchmarks do show massive performance improvements, though a few aren't scaling due to the nature of the program. If the program can't scale in real life, or isn't designed to, it doesn't benefit anyone to game the benchmark software to present unrealistic results.
  • CerianK
    For those of us that write and run custom parallel code applications, these benchmarks shed some more light on the potential. However, there is no substitute for actually testing the applications you plan to use. I have already jumped ahead and finished testing the very applications I need to run on a 3800X to project the performance improvement over my dual E5-2690 workstation when moving to a single 3990X: > 5x.

    This is indeed a niche processor for those that know they need it, and I'm sold. But since I am a private researcher, maybe I should go after that $9000 prize mentioned to defray costs... I might have a little extra time available once I finish building new automation tools to keep the beast fed.

    The main down-side I see for my work is that, ideally, I would need to have access to 4GB per thread... AMD needs to get that mess straightened out, if not for this generation, then the next. As is, some of my workloads would take twice as long as otherwise necessary, and no, I will not build two 3970X systems, go with EPYC, or upgrade to yet more Xeons... too cost and/or power prohibitive (though the 3990X is certainly pushing the line for cost here also).

    Question: Do some of the 3970X (and possibly 3960X) benchmarks need to be re-run also, or is the funny scaling for some of those benchmarks due to glitches in SPEC's new code adjustments?
  • Stevemeister
    You seem to have completely missed the point of the article, which was to point out that currently most applications ARE NOT optimized to take advantage of what this chip is capable of IF applications were to be optimized for it . . . . basically there is potential for applications to run 2-3 times faster than they currently do if the software gets optimized.
  • RodroX
    Nvidia launched the RTX GPUs more than 17 months ago, and the ray tracing tech is still not supported by many games and is being optimized little by little, and those games that do support it see a significant drop in FPS when it is active (but they do look amazing).

    AMD launched a new category of HEDT CPU, the TR 3990X, less than a month ago, so I think it is fair to give the software industry some time to catch up.
    Other than that, thumbs up to Tom's Hardware for continuing to update the info and benchmark results as new software shows up. Who knows what the CPU and GPU future brings when software gets tuned a bit more.

    Cheers
  • Rob1C
    CerianK said:
    The main down-side I see for my work is that, ideally, I would need to have access to 4GB per thread... AMD needs to get that mess straightened out, if not for this generation, then the next. As is, some of my workloads would take twice as long as otherwise necessary, and no, I will not build two 3970X systems, go with EPYC, or upgrade to yet more Xeons... too cost and/or power prohibitive (though the 3990X is certainly pushing the line for cost here also).

    On the basis of memory cost alone for your application (4GB / thread), the Epyc CPU is well over $1000 less expensive when you add the price of a 64-core Threadripper with 4x128GB sticks versus the Epyc with 8x64GB sticks; while there's a difference in clock speed, the 7H12 benefit for the additional cost is unlikely to be useful given your cost constraints. The extra 64 PCIe 4.0 lanes could allow a speedy RAID card, which might be useful.

    Sometimes looking at total costs rather than focusing on the price, longevity, and capabilities of a single part is what's needed. It's a certainty that there's a much better selection of ThreadRipper MBs (PCIe 4.0) than what is available for the Epyc; and the TR MBs are more feature filled and capable for the price. The best TR MB won't add epic features to a ThreadRipper, nor is there an overclocked server MB for the Epyc (not counting the normal running speed for a dual 7H12, and the loss of arm, leg, and organs).

    But, buy as you wish.
  • Makaveli
    RodroX said:
    Nvidia launched the RTX GPUs more than 17 months ago, and the ray tracing tech is still not supported by many games and is being optimized little by little, and those games that do support it see a significant drop in FPS when it is active (but they do look amazing).

    AMD launched a new category of HEDT CPU, the TR 3990X, less than a month ago, so I think it is fair to give the software industry some time to catch up.
    Other than that, thumbs up to Tom's Hardware for continuing to update the info and benchmark results as new software shows up. Who knows what the CPU and GPU future brings when software gets tuned a bit more.

    Cheers

    Don't think we will see a push for Ray Tracing in games until both next Gen consoles are out.
  • CerianK
    Rob1C said:
    On the basis of memory cost alone for your application...
    Actually, it is applications (plural), where the 4GB/thread applies to only one of the applications, so not a deal-breaker if only 2GB/thread is available. The concern is more of a future-proofing issue, where you are right that EPYC might be a better choice in the long run, but would also require an additional 3950X (for example) to handle more lightly threaded time-sensitive workloads. So, not necessarily a less-expensive option.

    Still, my understanding is that if one were able to install 8x64GB on Threadripper, it would only see 256GB, which others have commented on as a means to artificially limit VM deployment, or other traditionally server (i.e. EPYC) workloads, on the 3990X. I am not sure if there is any more to it than that, but it makes sense from a marketing standpoint (regardless of my opinions on the subject).

    Regarding memory type, ECC is not a requirement for me. Something like G.SKILL F4-3200C16Q2-256GVK for $1200 US would likely be fine, from what I can tell.