High-end external GPUs still suffer a performance hit — OCuLink tests show up to a 23% drop with an RTX 4090

GeForce RTX 4090 (Image credit: 金猪升级包/Weibo)

An external GPU (eGPU) lets you use the best graphics cards with slim and compact devices that lack graphics firepower. But you still don't get the card's full performance because of the limits of the connection. One Weibo user recently evaluated the performance hit that you take from using the OCuLink interface. For those not in the know, OCuLink is an optical and copper link first developed for PCIe connections more than a decade ago, which has mostly been used in the server realm. But it's been getting more attention on the consumer side lately, with products like GPD's Win Max 2.

eGPUs, essentially graphics cards housed inside a box or installed on a dock, have been around for ages. You can connect them to various devices, such as laptops, handheld gaming consoles, iMacs, or mini-PCs. Thunderbolt has typically been the preferred interface to connect eGPUs to other devices, but it's also expensive. Renewed interest in OCuLink has given manufacturers an alternative way to link the host device and the graphics card.

One of OCuLink's most appealing features is its higher bandwidth, meaning there is more room for the graphics card to stretch its legs. Thunderbolt 4 is limited to 40 Gb/s, whereas OCuLink enables a bandwidth of up to 64 Gb/s. That's a whopping 60% increase. So theoretically at least, you should get more performance over an OCuLink connection. However, Intel has already announced Thunderbolt 5, which is rated for up to 80 Gb/s in standard mode.
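The bandwidth comparison above comes down to simple arithmetic. The short Python sketch below reproduces it, assuming that OCuLink's 64 Gb/s figure corresponds to a PCIe 4.0 x4 link (16 GT/s per lane across four lanes) and that all numbers are raw link rates before any encoding or protocol overhead. The function and constant names are illustrative only.

```python
# Raw link-rate arithmetic for the interfaces named in the article.
# Assumption: OCuLink's 64 Gb/s is a PCIe 4.0 x4 link (16 GT/s per lane,
# 4 lanes), quoted before encoding and protocol overhead.


def raw_link_gbps(gt_per_lane: float, lanes: int) -> float:
    """Raw signalling rate of a PCIe link in Gb/s (no overhead removed)."""
    return gt_per_lane * lanes


def percent_increase(old: float, new: float) -> float:
    """Percentage by which `new` exceeds `old`."""
    return (new - old) * 100 / old


THUNDERBOLT_4_GBPS = 40.0  # figure quoted in the article
THUNDERBOLT_5_GBPS = 80.0  # standard mode, as quoted in the article
OCULINK_GBPS = raw_link_gbps(16.0, 4)

if __name__ == "__main__":
    print(f"OCuLink raw rate: {OCULINK_GBPS:.0f} Gb/s")
    print(f"OCuLink vs Thunderbolt 4: "
          f"+{percent_increase(THUNDERBOLT_4_GBPS, OCULINK_GBPS):.0f}%")
    print(f"Thunderbolt 5 vs OCuLink: "
          f"+{percent_increase(OCULINK_GBPS, THUNDERBOLT_5_GBPS):.0f}%")
```

These are headline link rates only. How much of each is actually available to a graphics card depends on overhead and on how the interface shares its bandwidth.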

The Weibo user connected his dock via OCuLink to a laptop powered by Intel's latest Core Ultra 5 125H (Meteor Lake) processor. While he compares the performance to a desktop, he doesn't reveal the PC's specifications. So approach these performance numbers cautiously; the reviewer is comparing a 28W chip to a much more powerful desktop processor. A chip from Intel's Raptor Lake Refresh HX series or AMD's Ryzen 7045 series (Dragon Range) would be a fairer comparison.

Graphics Card | Time Spy (Desktop) | Time Spy (Dock, Internal Display) | Time Spy (Dock, External Display) | Time Spy Extreme (Desktop) | Time Spy Extreme (Dock, Internal Display) | Time Spy Extreme (Dock, External Display)
GeForce RTX 4090 | 36,487 | 28,230 | 30,429 | 19,930 | 18,902 | 19,925
GeForce RTX 4070 Ti Super | 24,613 | 22,199 | N/A | N/A | N/A | N/A

Starting with the flagship GeForce RTX 4090 and the Time Spy benchmark, there was a 23% performance loss when using the internal display. However, with an external display, the performance hit was 17%. The performance penalty was lower with the Time Spy Extreme benchmark. The reviewer observed 5% lower performance with the internal display and zero performance loss with an external display. On the other hand, the GeForce RTX 4070 Ti Super only lost 10% of its performance when using the internal display.
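The percentages quoted above can be checked directly against the scores in the table. The Python sketch below does that; the scores are copied from the table, while the dictionary layout and the `loss_percent` helper are illustrative names and not part of the reviewer's method.

```python
# Recompute the docked-vs-desktop performance loss from the table scores.
# Scores are taken from the table; the data layout is an illustrative choice.

SCORES = {
    ("GeForce RTX 4090", "Time Spy"): {
        "desktop": 36487, "internal": 28230, "external": 30429,
    },
    ("GeForce RTX 4090", "Time Spy Extreme"): {
        "desktop": 19930, "internal": 18902, "external": 19925,
    },
    ("GeForce RTX 4070 Ti Super", "Time Spy"): {
        "desktop": 24613, "internal": 22199,  # no external-display result
    },
}


def loss_percent(desktop: int, docked: int) -> float:
    """Performance lost on the dock, as a percentage of the desktop score."""
    return (desktop - docked) * 100 / desktop


if __name__ == "__main__":
    for (card, test), result in SCORES.items():
        for display in ("internal", "external"):
            if display in result:
                loss = loss_percent(result["desktop"], result[display])
                print(f"{card}, {test}, {display} display: {loss:.1f}% loss")
```

Rounded to whole numbers, the results match the figures in the text: 23% and 17% for the RTX 4090 in Time Spy, 5% and roughly 0% in Time Spy Extreme, and 10% for the RTX 4070 Ti Super.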

The results show that using an external display reduces the performance drop: the rendered frames travel directly from the graphics card to the external display instead of having to return to the laptop over the OCuLink connection, which frees up bandwidth. It would be interesting to see the same tests with a more capable mobile processor. The Time Spy Extreme test with the GeForce RTX 4090 and the external display, which came back with essentially zero performance hit, suggests a processor bottleneck.

OCuLink is becoming at least somewhat more mainstream. Even handheld gaming consoles like the GPD Win Max 2 are starting to come with the OCuLink connector. It's an interesting alternative for users who want to reuse their graphics card to help push gaming performance on mobile devices or mini-PCs without a discrete graphics card. Vendors have already started releasing eGPU docks like the GPD G1 and OneXGPU. Some even come with storage options to install an M.2 SSD.

But with Thunderbolt 5 also on the horizon, it will be interesting to see how the two super-fast interfaces compare in real-world GPU testing, and how much traction OCuLink can gain before Intel's faster interface arrives en masse. TB5 is technically already here in 14th Gen Raptor Lake Refresh HX CPUs, and is expected to arrive on Arrow Lake desktop parts late this year. But those who want a slim and light computing experience on the go and a gaming beast when docked may have to wait until Lunar Lake or beyond to get their Thunderbolt 5 fix.

Zhiye Liu
News Editor and Memory Reviewer

Zhiye Liu is a news editor and memory reviewer at Tom’s Hardware. Although he loves everything that’s hardware, he has a soft spot for CPUs, GPUs, and RAM.

  • thestryker
    TB5 only has a PCIe 4.0 x4 link so its performance at best will match the current OCuLink implementations. It is possible the overhead from other bandwidth reservation will impact maximum throughput like it does in TB3/4, though there is more excess bandwidth on TB5 so it should at least be better.

    OCuLink can go up to an x8 link width, but this would be much harder to implement on a mobile platform.
  • Li Ken-un
    Thunderbolt 4 is limited to 40 Gb/s, whereas OCuLink enables a bandwidth of up to 64 Gb/s. That's a whopping 60% increase.
    Thunderbolt 4 is not 40 gbps. You only get 32 gbps tops—the same as 4 lanes of PCIe 3.0—after accounting for overhead. Many hardware manufacturers and writers quote 40 gbps, but that erroneously lumps many other things sharing the line including the bandwidth for non-data purposes—video.

    With Thunderbolt 5, OCuLink 4i at PCIe 64 gbps could lose its advantage as bandwidth could be boosted to 150% in a single direction for read-intensive or write-intensive operations. The one remaining advantage it’d have is the possibility of an 8i connector for 128 gbps. But it’s going to be a dead-end connector unless a PCIe 5.0-based OCuLink standard is released. Another doubling of Thunderbolt’s speeds will render OCuLink irrelevant.
  • Kamen Rider Blade
    thestryker said:
    TB5 only has a PCIe 4.0 x4 link so its performance at best will match the current OCuLink implementations. It is possible the overhead from other bandwidth reservation will impact maximum throughput like it does in TB3/4, though there is more excess bandwidth on TB5 so it should at least be better.

    OCuLink can go up to an x8 link width, but this would be much harder to implement on a mobile platform.
    It's a matter of willingness to make the x8 External Connector and getting enough LapTop/MoBo manufacturers on-board with it.

    https://sg.news.yahoo.com/theres-naming-scheme-future-pcie-162644788.html
    PCI-SIG has announced "CopprLink"
    But it's in development with a target spec release date of 2024 for PCIe 5.0 and PCIe 6.0 Internal and External Cable Specifications
  • Colin Ionita
    Who would have thought dividing your bandwidth into a quarter would limit performance. Oh wait, everyone.
  • dipique
    Colin Ionita said:
    Who would have thought dividing your bandwidth into a quarter would limit performance. Oh wait, everyone.
    Though I doubt anything outside the 4090 and to a lesser extent the 4080/3090 would have a noticeable difference.