AMD’s Linux graphics driver is getting too big for older machines
Graphical boot menus are timing out before AMDGPU can load.
Linux users are complaining that the AMD graphics driver for their OS of choice is getting too big and is now causing issues. Red Hat Linux desktop engineer Hans de Goede highlighted this issue at the weekend, reports Phoronix. De Goede’s blog post describes the problem and helpfully outlines some workarounds for people who may be afflicted with the same or similar boot issues.
Several reports of boot splash issues posted on the Red Hat Bugzilla bug tracker drew the Red Hat engineer's attention. In brief, the Plymouth graphical boot experience is not loading correctly on older hardware. Affected users hit a boot splash timeout, see just three dots on screen, and can find that they appear stuck at this early point in the boot process. Plymouth has long been a default application in Fedora, Ubuntu, Debian, Linux Mint, and other distros.
De Goede indicates that the sheer size of the AMD graphics driver for Linux, simply referred to as AMDGPU, is behind the three-dots problem people report. The AMD graphics driver is reputedly the biggest that mainstream Linux users will encounter, approaching six million lines of code. The time required to load and execute this mass of code means that users of older systems with Radeon graphics can experience a boot splash screen timeout before the GPU is initialized. The timeout is set to trigger after 10 seconds by default, which is not enough time for older or slower systems to load the driver.
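For readers who want to see what timeout their own system is configured with, here is a minimal Python sketch. It assumes Plymouth's settings live in an INI-style file with a `[Daemon]` section and a `DeviceTimeout` key, and that the usual locations are `/etc/plymouth/plymouthd.conf` and `/usr/share/plymouth/plymouthd.defaults`. The article does not state these details, so treat them as assumptions and check your distro's packaging. The function name `device_timeout_seconds` is made up for illustration.

```python
import configparser
from typing import Optional


def device_timeout_seconds(conf_text: str) -> Optional[float]:
    """Return the DeviceTimeout value from plymouthd-style INI text, or None."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key names case-sensitive
    try:
        parser.read_string(conf_text)
    except configparser.Error:
        return None  # not parseable as INI (assumed format)
    if parser.has_option("Daemon", "DeviceTimeout"):
        return parser.getfloat("Daemon", "DeviceTimeout")
    return None


if __name__ == "__main__":
    # Assumed locations; the admin file is checked before the shipped defaults.
    for path in ("/etc/plymouth/plymouthd.conf",
                 "/usr/share/plymouth/plymouthd.defaults"):
        try:
            with open(path) as f:
                print(path, device_timeout_seconds(f.read()))
        except OSError:
            print(path, "not readable")
```

A result of `None` only means the key was not set in that particular file, not that no timeout applies.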
So what are users experiencing this timeout to do? On the hardware side, migrating to faster storage might be a worthwhile upgrade. On the software side, for users who can't or don't want to upgrade hardware, De Goede describes two options.
The Red Hat engineer’s first suggestion is to check if your system actually needs AMDGPU, and if not it is a simple task to disable its loading. Another option, for those who have systems that actually rely on this driver, is to redirect Plymouth to render via the SimpleDRM DRM/KMS device, which means no timeout waiting for the AMDGPU driver to load. De Goede provides sample command line instructions for you to copy, to accomplish these tasks.
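As a rough first step toward the "does this system actually need AMDGPU" check, the Python sketch below looks for the `amdgpu` and `radeon` modules in `/proc/modules` and reports their size and use count. This is not De Goede's procedure, and his blog post remains the reference for the actual commands. The helper names are hypothetical, and a nonzero use count is only a hint that the driver is in use, not proof that it is required.

```python
from typing import NamedTuple, Optional


class ModuleInfo(NamedTuple):
    name: str
    size_bytes: int
    use_count: Optional[int]  # None when the kernel does not report it


def find_module(proc_modules_text: str, name: str) -> Optional[ModuleInfo]:
    """Return the entry for `name` from /proc/modules-style text, or None."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # Each line begins with: module name, size in bytes, use count.
        if len(fields) >= 3 and fields[0] == name:
            use = int(fields[2]) if fields[2].isdigit() else None
            return ModuleInfo(fields[0], int(fields[1]), use)
    return None


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is unavailable
    for module in ("amdgpu", "radeon"):
        print(module, find_module(text, module))
```

If `amdgpu` is absent or idle while `radeon` is loaded and in use, the older driver is probably handling the GPU, which is the situation where disabling AMDGPU may be worth investigating.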
Last but not least, Phoronix notes that the latest Fedora packages are working around the Plymouth boot timeout issue by immediately probing for SimpleDRM. Perhaps other distros will follow suit.
Mark Tyson is a news editor at Tom's Hardware. He enjoys covering the full breadth of PC tech; from business and semiconductor design to products approaching the edge of reason.
OneMoreUser Without at least a couple of examples of what constitutes "older/slower systems" this article is pretty worthless.
razor512 A good solution would be to find a way to get the OS to detect some of the key hardware that would be too slow to load larger drivers in time, and then extend the timeout to 20-25 seconds. And then indicate that it is still loading GPU drivers so that the user knows what the delay is.
mitch074
razor512 said: A good solution would be to find a way to get the OS to detect some of the key hardware that would be too slow to load larger drivers in time, and then extend the timeout to 20-25 seconds. And then indicate that it is still loading GPU drivers so that the user knows what the delay is.
That has been explored in the bug reports (as in, build most used GPU drivers in the kernel image), however some graphics drivers (Intel i915) have a problem with that as they expect to be loaded after some other kernel modules, and also, it makes for very large kernel images that will fail on low memory systems - something that distros like Debian wouldn't apply. Another more long term solution would be to optimize the driver's linking behavior, because while it's huge, most of its weight comes from automatically generated header files - as such, the most likely problem is on systems that still use spinning rust devices as a boot disk, because once loaded, the DRM driver checks for what it actually needs to load (chip-specific driver content and firmware) and discards the rest. This is also what allows it to recover in case of user-space driver crash.
The current most efficient workaround is, indeed, to use SimpleDRM for the boot loader - for PC, think a lightweight VESA driver that's used until the actual driver is loaded and takes over. It may cause a screen blink when the "final" driver takes over display tasks. It's already what happens with Nvidia's proprietary driver.
Note that users don't get "stuck" with three dots on the screen: it simply means that Plymouth, the boot loader screen manager, times out acquiring a graphical display and stays in text mode until the boot process is done. In most cases, you can still press Esc to access the terminal boot process dump.
So, all in all, this is NOT critical - it's simply the first user-affecting bug caused by the AMDGPU driver's size ballooning. Note that this driver supports all GCN and RDNA GPUs that came out since, well, GCN 1 - the radeon DRM driver covers GCN 1 and GCN 2 while AMDGPU has only experimental support for them, but AMDGPU is also a requirement for Vulkan support.