EPYC Linux instability

Jul 21, 2018
I am looking for a good discussion of segfault crashes with Linux on EPYC chips, as I have recently experienced system instability.
There is a lengthy thread about problems with Ryzen that might extend to EPYC at
https://community.amd.com/thread/215773?start=1890&tstart=0
I have seen threads suggesting tweaking the memory voltage to avoid a problem that occurs when the CPU is idle.
But AMD says EPYC is supported with various versions of Linux:
https://community.amd.com/message/2849516
System description:
CPU: EPYC 7351P
Mobo: Supermicro H11SSL-NC (single chip)
OS: Ubuntu 18.04
Kernel: 4.15.0-29-generic
I will be deeply grateful for any pointer or suggestion.
hj
 

Lutfij

Titan
Moderator
Mind sharing which EPYC chip you're working with? Your motherboard should have BIOS updates that address the issue, separate from AMD's driver support (for the chipset and the accompanying OS). How high a voltage is being suggested?
 
A segfault is an invalid memory access. Often the cause is just a bug in an individual application, or faulty RAM. Run memtest86 on it overnight. Don't just run it for a few hours.

Memtest86 is its own mini operating system and will be independent of any Windows or Linux software. You can make a bootable USB thumb drive or bootable DVD to run it from.
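
While the memory test is pending, one rough way to tell "one buggy application" apart from "possibly the RAM" is to tally the kernel's segfault records by program name. Below is a minimal Python sketch, assuming the usual x86 kernel log format (`name[pid]: segfault at ... ip ... sp ... error ...`); the function name `tally_segfaults` is my own, chosen for illustration.

```python
import re
from collections import Counter

# Typical x86 kernel log record for a user-space segfault, e.g.
#   [ 1234.5678] myapp[4321]: segfault at 0 ip 00007f... sp 00007ffd... error 4 in libc-2.27.so[...]
# Assumption: your kernel uses this format; adjust the pattern if it differs.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def tally_segfaults(log_text):
    """Count segfault records per program name in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SEGFAULT_RE.search(line)
        if match:
            counts[match.group("comm")] += 1
    return counts
```

Feed it the text of `dmesg` or `journalctl -k`. If a single program accounts for nearly all the hits, that points at the application; if the hits are spread across many unrelated programs, hardware becomes a more likely suspect. It is only a hint either way, not a diagnosis.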
 
Jul 21, 2018


Thanks Lutfij, I updated the initial question.
The BIOS is already at the latest version I can find, 1.0b.
The voltage suggested in those posts was above 1.1 V. I verified that the RAM voltage sensor reads 1.17 V, so this should not be a problem. I posted a more detailed account of the episode on Launchpad.
 
Jul 21, 2018

Thanks LinuxDevice. This is a server with no hibernation in place; only proper shutdowns and reboots are used (as far as I know, of course).
Cheers, hj