Machine hangs on using all cores

abhishekq

Distinguished
Mar 12, 2010
I have two quad-core machines (both running fully updated Fedora 11-12) and I use them for running parallel programs with mpirun. If I use all 4 cores, the process hangs within about 800 iterations (roughly 15-20 minutes) on both machines, but if I run it on 3 cores it keeps running for days.

It just hangs, showing 100% processor usage on all processors and producing no output. Has anyone experienced the same problem? Any suggestions would be very useful.

I don't know whether it's a hardware or a software problem.
 

festerovic

Distinguished
It could be hardware, but I doubt it. I get thread starvation causing failures like this when running at 100% on all cores. I would suggest using a VM or installing another OS to see whether the problem shows up on a different platform; that could rule out hardware. Also, is there any chance that your memory allocation hits a certain amount when all cores are running vs. only 3? Perhaps run some memory diagnostics too.
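
To check the memory question, something like the following could be left running in a second terminal while the 4-core job runs, and then again for the 3-core job. This is only a rough sketch: it assumes a Linux /proc/meminfo with the usual MemTotal, MemFree, Buffers, Cached, SwapTotal and SwapFree fields, and the function names (parse_meminfo, used_fraction, monitor) are made up for illustration.

```python
import time

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info

def used_fraction(info):
    """Rough fraction of RAM in use, counting buffers and cache as free."""
    total = info["MemTotal"]
    free = info["MemFree"] + info.get("Buffers", 0) + info.get("Cached", 0)
    return (total - free) / total

def monitor(path="/proc/meminfo", interval=5.0, samples=12):
    """Print RAM and swap usage every `interval` seconds, `samples` times."""
    for _ in range(samples):
        with open(path) as f:
            info = parse_meminfo(f.read())
        swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
        print("%s  RAM used: %.1f%%  swap used: %d kB" % (
            time.strftime("%H:%M:%S"), 100 * used_fraction(info), swap_used))
        time.sleep(interval)
```

Calling monitor() with a larger samples value would cover the 15-20 minutes before the hang. If RAM usage climbs toward 100% or swap starts growing shortly before the 4-core run stalls, memory pressure is a likely suspect; if usage stays flat, it probably is not the cause.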