Programs used were FloWorks and CFDesign. RAM usage under XP x64 was ~4.5GB of 6GB. See my i7 build specs in my sig. No idea about the program breakdown as it's closed source.
Mr. Shadow:
The fact that you can utilize 4.5 GB of Non-ECC RAM for 12 solid hours without any serious errors seems to indicate that the problem might not be as severe as I initially thought, but I still think I am going to get an ECC system.
Without going off on the deep end, I wish to say that the reincorporation of ECC into the common PC is inevitable. Higher Chip Density [1][2], Faster Speeds [1][2] and Lower Voltage [1] are all potential contributors to soft memory errors. With that said, I think that systems using the new i7 and Xeon 3500/5500 processors are pushing the limit.
Some approximations for the soft error rate of RAM that I have seen are:
1) "...one bit error, per month, per gigabyte of memory." [2].
2) "Chances for a single-bit soft error occurring are about once per 1GB of memory per month of uninterrupted operation." [3].
3) "...a system with 1 GByte of RAM can expect an error every two weeks..." [1]
Approximations 1 and 2 are the same, but approximation 3 is about 2x higher. Because the more conservative approximation of 1 bit error per GB per month of continuous operation appears in two different sources, we will assume (bad thing to do) that it is correct.
Now since your CAD session uses about 4.5 GB, we would expect about 4.5 bits to be corrupted per month of continuous operation.
Now since your CAD session runs for 12 solid hours, we would expect the following error rate:
(4.5 Errors per month) * (1 month per 31 days) * (1 day per 24 hours) * (12 hours per session) =
approximately 0.07 (Errors per Session)
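For anyone who wants to plug in their own numbers, here is a rough sketch of the same arithmetic in Python. The function name and parameters are made up for illustration; the 31-day month and the 1 error per GB per month default come straight from the estimate above, and the 31/14 figure is just my own conversion of the "every two weeks" quote into the same units.

```python
# Rough sketch of the arithmetic above; names are illustrative only.
HOURS_PER_MONTH = 31 * 24  # same 31-day month as the hand calculation


def expected_errors(ram_gb, session_hours, errors_per_gb_month=1.0):
    """Expected number of soft errors during one session.

    Assumes errors scale linearly with both RAM in use and run time.
    """
    return ram_gb * errors_per_gb_month * session_hours / HOURS_PER_MONTH


# 4.5 GB for 12 hours at 1 error per GB per month (approximations 1 and 2)
low = expected_errors(4.5, 12)

# Same session at one error per GB every two weeks (approximation 3),
# converted to a per-month rate: 31 days / 14 days
high = expected_errors(4.5, 12, errors_per_gb_month=31 / 14)

print(round(low, 4), round(high, 4))
```

The first value is the ~0.07 figure from above; the second shows how much the answer moves if the Tezzaron estimate is the right one instead.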
This translates to about a 7% chance of something going wrong with your CAD session; however, you have reported higher success rates. Put another way, 7% is 7 failures per 100 tries, or 1 failure per (100/7) tries, which is about 1 failure every fourteen attempts to render. According to what you have described, your rendering sessions do not actually fail that often.
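Strictly speaking, 0.07 is an expected error count rather than a probability. A minimal sketch of the conversion, assuming bit flips arrive independently at a constant rate (a Poisson process, which is my assumption and not something the sources state), looks like this. It also assumes every flipped bit counts as a failure, which real workloads probably do not bear out since a flip can land in memory that is never read again.

```python
import math

# Expected errors for one session: 4.5 GB, 12 hours, 1 error per GB per 31-day month
expected = 4.5 * 12 / (31 * 24)


def p_at_least_one(expected_count):
    """Chance of one or more events, given a Poisson-distributed count."""
    return 1.0 - math.exp(-expected_count)


p = p_at_least_one(expected)
sessions_per_failure = 1.0 / p

print(f"P(at least one flip) = {p:.3f}")
print(f"About one affected session in {sessions_per_failure:.1f}")
```

For a count this small the probability and the expected count are nearly identical, so the 7% and 1-in-14 figures hold up under this assumption.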
So perhaps my decision to simply go off the 1 bit per month per Gig statistic is unwarranted, and I should actually put more faith in what real world users like yourself tell me, but something deep down inside me tells me that allowing a memory chip to spontaneously change its value is a bad thing...
[1] - Soft Errors in Electronic Memory - A White Paper by Tezzaron Semiconductor [2004]
http://www.tezzaron.com/about/papers/soft_errors_1_1_secure.pdf
[2] - Dynamic Random Access Memory by Wikipedia [Last Update 5-SEP-2009?]
http://en.wikipedia.org/wiki/Dynamic_random_access_memory#Errors_and_error_correction
[3] - Do I Need ECC and Registered Memory? by Newegg??? [Unknown]
http://images10.newegg.com/UploadFilesForNewegg/itemintelligence/NI_System-Memory/NIC-Pro-Do_I_Need_ECC_and_Registered_Memory-v1.1e.doc
[BTW: I appreciate your info, your usage statistics are helpful]
Ah.... I see what you mean. Under Linux/Windows(?) a flipped file permission bit could cause problems....
I went overboard. If I remember right, a typical Linux kernel and drivers should be less than 1 Meg in size, so it should be pretty immune to soft errors at run time. A more likely problem would be that a bit would be flipped during either compilation of the Linux kernel or compilation of the compiler used to compile the Linux kernel [1]. Another problem is bit errors while X Windows is running or being compiled. My understanding is that X Windows has elevated privileges compared to other system programs.
[1] - Compilation of the GCC compiler is normally done by the sponsors of the distribution; however, if you are crazy like me and want to try Linux From Scratch then you will find yourself compiling GCC more than once...