Anyone really know how ECC ram works?

Guest
Does ECC really CORRECT data, or does it only know whether the data it has is corrupt? Does it actually provide greater stability, or just error reporting?

Is the speed difference only 1 or 2%?

Can you disable ECC mode when you don't want to use it, to get that speed back?

How often does a memory error occur per month with normal memory?
 

bum_jcrules

Distinguished
May 12, 2001
2,186
0
19,780
"Does ECC really CORRECT data, or does it only know whether the data it has is corrupt? Does it actually provide greater stability, or just error reporting?"

The answer is both. It can correct single-bit errors and will report errors larger than a single bit.

Answer to the stability issue: yes, it is more stable, as the name implies. Single-bit errors are corrected instead of being passed along to the system.

ECC
(Error Correction Code) - A method of checking the integrity of data in DRAM. ECC provides more elaborate error detection than parity; ECC can detect multiple-bit errors and can locate and correct single-bit errors.
<A HREF="http://www.kingston.com/tools/bits/bit13.asp" target="_new">- Taken Directly from Kingston's Website -</A>

THE ULTIMATE MEMORY GUIDE KINGSTON TECHNOLOGY DIFFERENT KINDS OF MEMORY

ERROR CHECKING

Ensuring the integrity of data stored in memory is an important aspect of memory design. Two primary means of accomplishing this are parity and error correction code (ECC).
Historically, parity has been the most commonly used data integrity method. Parity can detect – but not correct – single-bit errors. Error Correction Code (ECC) is a more comprehensive method of data integrity checking that can detect and correct single-bit errors.
Fewer and fewer PC manufacturers are supporting data integrity checking in their designs. This is due to a couple of factors. First, by eliminating support for parity memory, which is more expensive than standard memory, manufacturers can lower the price of their computers. Fortunately, this trend is complemented by the second factor: that is, the increased quality of memory components available from certain manufacturers and, as a result, the relative infrequency of memory errors.
The type of data integrity checking depends on how a given computer system will be used. If the computer is to play a critical role – as a server, for example – then a computer that supports data integrity checking is an ideal choice. In general:
• Most computers designed for use as high-end servers support ECC memory.
• Most low-cost computers designed for use at home or for small businesses support non-parity memory.
<A HREF="http://www.kingston.com/tools/umg/default.asp" target="_new">- This was also taken right off of Kingston's Website -</A>

The memory is just memory. In general, memory doesn't do parity or ECC; that is done by memory control logic on the motherboard, in the chip set, or (as in the 21066) in the CPU.

The memory only has to store the parity or ECC bits, just as it stores the data bits.

Parity is implemented on most PCs with one parity bit per byte. For a 32-bit word size there are four parity bits, for a total of 36 bits that have to be stored in the memory. On most Pentium and Pentium Pro systems, and a few 486 systems, there is a 64-bit wide memory data path, so there are eight parity bits, for a total of 72 bits.

When a word is written into memory, each parity bit is generated from the data bits of the byte it is associated with. This is done by a tree of exclusive-or gates. When the word is read back from the memory, the same parity computation is done on the data bits read from the memory, and the result is compared to the parity bits that were read. Any computed parity bit that doesn't match the stored parity bit indicates that there was at least one error in that byte (or in the parity bit itself). However, parity can only detect an odd number of errors. If an even number of errors occur, the computed parity will match the read parity, so the error will go undetected. Since memory errors are rare if the system is operating correctly, the vast majority of errors will be single-bit errors, and will be detected.

Unfortunately, while parity allows for the detection of single bit errors, it does not provide a means of determining which bit is in error, which would be necessary to correct the error. This is why parity is only an Error Detection Code (EDC).

ECC is an extension of the parity concept. ECC is usually performed only on complete words, rather than individual bytes. In a typical ECC system with a 64-bit data word, there would be 8 ECC bits (7 for single-error correction, plus 1 more for double-error detection). Each ECC bit is calculated as the parity of a different subset of the data bits. The key to the power of ECC is that each data bit contributes to more than one ECC bit. By making careful choices as to which data bits contribute to which ECC bits, it becomes possible to not just detect a single-bit error, but actually identify which bit is in error (even if it is one of the ECC bits). In fact, the code is usually designed so that single-bit errors can be corrected, and double-bit errors can be detected (but not corrected), hence the term Single Error Correction with Double Error Detection (SECDED).

When a word is written into ECC-protected memory, the ECC bits are computed by a set of exclusive-or trees. When the word is read back, the exclusive-OR trees use the data read from the memory to recompute the ECC. The recomputed ECC is compared to the ECC bits read from the memory. Any discrepancy indicates an error. By looking at which ECC bits don't match, it is possible to identify which data or ECC bit is in error, or whether a double-bit error occurred. In practice this comparison is done by an exclusive-or of the read and recomputed ECC bits. The result of this exclusive-or is called the syndrome. If the syndrome is zero, no error occurred. If the syndrome is non-zero, it can be used to index a table to determine which bits are in error, or that the error is uncorrectable. This table lookup stage is implemented in hardware in some systems, and via an interrupt, trap, or exception in others. In the latter case, the system software is responsible for correcting the error if possible. On the Alpha this is one of the functions of PALcode.
<A HREF="http://www.brouhaha.com/~eric/computers/ecc.html" target="_new">- By Eric Smith -</A> See his stats<A HREF="http://www.catsdogs.com/wwwesmith.html" target="_new"> here.</A>
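
To make the syndrome idea concrete, here is a minimal Python sketch of a SECDED code. It is a textbook extended Hamming layout, which is an assumption on my part: real memory controllers may use a different bit assignment. The function names `encode` and `decode` are made up for the example. The syndrome is computed directly as the XOR of the positions of all set bits, which gives the same result as XORing the stored check bits with the recomputed ones.

```python
def encode(data):
    """Build a SECDED codeword from a list of 0/1 data bits.

    Layout (illustrative): index 0 holds an overall parity bit,
    power-of-two indices hold Hamming check bits, and every other
    index holds a data bit.
    """
    m = len(data)
    r = 0
    while (1 << r) < m + r + 1:      # smallest r with 2**r >= m + r + 1
        r += 1
    n = m + r
    code = [0] * (n + 1)
    bits = iter(data)
    for pos in range(1, n + 1):
        if pos & (pos - 1):          # not a power of two: data position
            code[pos] = next(bits)
    for i in range(r):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):  # parity over positions with bit i set
            if pos & p:
                parity ^= code[pos]
        code[p] = parity
    code[0] = sum(code) % 2          # makes the total number of 1s even
    return code


def decode(code):
    """Return (status, data); data is None if the error is uncorrectable."""
    n = len(code) - 1
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos          # XOR of the positions of all set bits
    overall = sum(code) % 2
    fixed = list(code)
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome <= n:
        fixed[syndrome] ^= 1         # syndrome 0 means the overall bit itself
        status = "corrected"
    else:
        return "uncorrectable", None
    data = [fixed[pos] for pos in range(1, n + 1) if pos & (pos - 1)]
    return status, data


word = [int(b) for b in format(0xDEADBEEFCAFEF00D, "064b")]
codeword = encode(word)
print(len(codeword))                 # 64 data bits + 8 check bits

one_flip = list(codeword)
one_flip[37] ^= 1
print(decode(one_flip)[0])

two_flips = list(codeword)
two_flips[3] ^= 1
two_flips[40] ^= 1
print(decode(two_flips)[0])
```

A single flipped bit anywhere in the codeword, including in a check bit, comes back as "corrected" with the original data. Two flipped bits leave the overall parity even but the syndrome non-zero, which is how the double error is detected without being correctable.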

I still do not have any statistics comparing ECC, parity, and non-ECC non-parity memory. That order is slowest to fastest, but the difference is hard to notice, just like 219 fps vs 220 fps. Humans cannot perceive a difference of a fraction of a nanosecond.

Any more questions?

<b>"The events of my life are quite inconsequential.." - Dr. Evil</b> :lol:
 
Guest
Yeah, two more. Is there a difference between Parity and Non-ECC?

And the ECS K7VTA3 doesn't support ECC RAM, does it?
 

FatBurger

Illustrious
Except that parity RAM (remember the EDO days) cannot fix the bad bit, just inform your computer that it is incorrect. ECC actually fixes it.

Quarter Pounder Inside
 

eden

Champion
Just what are the advantages of EDO RAM considering 64MB EDO costs even more than a 64MB RDRAM chip?

--
The other day I heard an explosion from the other side of town.... It was a 486 booting up...
 

bum_jcrules

Distinguished
Actually, parity is parity and ECC is ECC. Parity is only an odd/even type of error checking. It adds a check bit computed from the incoming data. Say the data is 00011 (shortened here for illustration; a real byte has eight bits). Add up the bits: 0+0+0+1+1=2. That count is even, so the parity bit is set to 1, which makes the total number of ones 3. The checker always looks for an odd total. If a single bit flips, the data has an odd number of ones, and with the parity bit of 1 the total is even; the checker knows this is an error and reports it. If there are multiple errors, parity might not notice. With an even number of errors, say two errors that turn the data into 00000, the data still adds up to an even number, and with the parity bit of 1 the total is odd, which is exactly what the checker is looking for. The logic circuit (a.k.a. parity generator/checker) reading the string cannot see the error, because the result makes sense to it. So that error will go through and most likely cause a problem.
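
The same odd-parity example can be sketched in a few lines of Python. The function names are made up for illustration, and the bit strings are the 5-bit values used above.

```python
def odd_parity_bit(data: str) -> int:
    """Check bit that makes the total number of 1s (data + check bit) odd."""
    return 0 if data.count("1") % 2 == 1 else 1


def passes_check(data: str, parity: int) -> bool:
    """True when data plus its parity bit contain an odd number of 1s."""
    return (data.count("1") + parity) % 2 == 1


stored = "00011"
parity = odd_parity_bit(stored)        # two 1s in the data, so the bit is 1

print(passes_check("00011", parity))   # True: read back unchanged
print(passes_check("00111", parity))   # False: one flipped bit is caught
print(passes_check("00000", parity))   # True: two flipped bits slip through
```

The last line is the weakness: the checker sees an odd total and has no way to tell that the data changed.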

ECC actually verifies the correctness of each word, or "packet of data." ECC uses 7 check bits to protect a 32-bit word and 8 check bits for a 64-bit word. (Hence there is better accuracy.) It can find a single wrong bit and correct it, because the pattern of check bits that fail identifies the position of the flipped bit. But it cannot fix more than a single error in a word. If there are two errors it can only report them, and with more than two, detection is no longer guaranteed. Multiple errors in a word are rare but they do happen, whereas single-bit errors are more common.
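
Where the 7 and 8 come from: assuming a Hamming-style SECDED code, single-error correction needs the smallest r with 2^r >= m + r + 1 for m data bits, and double-error detection adds one more bit. A quick sketch (the function name is just for illustration):

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits for a Hamming-style SECDED code over data_bits data bits."""
    r = 0
    while (1 << r) < data_bits + r + 1:   # single-error correction bound
        r += 1
    return r + 1                          # one extra bit for double-error detection


for width in (8, 16, 32, 64):
    print(width, secded_check_bits(width))
# 8 5
# 16 6
# 32 7
# 64 8
```

Note that the 8 check bits for a 64-bit word are the same number of extra bits as one parity bit per byte, so both schemes fit in a 72-bit wide module.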

<A HREF="http://www.pcguide.com/ref/ram/err_Parity.htm" target="_new">See this link for more details on how ECC and Parity work.</A>

 

bum_jcrules

Distinguished
There are three types.

1. Non-Parity Non-ECC Memory
2. Parity Non-ECC Memory
3. Parity ECC Memory

I don't think that anyone manufactures the second type on my list. It just doesn't make sense anymore. Either you want to check and correct for errors or you don't. Why buy something that can't correct errors but might detect one?

Someone please correct me if I am in error here. But I think that one was able to buy Parity Non-ECC in the past. Fats, do you remember that?

As per your second question...

That board uses the Via Technologies KT266A Chipset with the VT8366A Northbridge. Both Non-Parity and ECC are supported by the ECS K7VTA3. <A HREF="http://www.ecs.com.tw/products/k7vta3_2x.htm" target="_new">See here for specs on that board.</A>

"In addition to offering the fastest DDR memory controller available, the VT8366A is also extremely flexible. Up to 4GB of DDR200 or DDR266 is supported, including ECC and Registered modules." <A HREF="http://www.via.com.tw/en/apollo/KT266A_WhitePaper.pdf" target="_new">- VIA Apollo KT266A Chipset White Paper, Page number 5 -</A>

 

bum_jcrules

Distinguished
Not really. Normally Registered memory is buffered by one clock cycle to reduce the load on the memory controller.

"Buffered Memory
A buffer isolates the memory from the controller to minimize the load on the chip set. It is typically used when the system has a high density of memory and/or when a system has more than 3 memory module sockets." <A HREF="http://www.crucial.com/library/glossary.asp" target="_new">- From Crucial's Memory Terms Glossary -</A>

So in essence you can have ECC unbuffered and ECC buffered/registered modules. See <A HREF="http://www.crucial.com/store/listmodule.asp?module=SDRAM,+PC100&Package=168-pin+DIMM" target="_new">here</A> for how Crucial lists it on their website. Notice the unbuffered ECC memory.

<P ID="edit"><FONT SIZE=-1><EM>Edited by Bum_JCRules on 01/09/02 11:23 AM.</EM></FONT></P>