Although I don't know for certain that most such errors are in fact single-bit in nature, I can only surmise that, statistically, they most likely go undetected. With the exception of error-corrected memory systems and checksummed communication channels, every transition of data across a hardware interface at ever-increasing clock rates correspondingly increases the probability of an otherwise undetectable soft single-bit error being injected at that boundary. Although the probability of any one such occurrence is small enough that it is not easily detected or classified as a hardware failure, over the course of days, weeks, or years, and trillions of bits, these errors will be observable and should be expected and, within reason, planned for.
Utilizing a strong error-correcting code in combination with, or in lieu of, a strong hash code would seem like a good way to more strongly warrant that data's representation in memory at the time of its computation is resilient through transmission and subsequent retrieval. I suspect that as technology continues to push clock rates and corresponding data pool sizes ever higher, some form of uniform data-integrity mechanism will need to be incorporated into all the processing and communications interface data paths within systems, in order to improve data's resilience to transmission and processing errors, however statistically small the probability is for any single bit.

> Anton B. Rang wrote:
> > That brings up another interesting idea.
> >
> > ZFS currently uses a 128-bit checksum for blocks of up to 1048576 bits.
> >
> > If 20-odd bits of that were a Hamming code, you'd have something slightly stronger than SECDED, and ZFS could correct any single-bit errors encountered.
>
> Yes. But I'm not convinced that we will see single bit errors, since there is already a large number of single-bit-error detection and (often) correction capability in modern systems. It seems that when we lose a block of data, we lose more than a single bit.
>
> It should be relatively easy to add code to the current protection schemes which will compare a bad block to a reconstructed, good block and deliver this information for us. I'll add an RFE.
> -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
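To make both points concrete, here is a small Python sketch (purely illustrative, not ZFS code; all function names are hypothetical). Diffing a bad block against its reconstructed good copy, as Richard's RFE suggests, tells you whether the damage was a single flipped bit or something larger, and a quick calculation checks the "20-odd bits" SECDED budget Anton mentions for a 1048576-bit block.

```python
# Hypothetical sketch: classify block corruption by diffing against a
# reconstructed good copy, and size a SECDED code for a ZFS-scale block.

def count_bit_flips(bad: bytes, good: bytes) -> int:
    """Number of bit positions where the two equal-length blocks differ."""
    assert len(bad) == len(good)
    return sum(bin(a ^ b).count("1") for a, b in zip(bad, good))

def classify(bad: bytes, good: bytes) -> str:
    """Label the corruption found by comparing bad vs. reconstructed data."""
    flips = count_bit_flips(bad, good)
    if flips == 0:
        return "identical"
    if flips == 1:
        return "single-bit flip"   # the case a SECDED code could correct
    return f"{flips} bits differ"  # larger-scale damage, e.g. a torn write

def secded_parity_bits(data_bits: int) -> int:
    """Parity bits for an extended Hamming (SECDED) code over data_bits."""
    r = 0
    while (1 << r) < data_bits + r + 1:  # Hamming bound: 2^r >= m + r + 1
        r += 1
    return r + 1  # +1 overall parity bit for double-error detection

# For a 1048576-bit (128 KiB) block: 21 Hamming bits + 1 overall parity
# bit = 22, consistent with the "20-odd bits" figure quoted above.
```

This kind of diff-and-count pass is cheap relative to reconstruction itself, since the good copy already had to be rebuilt from redundancy before the comparison can run.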