bob wrote:
> On Wed, 13 Aug 2008, paul wrote:
> 
>>  Shy extremely noisy hardware and/or literal hard failure, most
>>  errors will most likely always be expressed as 1 bit out of some
>>  very large N number of bits.
> 
> This claim ignores the fact that most computers today are still based 
> on synchronously clocked parallel bus hardware.  A common failure mode 
> is clock skew, which causes many bits to be wrong at once.  This can 
> even happen within the CPU.

In my experience, clock skew/drift problems first show up as single-bit
errors, even on parallel interfaces. Although all of the paths are logically
parallel, the physical performance of the individual transistors and traces
making up the data path differs ever so slightly from bit to bit. Physical
CAD layout tools attempt to balance clock trees, but the actual arrival time
of the clock at the latch elements of the data path also differs slightly
(often by as much as a few picoseconds).

Therefore, as a circuit approaches its maximum frequency threshold (which
depends on temperature, age, etc.), a very small number of single-bit errors
will begin to appear, caused by setup/hold time violations on the bit with
the least clock skew tolerance. As the clock frequency and/or temperature
(etc.) increases, more and more bit paths begin to fail, until the whole
path fails.

Since every system has some bits within its parallel paths that are more
sensitive than others to one type of corruption or another, I tend to
believe that single-bit failures will show up statistically earlier, and in
greater number, than multi-bit failures, even while the hardware still
appears to be operable.
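
To make the argument concrete, here is a rough toy model in Python. It is
only a sketch under assumptions of my own: every number in it (a 64-bit bus,
900 ps nominal path delay, +/- 5 ps of per-bit variation, 50 ps setup time)
is made up for illustration, the function names are hypothetical, and only
the setup-time check is modelled, not hold time or real clock-tree behaviour.

```python
import random

def failing_bits(period_ps, path_delays_ps, setup_ps=50.0):
    """Indices of bits whose data arrives too late to meet setup time."""
    return [i for i, delay in enumerate(path_delays_ps)
            if delay + setup_ps > period_ps]

def make_bus(width=64, nominal_ps=900.0, spread_ps=5.0, seed=1):
    """Per-bit path delays: logically parallel, physically slightly different."""
    rng = random.Random(seed)
    return [nominal_ps + rng.uniform(-spread_ps, spread_ps)
            for _ in range(width)]

def sweep(path_delays_ps, start_ps=970.0, stop_ps=940.0, step_ps=0.5):
    """Shrink the clock period (raise the frequency) and count failing bits."""
    results = []
    period = start_ps
    while period >= stop_ps:
        results.append((period, len(failing_bits(period, path_delays_ps))))
        period -= step_ps
    return results

if __name__ == "__main__":
    for period, count in sweep(make_bus()):
        print(f"period {period:6.1f} ps: {count:2d} failing bit(s)")
```

In this model the count of failing bits starts at zero, passes through a
range of clock periods where only the slowest few bits fail, and only
reaches the full bus width once the period is shorter than even the fastest
path can tolerate.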

> As serial interfaces continue to be added to computers, the number of 
> single bit errors (vs multi-bit errors) would tend to increase except 
> for the fact that these serial interfaces are designed to detect and 
> discard erroneous packets.
> 
> I do agree that the logic between the self-validating interfaces can 
> be faulty.
> 
> Bob
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss