Please - need help with fault tolerance issues...

Jim Wissner 18 Mar 2003 03:21:51 -0000


Hello,

I made a post about fault tolerance in BTree and its subclasses, but didn't get any replies. Since then I've done a fair bit of research into these classes.

There are points of vulnerability within the code that make it corruptible in repeatable ways. Granted, it is under stressful circumstances, but for any degree of fault tolerance they must be taken into account.

The two primary scenarios I have been replicating are (1) an IO error due to inadequate space to grow/write to disk (either disk full or quota reached), and (2) abnormal termination mid-write.

Forcing these scenarios typically renders the entire file unreadable, and results in total data loss (not counting manual data rescue).

What I'm trying to do is figure out a strategy for making the files tolerant of such faults, and self-healing. As I said before, I would very much welcome any ideas and hope that someone will voice their opinion, since most of you are much better versed in the code than I am and must have thought about these issues.

Working at the page level, I think pages can be made safe from the above problems by enlarging the file header by one page size plus room for a status byte and an offset. Before each page write, the offset and the existing contents of the target page are copied into that header area, and the status is set back to "ok" only after the page has been written successfully. If a crash occurs, the page can be reconstructed on startup as it was before the failed write. The drawback is slower writes, but that may be acceptable: many applications would happily trade some write speed for fault tolerance. Read speed would be unaffected.
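To make the page-level idea concrete, here is a rough sketch of how I picture it. Everything in it is an assumption for illustration only: the class name SafePageWriter, the 4096-byte page size, the header layout, and the status values are all made up and do not exist in the current BTree/Paged code. It also handles only one page write in flight at a time.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch; none of these names exist in the current BTree/Paged code.
final class SafePageWriter {
    static final int PAGE_SIZE = 4096;      // assumed page size
    static final byte STATUS_OK = 0;        // no page write in flight
    static final byte STATUS_PENDING = 1;   // header holds a valid before-image
    // Assumed header layout: [status: 1 byte][page offset: 8 bytes][before-image: PAGE_SIZE bytes]
    static final int HEADER_SIZE = 1 + 8 + PAGE_SIZE;

    private final RandomAccessFile file;

    SafePageWriter(RandomAccessFile file) throws IOException {
        this.file = file;
        if (file.length() < HEADER_SIZE) {
            // New file: write a zeroed header, which leaves the status at STATUS_OK.
            file.seek(0);
            file.write(new byte[HEADER_SIZE]);
            file.getFD().sync();
        } else {
            recover(); // existing file: undo a write that was interrupted, if any
        }
    }

    // Reads one page; bytes beyond the current end of file are returned as zeros.
    byte[] readPage(long offset) throws IOException {
        byte[] page = new byte[PAGE_SIZE];
        long available = Math.min(PAGE_SIZE, Math.max(0L, file.length() - offset));
        file.seek(offset);
        file.readFully(page, 0, (int) available);
        return page;
    }

    // Copies the target page and its offset into the header, then marks the write as pending.
    void saveBeforeImage(long offset) throws IOException {
        byte[] old = readPage(offset);
        file.seek(1);
        file.writeLong(offset);
        file.write(old);
        file.getFD().sync();          // the backup must be on disk before the status changes
        setStatus(STATUS_PENDING);
    }

    // Writes a page; the status returns to OK only after the new contents are synced.
    void writePage(long offset, byte[] data) throws IOException {
        if (data.length != PAGE_SIZE || offset < HEADER_SIZE) {
            throw new IllegalArgumentException("bad page size or offset");
        }
        saveBeforeImage(offset);
        file.seek(offset);
        file.write(data);
        file.getFD().sync();
        setStatus(STATUS_OK);
    }

    // Restores the saved page if the last write never reached STATUS_OK.
    void recover() throws IOException {
        file.seek(0);
        if (file.readByte() != STATUS_PENDING) {
            return;
        }
        long offset = file.readLong();
        byte[] old = new byte[PAGE_SIZE];
        file.readFully(old);
        file.seek(offset);
        file.write(old);
        file.getFD().sync();
        setStatus(STATUS_OK);
    }

    private void setStatus(byte status) throws IOException {
        file.seek(0);
        file.write(status);
        file.getFD().sync();
    }
}
```

If writePage fails with an IOException (the disk-full case), the status stays pending, so calling recover() right away or on the next startup should put the old page back. The extra read, the extra page-sized write, and the syncs are where the write-speed penalty comes from.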

The question is, what does this mean for tree nodes (and values in the case of BTreeFiler) that may be written across multiple pages? The only solution I have thought of so far is some kind of checksumming, in which a checksum precedes each value so that corruption can be detected on read. Setting aside the obvious performance cost, I'm not even sure it would solve the problem for BTreeNodes that span multiple pages.
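For the checksum idea, something along these lines is what I have in mind. Again this is only a sketch under my own assumptions: ChecksummedValue, wrap and unwrap are hypothetical names, and the choice of CRC32 and the [checksum][length][value] layout are mine, not anything BTreeFiler does today.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical sketch; not part of BTreeFiler.
final class ChecksummedValue {
    private static final int PREFIX = 8 + 4; // CRC32 stored as a long, then the value length

    // Returns [checksum: 8 bytes][length: 4 bytes][value bytes].
    static byte[] wrap(byte[] value) {
        CRC32 crc = new CRC32();
        crc.update(value, 0, value.length);
        ByteBuffer buf = ByteBuffer.allocate(PREFIX + value.length);
        buf.putLong(crc.getValue());
        buf.putInt(value.length);
        buf.put(value);
        return buf.array();
    }

    // Returns the original value, or throws if the stored bytes fail the check.
    static byte[] unwrap(byte[] stored) throws IOException {
        if (stored.length < PREFIX) {
            throw new IOException("record too short");
        }
        ByteBuffer buf = ByteBuffer.wrap(stored);
        long expected = buf.getLong();
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            throw new IOException("corrupt length field");
        }
        byte[] value = new byte[length];
        buf.get(value);
        CRC32 crc = new CRC32();
        crc.update(value, 0, length);
        if (crc.getValue() != expected) {
            throw new IOException("checksum mismatch");
        }
        return value;
    }
}
```

Because the checksum covers the whole value, a write that completed on some pages but not others should show up as a mismatch on read. Note that this only detects the damage; it does nothing to repair it.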

So my question is: is it even *possible* to retain tree integrity by ensuring page-level integrity?

Any answers/opinions/other ideas are GREATLY encouraged!! I need the help of you experts!!

Also - are there any graphical docs illustrating the structure of the core files as they are maintained? (that is, the BTree file and its components - file header/pages/nodes/etc).

Thanks,
Jim


--
[EMAIL PROTECTED]

Visit www.jbrix.org for:
  + SpeedJAVA jEdit Code Completion Plugin
  + Xybrix XML Application Framework
  + other great Open Source Software
