I will take a look at all of the above and report back to you. I have
Friday off so should have some time to work on it this weekend.
Todd
Vadim Gritsenko wrote:
Todd Byrne wrote:
Well, this was an EOF exception, so the database file was truncated. Is
there a way to recover the non-corrupted documents?
I was thinking a bit about it.
It is relatively simple to add a basic sanity check to the
filer.open() method to see whether the file was properly closed: the
length of the file should be exactly header + pageCount * pageSize. So
it is easy to catch the EOF condition.
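A minimal sketch of that length check, not actual filer code. The class
name, method names, and the idea of passing the header size, page count,
and page size in as plain numbers are assumptions made for illustration;
the real filer would read them from its own file header.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper; names are illustrative, not part of Xindice.
final class FileLengthCheck {

    /** Length a cleanly closed file should have: header plus all pages. */
    static long expectedLength(long headerSize, long pageCount, long pageSize) {
        return headerSize + pageCount * pageSize;
    }

    /** Returns true only if the file on disk has exactly the expected length. */
    static boolean hasExpectedLength(File file, long headerSize,
                                     long pageCount, long pageSize) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            return raf.length() == expectedLength(headerSize, pageCount, pageSize);
        }
    }
}
```

A false result from hasExpectedLength() would flag a truncated (or
otherwise mis-sized) file at open time, before any page read fails with
an EOF.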
In addition to this check, the filer could probably write an additional
byte into the header to indicate a clean shutdown. It can write 1 when
the file is opened, and 0 right before it is closed. So the next time
you open it, if you see '1' in there, it means the database was not
properly shut down last time.
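Roughly, the flag handling could look like the sketch below. The class,
the method names, and the flag offset (byte 0 of the file) are
hypothetical; where the byte would actually live in the header is an
open question.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper; the flag offset and names are assumptions.
final class ShutdownFlag {
    static final long FLAG_OFFSET = 0L; // assumed position of the flag byte
    static final byte CLEAN = 0;        // written right before the file is closed
    static final byte OPEN = 1;         // written when the file is opened

    /**
     * Reads the current flag, then marks the file as open.
     * Returns true if the previous session never marked the file closed.
     */
    static boolean markOpen(RandomAccessFile raf) throws IOException {
        boolean wasDirty = false;
        if (raf.length() > FLAG_OFFSET) {
            raf.seek(FLAG_OFFSET);
            wasDirty = raf.readByte() == OPEN;
        }
        raf.seek(FLAG_OFFSET);
        raf.writeByte(OPEN);
        raf.getFD().sync(); // push the flag to disk before doing real work
        return wasDirty;
    }

    /** Marks the file as cleanly shut down; call right before closing it. */
    static void markClosed(RandomAccessFile raf) throws IOException {
        raf.seek(FLAG_OFFSET);
        raf.writeByte(CLEAN);
        raf.getFD().sync();
    }
}
```

The sync() calls are a design choice in this sketch: without them the
flag could still be sitting in an OS buffer when the process dies, which
would defeat the purpose.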
Once you know the file was shut down incorrectly, we could try to
recover data from it (starting with taking a backup, probably?)
For the hash filer, it can walk through the main table and collect all
non-empty records, and recover collision chains. The BTree filer can
also be traversed. So most of the information should be retrievable.
Some, of course, could be lost.
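As a very rough illustration of the "walk the table and keep what is
readable" idea, here is a sketch over a made-up page layout: fixed-size
pages after the header, where the first byte of each page is a status
byte and 0 means unused. That layout, the class, and the method are all
assumptions, not the real hash filer format, and following collision
chains is left out entirely.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

// Hypothetical salvage pass over an assumed page layout.
final class PageSalvage {
    static final byte UNUSED = 0; // assumed status value of an empty page

    /**
     * Collects every complete, non-empty page after the header.
     * A partial page at the end of a truncated file is ignored.
     */
    static List<byte[]> salvage(RandomAccessFile raf, long headerSize,
                                int pageSize) throws IOException {
        List<byte[]> recovered = new ArrayList<>();
        long offset = headerSize;
        // Only read pages that fit entirely within the file.
        while (offset + pageSize <= raf.length()) {
            byte[] page = new byte[pageSize];
            raf.seek(offset);
            raf.readFully(page);
            if (page[0] != UNUSED) {
                recovered.add(page);
            }
            offset += pageSize;
        }
        return recovered;
    }
}
```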
Let me know if you want to implement any of the above.
Vadim
Simple scan through the collection until the error occurs again?
I am going to work on some changes to propagate the IOException into the
other methods.
Todd
Vadim Gritsenko wrote:
Todd Byrne wrote:
In getBTreeNode(long page, BTreeNode parent)(line ), if an exception gets
thrown for whatever reason, it just gets ignored and null is returned.
This seems problematic because nothing that calls this method checks for
null. Worse, getChildNode(int idx) (line 534) doesn't check either; it is
called in about 11 places, and none of those callers check for null.
My hunch is to let the exception be thrown; then we will get meaningful
stack traces instead of NPEs later that send us searching through the
log files. If an exception like PageNotFound were thrown, we could catch
it in the result code and gracefully skip the document.
Thoughts? Ideas?
The only exception which should be thrown from this method is
IOException, as far as I can see. And I agree, I think it is better to
get an IOException rather than a rather pointless NPE.
So for a quick fix, I'd try and change method signature(s) to include
IOException and pass it up instead of eating it.
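To make the difference concrete, here is a small self-contained sketch
contrasting the two styles. NodeLoader and PageSource are hypothetical
stand-ins, not the real BTree classes; only the pattern (return null
versus declare and rethrow IOException) is the point.

```java
import java.io.IOException;

// Hypothetical stand-in for the node-loading code; not the real BTree class.
final class NodeLoader {

    /** Assumed abstraction over whatever reads a page from disk. */
    interface PageSource {
        byte[] readPage(long page) throws IOException;
    }

    private final PageSource source;

    NodeLoader(PageSource source) {
        this.source = source;
    }

    /** Current style: the exception is eaten and callers get null. */
    byte[] getNodeSwallowing(long page) {
        try {
            return source.readPage(page);
        } catch (IOException e) {
            return null; // the cause is lost; an NPE shows up somewhere later
        }
    }

    /** Proposed style: the signature declares IOException and passes it up. */
    byte[] getNode(long page) throws IOException {
        return source.readPage(page);
    }
}
```

With the second form the compiler forces every caller either to handle
the IOException or to declare it, so the failure cannot silently turn
into a null.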
It does bring up the question of how Xindice should handle exceptional
situations like this. If it is at all possible, it should recover.
E.g., it is better to be able to extract some documents from a damaged
collection than none at all.
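A sketch of what "extract some documents" could mean once IOException
reaches the caller: read each document independently, keep the ones that
load, and record the ones that fail. TolerantScan and DocumentReader are
hypothetical names for illustration, not existing Xindice API.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper showing per-document error handling.
final class TolerantScan {

    /** Assumed abstraction over reading one document by key. */
    interface DocumentReader {
        String read(String key) throws IOException;
    }

    /**
     * Reads every key; documents that fail with IOException are skipped
     * and their keys are appended to 'failed' instead of aborting the scan.
     */
    static Map<String, String> readAll(List<String> keys, DocumentReader reader,
                                       List<String> failed) {
        Map<String, String> recovered = new LinkedHashMap<>();
        for (String key : keys) {
            try {
                recovered.put(key, reader.read(key));
            } catch (IOException e) {
                failed.add(key); // damaged document: note it and move on
            }
        }
        return recovered;
    }
}
```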
Vadim