[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-503:
------------------------------------

    Attachment: ZOOKEEPER-503.patch

This patch fixes a range of problems and is a big simplification: it has a net 
removal of 700 lines of code. The metadata for a ledger was collapsed into a 
single znode. Here is a description of the changes:

Index calculation in QuorumEngine must be synchronized on the LedgerHandle to 
avoid changes to the ensemble while trying to submit an operation. Such changes 
happen upon crashes of bookies.                                                 
                                  

I initially thought it was not necessary, but now I think this 
synchronization block is necessary. 
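
As a rough illustration of why the block matters, here is a minimal sketch. It 
assumes a round-robin index over the ensemble, which this message does not 
confirm. SketchLedgerHandle and SketchQuorumEngine are hypothetical stand-ins, 
not the real LedgerHandle and QuorumEngine classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a ledger handle; not the real BookKeeper class.
class SketchLedgerHandle {
    private final List<String> ensemble = new ArrayList<>();

    SketchLedgerHandle(List<String> bookies) {
        ensemble.addAll(bookies);
    }

    // Called when a bookie crashes; takes the same lock as the index calculation.
    synchronized void removeBookie(String bookie) {
        ensemble.remove(bookie);
    }

    // Callers are expected to hold the lock on this handle.
    int size() {
        return ensemble.size();
    }

    String get(int index) {
        return ensemble.get(index);
    }
}

// Hypothetical stand-in for the engine that submits operations.
class SketchQuorumEngine {
    // The index is computed and used inside one synchronized block, so the
    // ensemble cannot shrink between the two steps.
    static String pickBookie(SketchLedgerHandle lh, long entryId) {
        synchronized (lh) {
            if (lh.size() == 0) {
                throw new IllegalStateException("no bookies left in ensemble");
            }
            int index = (int) (entryId % lh.size()); // assumed round-robin
            return lh.get(index);
        }
    }
}
```

Without the synchronized block, a crash handler could remove a bookie after 
the index is computed but before it is used, leaving the index out of range.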

If a writer adds just a few entries to a ledger, it may end up with hints that 
say "empty ledger" when trying to recover a ledger. In this case, if we receive 
an empty ledger flag as a hint, we have to switch the hint to zero, which means 
that the client will start recovery from entry zero. If no entry has been 
written, it still works as the client won't be able to read anything.           
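
The hint handling above might be sketched as follows. The sentinel value -1 
for the "empty ledger" flag and the names RecoveryHintSketch and 
recoveryStartEntry are assumptions for illustration; the actual flag value is 
not given in this message.

```java
// Hypothetical helper; not part of the patch.
class RecoveryHintSketch {
    // Assumed sentinel meaning "empty ledger"; the real value may differ.
    static final long EMPTY_LEDGER_HINT = -1L;

    // Returns the entry id at which recovery should start reading.
    static long recoveryStartEntry(long hint) {
        // An "empty ledger" hint may be stale if only a few entries were
        // written, so start from entry zero instead of trusting it.
        return hint == EMPTY_LEDGER_HINT ? 0L : hint;
    }
}
```

If the ledger really is empty, starting at entry zero is harmless because the 
first read finds nothing.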
               

I have changed LedgerRecoveryTest to test for: many entries written, one entry 
written, no entry written.

I have been able to identify the problem that was causing BookieFailureTest to 
hang on Utkarsh's computer. Basically, when the queue of a BookieHandle is full 
and the corresponding bookie has crashed, we are not able to add a read 
operation to the incoming queue of the bookie handle because the BookieHandle 
is no longer processing new requests and is waiting to fail the handle. In this 
case, the BookieHandle throws an exception after the call to add the read 
operation to the queue times out. We were propagating this exception to the 
application.

The main problem is that we have to add the operation to the queue of 
ClientCBWorker so that we guarantee that it knows about the operation once we 
receive responses from bookies. If we throw an exception without removing the 
operation from the ClientCBWorker queue, the worker will wait forever, which I 
believe is the case Utkarsh was observing.                                      
                       

If I reasoned about the code correctly, then my modifications fix this problem 
by retrying a few times and erroring out after a number of retries. Erroring 
out in this case means notifying the CBWorker so that we can release the 
operation.                                 
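
A minimal sketch of this retry-then-release pattern, under stated assumptions: 
the bookie handle's incoming queue is modeled as a java.util.concurrent 
BlockingQueue, the worker's bookkeeping as a concurrent set, and the names 
RetrySketch, pending, and submit are hypothetical. The retry count and timeout 
are illustrative, not the values used in the patch.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not the actual BookieHandle or ClientCBWorker code.
class RetrySketch {
    // Stand-in for the operations the callback worker is waiting on.
    static final Set<String> pending = ConcurrentHashMap.newKeySet();

    // Tries to enqueue op for the bookie, giving up after maxRetries timeouts.
    static boolean submit(BlockingQueue<String> bookieQueue, String op,
                          int maxRetries, long timeoutMs) {
        // Register first so the worker knows about op before any response.
        pending.add(op);
        try {
            for (int attempt = 0; attempt < maxRetries; attempt++) {
                if (bookieQueue.offer(op, timeoutMs, TimeUnit.MILLISECONDS)) {
                    return true; // queued; the worker releases op on response
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        // Error out: release op so the worker does not wait forever.
        pending.remove(op);
        return false;
    }
}
```

The key point is that every failure path removes the operation from the 
worker's view instead of only throwing to the application.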

Fixing log level in LedgerConfig. -F

I have mainly worked on the ledger recovery machinery. I made it asynchronous 
by transforming LedgerRecovery into a thread and moving some calls. We have to 
revisit this way of making it asynchronous as it might not be acceptable for 
this patch.

I have yet to check why BookieFailureTest is failing for Utkarsh. It passes 
every time for me, so we have to find a way to reproduce it reliably on my 
machine so that I can debug it.


Took a pass over asynchronous ledger operations: create, open, close. Some 
parts are still blocking; I will work on those next.

> race condition in asynchronous create
> -------------------------------------
>
>                 Key: ZOOKEEPER-503
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-503
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: contrib-bookkeeper
>            Reporter: Benjamin Reed
>         Attachments: ZOOKEEPER-503.patch
>
>
> there is a race condition between the zookeeper completion thread and the 
> bookkeeper processing queue during create. if the zookeeper completion thread 
> falls behind due to scheduling, the action counter of the create operation 
> may go backwards.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
