[jira] Updated: (ZOOKEEPER-362) Issues with FLENewEpochTest

Flavio Paiva Junqueira (JIRA) Fri, 03 Apr 2009 09:06:38 -0700

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Flavio Paiva Junqueira updated ZOOKEEPER-362:
---------------------------------------------

    Attachment: ZOOKEEPER-362.patch

Thanks, Ben. I've fixed the log calls in this new patch.

> Issues with FLENewEpochTest
> ---------------------------
>
>                 Key: ZOOKEEPER-362
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-362
>             Project: Zookeeper
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Flavio Paiva Junqueira
>             Fix For: 3.2.0
>
>         Attachments: ZOOKEEPER-362.patch, ZOOKEEPER-362.patch
>
>
> I have identified two causes of the FLENewEpochTest failures:
> 1- There is a race condition that is triggered when two peers try to 
> establish a connection to each other for leader election. Basically, if they 
> start at roughly the same time, the server with the highest id will try to 
> open two connections. The two competing connections cause one notification 
> message to be lost. This message happens to be critical for this two-process 
> scenario; 
> 2- The code to shut down a peer does not work well with the unit tests. For 
> this particular unit test, we need to be able to shut down a peer completely 
> to check the situation the test tries to reproduce. However, it seems that in 
> some runs timing causes the other peers to believe it is still alive, and 
> they end up electing it. This peer, however, eventually shuts down and leader 
> election fails.
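
For readers unfamiliar with this kind of race, here is a minimal sketch of one common way to resolve competing connections between two peers: a deterministic tie-break on server id. It is an illustration only and is not taken from the attached patch. The class name ConnectionTieBreak, the method keepConnection, and the rule "keep only the connection opened by the peer with the higher id" are all assumptions made for this example.

```java
/**
 * Hypothetical helper (not part of ZooKeeper or the attached patch).
 * Both peers apply the same rule, so they agree on which of two
 * competing connections survives without exchanging extra messages.
 */
public final class ConnectionTieBreak {
    private ConnectionTieBreak() {}

    /**
     * Assumed rule for this sketch: a connection is kept only if it was
     * opened by the peer with the higher server id.
     *
     * @param initiatorId id of the peer that opened the connection
     * @param acceptorId  id of the peer that accepted it
     * @return true if this connection should be kept, false if it should be closed
     */
    public static boolean keepConnection(long initiatorId, long acceptorId) {
        return initiatorId > acceptorId;
    }

    public static void main(String[] args) {
        long low = 1L;
        long high = 2L;
        // Connection opened by the lower id toward the higher id: dropped.
        System.out.println("low -> high kept: " + keepConnection(low, high));
        // Connection opened by the higher id toward the lower id: kept.
        System.out.println("high -> low kept: " + keepConnection(high, low));
    }
}
```

With a rule like this, exactly one of the two connections is kept whichever peer starts first. Messages already queued on the dropped connection would still need to be re-sent on the surviving one, and the sketch does not cover that.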

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
