[ https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739609#action_12739609 ]
Patrick Hunt commented on ZOOKEEPER-498:
----------------------------------------
It looks to me like 0 weight is still broken: FLEZeroWeightTest is actually
failing on my machine, even though it is reported as a success:
------------- Standard Error -----------------
Exception in thread "Thread-108" junit.framework.AssertionFailedError: Elected zero-weight server
        at junit.framework.Assert.fail(Assert.java:47)
        at org.apache.zookeeper.test.FLEZeroWeightTest$LEThread.run(FLEZeroWeightTest.java:138)
------------- ---------------- ---------------
This is probably because the test calls assert in a thread other than the main
test thread, which JUnit will not track or know about.
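
To illustrate the point, here is a minimal sketch of one way to get worker-thread failures back to the main test thread. It is not taken from the ZooKeeper test code: the class name ThreadFailureCapture and the method runAndCollect are hypothetical, and it uses only the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper, not part of the ZooKeeper tests.
final class ThreadFailureCapture {

    /**
     * Runs each task in its own thread, waits for all of them, and returns
     * the first Throwable any task threw (or null if none failed).
     * A thread still alive after joinTimeoutMs is also recorded as a failure.
     * Note: a joinTimeoutMs of 0 means "wait forever" (Thread.join semantics).
     */
    static Throwable runAndCollect(List<Runnable> tasks, long joinTimeoutMs)
            throws InterruptedException {
        final AtomicReference<Throwable> firstFailure =
                new AtomicReference<Throwable>();
        List<Thread> threads = new ArrayList<Thread>();

        for (final Runnable task : tasks) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    try {
                        task.run();
                    } catch (Throwable e) {
                        // Record the failure instead of letting it die with
                        // the thread, where the test runner never sees it.
                        firstFailure.compareAndSet(null, e);
                    }
                }
            });
            threads.add(t);
            t.start();
        }

        for (Thread t : threads) {
            t.join(joinTimeoutMs);
            if (t.isAlive()) {
                firstFailure.compareAndSet(null,
                        new AssertionError("Thread did not finish: " + t.getName()));
            }
        }
        return firstFailure.get();
    }
}
```

In a test method the main thread would call runAndCollect and fail if the result is non-null, so that an "Elected zero-weight server" failure raised inside a worker thread actually fails the test.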
One problem I see with these tests (at least the 0 weight test I looked at):
they don't have a client attempt to connect to the various servers as part of
declaring success. Really we should only consider a test successful (i.e.
assert that) if a client can connect to each server in the cluster and make
and see changes. As part of fixing this we really need to do a sanity check by
testing the various command lines and checking that a client can connect.
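
As a rough sketch of a first-level sanity check, the following sends a ZooKeeper four-letter command such as "ruok" to a server's client port and returns the reply (a running server answers "imok"). This is a weaker probe than the check described above: it only shows that the server process is answering on its client port, not that a client session can be established or that changes are visible. The class name FourLetterProbe and its methods are hypothetical, and only JDK classes are used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of ZooKeeper.
final class FourLetterProbe {

    /** Sends a four-letter command and returns everything the server replies. */
    static String send(String host, int port, String cmd, int timeoutMs)
            throws IOException {
        Socket sock = new Socket();
        try {
            sock.connect(new InetSocketAddress(host, port), timeoutMs);
            sock.setSoTimeout(timeoutMs);

            OutputStream out = sock.getOutputStream();
            out.write(cmd.getBytes("US-ASCII"));
            out.flush();
            // Half-close so the server sees the end of the request.
            sock.shutdownOutput();

            // Read until the server closes the connection.
            InputStream in = sock.getInputStream();
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return buf.toString("US-ASCII");
        } finally {
            sock.close();
        }
    }

    /** Returns true if the server answers "ruok" with "imok". */
    static boolean isOk(String host, int port, int timeoutMs) {
        try {
            return "imok".equals(send(host, port, "ruok", timeoutMs).trim());
        } catch (IOException e) {
            // Connection refused, timeout, etc. all count as "not ok".
            return false;
        }
    }
}
```

A test could call isOk for every server in the quorum before going on to the stronger check of connecting a real client to each server and writing and reading a node.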
I'm not even sure FLENewEpochTest, FLETest, etc. are passing either. The new
epoch test seems to just thrash...
Also, I tried 3- and 5-server quorums by hand from the command line with 0
weight, and they show issues similar to what Todd is seeing.
This is happening for me on both the trunk and the 3.2 branch source.
> Unending Leader Elections : WAN configuration
> ---------------------------------------------
>
> Key: ZOOKEEPER-498
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498
> Project: Zookeeper
> Issue Type: Bug
> Components: leaderElection
> Affects Versions: 3.2.0
> Environment: Each machine:
> CentOS 5.2 64-bit
> 2GB ram
> java version "1.6.0_13"
> Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
> Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed
> Network Topology:
> DC : central data center
> POD(N): remote data center
> Zookeeper Topology:
> Leaders may be elected only in DC (weight = 1)
> Only followers are elected in PODS (weight = 0)
> Reporter: Todd Greenwood-Geer
> Assignee: Patrick Hunt
> Priority: Critical
> Fix For: 3.2.1, 3.3.0
>
> Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, zoo.cfg
>
>
> In a WAN configuration, ZooKeeper is endlessly electing, terminating, and
> re-electing a ZooKeeper leader. The WAN configuration involves two groups, a
> central DC group of ZK servers that have a voting weight = 1, and a group of
> servers in remote pods with a voting weight of 0.
> What we expect to see is leaders elected only in the DC, and the pods to
> contain only followers. What we are seeing is a continuous cycling of
> leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended
> patches (473, 479, 481, 491), and now release 3.2.1.