Hi Nick,

Your assessment sounds correct; the issue seems to be caused by the bug described in ZOOKEEPER-427. Can you upgrade to a newer release? Killing the leader should do it, but the bug will still be there, so I recommend upgrading.

Thanks,
-Flavio

On Jan 12, 2010, at 10:52 PM, Nick Bailey wrote:

We are running ZooKeeper 3.1.0.

Recently we noticed the CPU usage on our machines becoming increasingly high, and we believe the cause is:

https://issues.apache.org/jira/browse/ZOOKEEPER-427

However, our solution when we noticed the problem was to kill the ZooKeeper process and restart it.

After doing that, though, it looks like the newly restarted ZooKeeper server is continually attempting to elect a leader even though one already exists.

The process responds with 'imok' when sent 'ruok', but the stat command returns 'ZooKeeperServer not running'.
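
For reference, the two checks above can be scripted. This is only a minimal sketch in Python: the function names `four_letter_word` and `describe_server` are made up for illustration, and matching on the substring 'not running' is an assumption based on the message quoted above (other releases may word it differently).

```python
import socket


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a four-letter command (e.g. 'ruok', 'stat') and return the reply.

    The server is expected to close the connection after replying,
    so we read until end of stream.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


def describe_server(ruok_reply, stat_reply):
    """Classify a server from its 'ruok' and 'stat' replies (hypothetical helper)."""
    if ruok_reply.strip() != "imok":
        return "process not healthy"
    # Assumption: a server stuck in election reports '... not running' to stat.
    if "not running" in stat_reply:
        return "process up, but not serving"
    return "serving"
```

With the replies described above, `describe_server("imok", "ZooKeeperServer not running")` gives "process up, but not serving". Against a live server the replies would come from something like `four_letter_word(host, 2181, "ruok")`, where 2181 is the default client port and `host` is whichever machine is being checked.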

I believe that killing the current leader should trigger all servers to hold an election and solve the problem, but I'm not sure. Should that be the course of action in this situation?

Also, we have 12 servers, but 5 are currently not running according to stat. So I guess this isn't a problem unless we lose another one. We have plans to upgrade ZooKeeper to solve the CPU issue but haven't been able to do that yet.
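
The arithmetic behind "unless we lose another one" can be spelled out as a small sketch, assuming the default majority quorum (more than half of the ensemble must be up). The helper names are made up for illustration.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def spare_servers(ensemble_size, servers_down):
    """How many more servers can fail before the quorum is lost."""
    servers_up = ensemble_size - servers_down
    return servers_up - quorum_size(ensemble_size)


# 12 servers need 7 for a majority; with 5 down, exactly 7 remain.
print(quorum_size(12))       # 7
print(spare_servers(12, 5))  # 0, so one more failure loses the quorum
```

Note that under the same assumption a 12-server ensemble tolerates no more failures than an 11-server one (5 in both cases), which is why odd ensemble sizes are usually preferred.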

Any help appreciated,
Nick Bailey

