Hi. I'm trying to upgrade a zookeeper cluster from 3.2.1 to 3.3.0, and having problems. I can't get a 3.3.0 node to successfully join the cluster and stay joined.
If I run zkServer.sh status immediately after starting up the newly upgraded node, it says the service is probably not running, and shows me this:

[char...@test-zookeeper001 zookeeper-current]$ bin/zkServer.sh status
JMX enabled by default
Using config: /services/zookeeper/zookeeper-20100412.1/bin/../conf/zoo.cfg
2010-04-14 22:47:35,574 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioservercnxn$fact...@251] - Accepted socket connection from /127.0.0.1:40287
2010-04-14 22:47:35,576 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@968] - Processing stat command from /127.0.0.1:40287
2010-04-14 22:47:35,577 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
2010-04-14 22:47:35,578 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@1286] - Closed socket connection for client /127.0.0.1:40287 (no session established for client)
Error contacting service. It is probably not running.
[char...@test-zookeeper001 zookeeper-current]$ 2010-04-14 22:47:35,580 - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@1310] - ignoring exception during input shutdown
java.net.SocketException: Transport endpoint is not connected
    at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
    at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
    at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
    at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1306)
    at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263)
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609)
    at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262)

If I connect with zkCli.sh, I can list the contents of zookeeper.
If I make changes to the schema on either of the other two nodes, test-zookeeper002 and test-zookeeper003, both of which are running 3.2.1, the changes are reflected on test-zookeeper001, which is running 3.3.0. When I exit zkCli.sh, however, zkServer.sh status starts flapping between "Error contacting service. It is probably not running." and "Mode: follower", as you can see below.

Any ideas? I'd really rather not have to take the production zookeeper cluster down to upgrade if it's not necessary.

Thanks,
Charity.

[char...@test-zookeeper001 zookeeper-current]$ bin/zkServer.sh status
JMX enabled by default
Using config: /services/zookeeper/zookeeper-20100412.1/bin/../conf/zoo.cfg
2010-04-14 22:53:16,848 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioservercnxn$fact...@251] - Accepted socket connection from /127.0.0.1:55284
2010-04-14 22:53:16,849 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@968] - Processing stat command from /127.0.0.1:55284
2010-04-14 22:53:16,849 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
2010-04-14 22:53:16,850 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@1286] - Closed socket connection for client /127.0.0.1:55284 (no session established for client)
Error contacting service. It is probably not running.
2010-04-14 22:53:16,850 - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@1310] - ignoring exception during input shutdown
java.net.SocketException: Transport endpoint is not connected
    at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
    at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
    at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
    at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1306)
    at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263)
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609)
    at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262)
[char...@test-zookeeper001 zookeeper-current]$ bin/zkServer.sh status
JMX enabled by default
Using config: /services/zookeeper/zookeeper-20100412.1/bin/../conf/zoo.cfg
2010-04-14 22:53:18,908 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioservercnxn$fact...@251] - Accepted socket connection from /127.0.0.1:55285
2010-04-14 22:53:18,909 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@968] - Processing stat command from /127.0.0.1:55285
2010-04-14 22:53:18,909 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
2010-04-14 22:53:18,910 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@1286] - Closed socket connection for client /127.0.0.1:55285 (no session established for client)
2010-04-14 22:53:18,910 - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@1310] - ignoring exception during input shutdown
java.net.SocketException: Transport endpoint is not connected
    at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
    at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
    at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
    at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1306)
    at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263)
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609)
    at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262)
2010-04-14 22:53:18,911 - ERROR [Thread-13:nioservercnxn$factor...@82] - Thread Thread[Thread-13,5,main] died
java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:64)
    at org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.wakeup(NIOServerCnxn.java:927)
    at org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.checkFlush(NIOServerCnxn.java:909)
    at org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.flush(NIOServerCnxn.java:945)
    at java.io.BufferedWriter.flush(BufferedWriter.java:236)
    at java.io.PrintWriter.flush(PrintWriter.java:276)
    at org.apache.zookeeper.server.NIOServerCnxn$2.run(NIOServerCnxn.java:1089)
Mode: follower
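
For anyone trying to reproduce this without going through zkServer.sh, here is a rough sketch of asking the node for its mode directly. It assumes the status check amounts to sending the four-letter "stat" command to the client port, which is what the "Processing stat command" log lines suggest, and that the reply contains a "Mode:" line like the one printed above. The function names, the 127.0.0.1:2181 address (taken from the log output) and the timeout are illustrative choices, not anything from the ZooKeeper scripts.

```python
import socket


def four_letter_word(host, port, word="stat", timeout=5.0):
    """Send a four-letter command to a ZooKeeper client port and return the reply.

    The socket is kept fully open until the server closes its side, so the
    server does not see end-of-stream from the client before it has answered.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(word.encode("ascii"))
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(reply):
    """Return the value of the 'Mode:' line in a stat reply, or None if absent."""
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


if __name__ == "__main__":
    # Host and port are assumptions taken from the log output above.
    reply = four_letter_word("127.0.0.1", 2181)
    print(parse_mode(reply) or "no Mode line in reply")
```

Running this in a loop against the upgraded node should show whether the flapping also happens when the client waits for the full reply, or only when going through the status script.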