Hi everyone!

I am writing to this group because we have recently been getting some
strange errors with our production ZooKeeper setup.

From time to time we observe that our client application (C++
based) disconnects from ZooKeeper (session state changes to 1) and
reconnects (state changes to 3).
This in itself is not a problem: the application usually continues to
run without problems after reconnecting.
But occasionally, after the above happens, all subsequent operations
start to return the ZSESSIONMOVED error. To make it work again we have
to restart the application (which creates a new ZooKeeper session).
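
For readers following along, here is a minimal sketch of how the state
and error codes mentioned above could be interpreted on the client side.
It is not our actual application code. The numeric values are assumed to
mirror the constants in the C client's zookeeper.h and are redefined
locally only so the sketch compiles without the library; the names in
the zk_sketch namespace (stateName, recoveryFor, Recovery) are
hypothetical.

```cpp
#include <string>

namespace zk_sketch {

// Assumed to match zookeeper.h; redefined here so the sketch is standalone.
constexpr int kConnectingState     = 1;     // ZOO_CONNECTING_STATE
constexpr int kAssociatingState    = 2;     // ZOO_ASSOCIATING_STATE
constexpr int kConnectedState      = 3;     // ZOO_CONNECTED_STATE
constexpr int kExpiredSessionState = -112;  // ZOO_EXPIRED_SESSION_STATE

constexpr int kOk             = 0;     // ZOK
constexpr int kConnectionLoss = -4;    // ZCONNECTIONLOSS
constexpr int kSessionExpired = -112;  // ZSESSIONEXPIRED
constexpr int kSessionMoved   = -118;  // ZSESSIONMOVED

// Session-level action a client might take after an operation returns.
enum class Recovery { None, RetryLater, RecreateSession };

// Human-readable name for the state value passed to the session watcher.
inline std::string stateName(int state) {
  switch (state) {
    case kConnectingState:     return "CONNECTING";
    case kAssociatingState:    return "ASSOCIATING";
    case kConnectedState:      return "CONNECTED";
    case kExpiredSessionState: return "EXPIRED_SESSION";
    default:                   return "UNKNOWN";
  }
}

// Maps an operation's return code to a session-level recovery action.
// Treating ZSESSIONMOVED like an expired session is an assumption that
// simply mirrors what restarting the application does today.
inline Recovery recoveryFor(int rc) {
  switch (rc) {
    case kConnectionLoss:
      return Recovery::RetryLater;       // the client library reconnects itself
    case kSessionExpired:
    case kSessionMoved:
      return Recovery::RecreateSession;  // close the handle and open a new one
    case kOk:
    default:
      return Recovery::None;             // success or an application-level error
  }
}

}  // namespace zk_sketch
```

With something like this, the application could call zookeeper_close()
followed by zookeeper_init() when recoveryFor() returns RecreateSession,
instead of requiring a full restart. Closing the session also drops its
ephemeral nodes, so any locks held would have to be re-acquired.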

I noticed that 3.2.0 introduced a bug,
http://issues.apache.org/jira/browse/ZOOKEEPER-449, but we are using
ZooKeeper v. 3.2.2.
I have just noticed that the application was compiled against the 3.2.0
library, but the patches fixing bug 449 did not touch the C client
library, so I believe that our problems are not related to it.

In the ZooKeeper logs, at the moment the problem with the client
application started, I have:

node1:
2010-03-16 14:21:43,510 - INFO
[NIOServerCxn.Factory:2181:nioserverc...@607] - Connected to
/10.1.112.61:37197 lastZxid 42992576502
2010-03-16 14:21:43,510 - INFO
[NIOServerCxn.Factory:2181:nioserverc...@636] - Renewing session
0x324dcc1ba580085
2010-03-16 14:21:49,443 - INFO
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:nioserverc...@992] - Finished init
of 0x324dcc1ba580085 valid:true
2010-03-16 14:21:49,443 - WARN
[NIOServerCxn.Factory:2181:nioserverc...@518] - Exception causing
close of session 0x324dcc1ba580085 due to java.io.IOException: Read
error
2010-03-16 14:21:49,444 - INFO
[NIOServerCxn.Factory:2181:nioserverc...@857] - closing
session:0x324dcc1ba580085 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/10.1.112.62:2181
remote=/10.1.112.61:37197]

node2:
2010-03-16 14:21:40,580 - WARN
[NIOServerCxn.Factory:2181:nioserverc...@494] - Exception causing
close of session 0x324dcc1ba580085 due to java.io.IOException: Read
error
2010-03-16 14:21:40,581 - INFO
[NIOServerCxn.Factory:2181:nioserverc...@833] - closing
session:0x324dcc1ba580085 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/10.1.112.63:2181
remote=/10.1.112.61:60693]
2010-03-16 14:21:46,839 - INFO
[NIOServerCxn.Factory:2181:nioserverc...@583] - Connected to
/10.1.112.61:48336 lastZxid 42992576502
2010-03-16 14:21:46,839 - INFO
[NIOServerCxn.Factory:2181:nioserverc...@612] - Renewing session
0x324dcc1ba580085
2010-03-16 14:21:49,439 - INFO
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:nioserverc...@964] - Finished init
of 0x324dcc1ba580085 valid:true

node3:
2010-03-16 02:14:48,961 - WARN
[NIOServerCxn.Factory:2181:nioserverc...@494] - Exception causing
close of session 0x324dcc1ba580085 due to java.io.IOException: Read
error
2010-03-16 02:14:48,962 - INFO
[NIOServerCxn.Factory:2181:nioserverc...@833] - closing
session:0x324dcc1ba580085 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/10.1.112.64:2181
remote=/10.1.112.61:57309]

and then lots of entries like this:
2010-03-16 02:14:54,696 - WARN
[ProcessThread:-1:preprequestproces...@402] - Got exception when
processing sessionid:0x324dcc1ba580085 type:create cxid:0x4b9e9e49
zxid:0xfffffffffffffffe txntype:unknown
/locks/9871253/lock-8589943989-
org.apache.zookeeper.KeeperException$SessionMovedException:
KeeperErrorCode = Session moved
        at 
org.apache.zookeeper.server.SessionTrackerImpl.checkSession(SessionTrackerImpl.java:231)
        at 
org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:211)
        at 
org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)
2010-03-16 14:22:06,428 - WARN
[ProcessThread:-1:preprequestproces...@402] - Got exception when
processing sessionid:0x324dcc1ba580085 type:create cxid:0x4b9f6603
zxid:0xfffffffffffffffe txntype:unknown
/locks/1665960/lock-8589961006-
org.apache.zookeeper.KeeperException$SessionMovedException:
KeeperErrorCode = Session moved
        at 
org.apache.zookeeper.server.SessionTrackerImpl.checkSession(SessionTrackerImpl.java:231)
        at 
org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:211)
        at 
org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)


To work around the disconnections I am going to increase the session
timeout from 5 to 15 seconds, but even if it helps at all, it is just a
workaround.
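
One thing worth checking before relying on the new value: the timeout
passed to zookeeper_init() (recv_timeout, in milliseconds) is only a
request, and the server negotiates the final value. The sketch below
assumes the bounds described in the ZooKeeper documentation, a minimum
of 2 and a maximum of 20 times the server's tickTime. The helper name
negotiatedTimeoutMs is hypothetical, and the tickTime of our ensemble is
not stated here, so the figures in the usage note are illustrative only.

```cpp
#include <algorithm>

// Hypothetical helper: the session timeout a server would grant for a
// requested value, assuming the bounds of 2x and 20x tickTime.
inline int negotiatedTimeoutMs(int requestedMs, int tickTimeMs) {
  const int minMs = 2 * tickTimeMs;   // lower bound enforced by the server
  const int maxMs = 20 * tickTimeMs;  // upper bound enforced by the server
  return std::min(std::max(requestedMs, minMs), maxMs);
}
```

For example, with a tickTime of 2000 ms both 5000 and 15000 fall inside
the allowed range and are granted unchanged, whereas with a tickTime of
3000 ms a request for 5000 would be raised to 6000. The value actually
granted can be read back from the client with zoo_recv_timeout().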

Do you have any idea what the source of my problem is?

Regards, Łukasz Osipiuk



-- 
Łukasz Osipiuk
mailto:luk...@osipiuk.net
