Patrick Hunt commented on ZOOKEEPER-344:

Bryan, that's good info. It doesn't sound like zk server latency is the issue
then: based on the tests you are running you have an excess of cpu/memory.
It would still be good to verify that using jmx or the stat command.
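
For reference, the stat command is one of the server's four letter words: you
send the text "stat" to the client port and the server writes back a short
status report and closes the connection. Below is a minimal sketch of doing
that from Java. The class and method names (ZkStatProbe, sendFourLetterWord)
and the 5 second timeouts are made up for illustration, and the host/port
defaults assume a server listening on localhost:2181 as in the config quoted
below.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ZooKeeper: sends a four letter word
// to the client port and returns whatever the server writes back.
class ZkStatProbe {

    static String sendFourLetterWord(String host, int port, String cmd) throws IOException {
        try (Socket sock = new Socket()) {
            // Timeouts are illustrative assumptions.
            sock.connect(new InetSocketAddress(host, port), 5000);
            sock.setSoTimeout(5000);

            OutputStream out = sock.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Read until the server closes its side (read returns -1).
            InputStream in = sock.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        System.out.println(sendFourLetterWord(host, port, "stat"));
    }
}
```

The same check can be done by hand with any tool that can write "stat" to the
client port; the Java version is only convenient if you want to poll it from
your benchmark harness.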

If you can run with DEBUG logging enabled (server and client) it might give you
more insight. Running at DEBUG level will also cause the stack trace of the
"read error" you are seeing to be printed to the server log (zk version 3.1).
If you can share all or part of the logs, please feel free to attach them to
this JIRA.

It's probably this code in server doIO though that's causing the server side 
"read error" exception you are seeing:

                int rc = sock.read(incomingBuffer);
                if (rc < 0) {
                    throw new IOException("Read error");
                }

read returns "The number of bytes read, possibly zero, or -1 if the channel has
reached end-of-stream", so a negative rc indicates to me that the client has
closed the connection.
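
To see the end-of-stream behaviour in isolation, here is a small standalone
sketch (not ZooKeeper code; the class name EofReadDemo is made up) that opens
a loopback connection, closes the client end, and then reads on the accepted
end. The variable name incomingBuffer is borrowed from the snippet above only
to make the parallel obvious.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Illustrative only: shows SocketChannel.read returning -1 once the
// peer has closed its end of the connection.
class EofReadDemo {

    static int readAfterPeerClose() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 lets the OS pick a free port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            SocketChannel client = SocketChannel.open(server.getLocalAddress());
            try (SocketChannel accepted = server.accept()) {
                // The "client" goes away without sending anything.
                client.close();

                ByteBuffer incomingBuffer = ByteBuffer.allocate(4);
                // Blocking read: returns -1 when end-of-stream is reached.
                return accepted.read(incomingBuffer);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int rc = readAfterPeerClose();
        System.out.println("rc = " + rc);
    }
}
```

In the server's doIO that same negative return value is what gets turned into
the IOException("Read error") shown in your log.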

Also, looking at your logs, the client log is from 13:35 while the server log
is from 13:06. Assuming that the clocks are even fairly close, this is almost
a 30 minute difference. If so, it's unlikely that these events are correlated.
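
If it helps when lining up the logs, the gap can be computed directly from the
timestamps in the quoted messages. This is just a sketch: the class name
LogGap is made up, the pattern assumes the "yyyy-MM-dd HH:mm:ss,SSS" layout
seen in your log lines, and it treats both timestamps as wall-clock times from
hosts with the same timezone and roughly synchronized clocks, which is exactly
the assumption in question.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative helper for comparing two log timestamps.
class LogGap {

    // Layout assumed from the log lines quoted in this issue.
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    static Duration gap(String earlier, String later) {
        return Duration.between(
                LocalDateTime.parse(earlier, LOG_TS),
                LocalDateTime.parse(later, LOG_TS));
    }

    public static void main(String[] args) {
        // First server-side "Read error" vs. the client-side "Session expired".
        Duration d = gap("2009-03-18 13:06:57,252", "2009-03-18 13:35:40,335");
        System.out.println(d.toMinutes() + " min " + (d.getSeconds() % 60) + " s");
    }
}
```

For the two timestamps above this comes to a bit under 29 minutes.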

My guess is that the client is closing the connection for some reason, but it
would be interesting to see the debug logs (with clocks that are fairly close
on server/client so it would be easier to correlate the logs).

Hope this helps.

> doIO in NioServerCnxn: Exception causing close of session : cause is "read 
> error"
> ---------------------------------------------------------------------------------
>                 Key: ZOOKEEPER-344
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-344
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: java client, server
>    Affects Versions: 3.1.0
>         Environment: jdk1.6.0_07
> Linux blade2 #1 SMP Mon Dec 1 22:21:35 EST 2008 
> x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: bryan thompson
>             Fix For: 3.2.0
> I have been having a problem with zookeeper 3.0.1 and now with 3.1.0 where I 
> see a lot of expired sessions.  I am using a 16 node cluster which is all on 
> the same local network.  There is a single zookeeper instance (these are 
> benchmarking runs).
> The problem appears to be correlated with either run time or system load.
> Personally I think that it is system load because I have session 
> expired events under a Windows platform running zookeeper and the application 
> (i.e., everything is local) when the application load suddenly spikes.  To me 
> this suggests that the client is not able to renew (ping) the zookeeper 
> service in a timely manner and is expired.  But the log messages below with 
> the "read error" suggest that maybe there is something else going on?
> Zookeeper Configuration
> #Wed Mar 18 12:41:05 GMT-05:00 2009
> clientPort=2181
> dataDir=/var/bigdata/benchmark/zookeeper/1
> syncLimit=2
> dataLogDir=/var/bigdata/benchmark/zookeeper/1
> tickTime=2000
> Some representative log messages are below.
> Client side messages (from our app)
> ERROR [main-EventThread] 
> com.bigdata.zookeeper.ZLockImpl$ZLockWatcher.process(ZLockImpl.java:400) 
> 2009-03-18 13:35:40,335 - Session expired: WatchedEvent: Server state change. 
> New state: Expired : 
> zpath=/benchmark/jobs/com.bigdata.service.jini.benchmark.ThroughputMaster/test_1/client1160/locknode
> ERROR [main-EventThread] 
> com.bigdata.zookeeper.ZLockImpl$ZLockWatcher.process(ZLockImpl.java:400) 
> 2009-03-18 13:35:40,335 - Session expired: WatchedEvent: Server state change. 
> New state: Expired : 
> zpath=/benchmark/jobs/com.bigdata.service.jini.benchmark.ThroughputMaster/test_1/client1356/locknode
> Server side messages:
>  WARN [NIOServerCxn.Factory:2181] 
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) 
> 2009-03-18 13:06:57,252 - Exception causing close of session 
> 0x1201aac14300022 due to java.io.IOException: Read error
>  WARN [NIOServerCxn.Factory:2181] 
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) 
> 2009-03-18 13:06:58,198 - Exception causing close of session 
> 0x1201aac1430000f due to java.io.IOException: Read error

