Thanks Patrick. See below.

On Tue, Feb 23, 2010 at 1:19 PM, Patrick Hunt <ph...@apache.org> wrote:

> Stack you might look at the following:
>
> 1) why does server 14 have such a low recv count?
>
> Received: 194
>
> while the other servers are at 3.7k + received. Did server 14 fail at some
> point? Or it's network? This may have caused the timeout seen by the client:
>
Ok. Will check into this the next time. I did take the dump after the observed TIMED_OUT, a good while after. Could this be why the numbers are low?

> ------snippet-----
> 2010-02-21 18:23:55,583 [main-SendThread] INFO
> org.apache.zookeeper.ClientCnxn: Attempting connection to server
> 14.u.XXX.com/X.X.X.141:2181
> 2010-02-21 18:24:00,423
> [regionserver/208.76.44.140:60020.compactor-SendThread] WARN
> org.apache.zookeeper.ClientCnxn: Exception closing session
> 0x226ed968a270003 to sun.nio.ch.selectionkeyi...@2a50e9a3
> java.io.IOException: TIMED OUT
> at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
> -----------
>
> 2) connection timeout is different from session timeout. connection timeout
> is the amount of time we allow for connection establishment (socket open)
> until the server accepts the connection, this value is the session timeout
> (as requested by the client) divided by the number of hosts in the host
> list. This could account for why the timeout (above snippet) occurred after
> 5 seconds. What timeout value is this client using? 15 seconds?
>

We ask for a session timeout of 60 seconds -- the hbase default -- and our ticktime is 3 seconds.

You are not troubled at all by the exceptions closing sessions above? Are these just noise?

Thanks for the input,
St.Ack
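
For reference, the connection-timeout arithmetic from point 2 can be sketched as below. This is only an illustration of the rule as Patrick describes it (session timeout divided by the number of hosts in the host list), not the ZooKeeper client code itself. The function names and the host counts are hypothetical, since the thread does not say how many hosts are in the list. The clamp to between 2 and 20 ticks is an assumption about the server's default session-timeout bounds.

```python
def negotiated_session_timeout_ms(requested_ms: int, tick_time_ms: int) -> int:
    """Clamp the requested session timeout to the server's bounds.

    Assumption: the server uses its default bounds of 2 and 20 ticks.
    """
    lower = 2 * tick_time_ms
    upper = 20 * tick_time_ms
    return max(lower, min(requested_ms, upper))


def connect_timeout_ms(session_timeout_ms: int, num_hosts: int) -> int:
    """Connection timeout as described in the thread:
    session timeout divided by the number of hosts in the host list."""
    if num_hosts <= 0:
        raise ValueError("host list must contain at least one host")
    return session_timeout_ms // num_hosts


if __name__ == "__main__":
    tick_time_ms = 3_000     # tickTime of 3 seconds, from the thread
    requested_ms = 60_000    # requested session timeout of 60 seconds, from the thread

    session_ms = negotiated_session_timeout_ms(requested_ms, tick_time_ms)
    # Hypothetical host-list sizes; the real size is not given in the thread.
    for num_hosts in (1, 3, 5, 12):
        print(num_hosts, "hosts:", connect_timeout_ms(session_ms, num_hosts), "ms per connection attempt")
```

Under these assumptions a 60 second session timeout produces a 5 second connection timeout only with a 12-host list, whereas the 15 second guess would need 3 hosts, so the host list length is worth checking alongside the timeout value.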