Re: user cousult

Patrick Hunt Thu, 01 Apr 2010 21:51:53 -0700

On 04/01/2010 07:27 PM, li li wrote:

Dear developer,
    I am doing research using zookeeper as a load
balancer. Recently, I planned to test the max load it can handle. But I have
some confusion about which I must consult you.
     Now I can handle about 300 clients with one server when I set the
session timeout to 300000000.

Whoa, that's way too large. Regardless the server is going to cap the max timeout to 20*tickTime (so 40sec in the common case). The larger you set the timeout value the longer it will take for your system to notice failures. Typically you want a timeout btw 5 and 30 seconds. 5 means you are more sensitive to failures, but it also means you are more sensitive to transient network glitches. 30 it takes longer to notice when a component has died (and therefore longer failover time) but you are much less sensitive to network issues. Setting this depends on your particular situation/architecture.
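
To make the cap concrete, here is a minimal sketch of how a requested timeout ends up bounded. The class and method names are hypothetical and this is not the actual server code; it assumes a tickTime of 2000 ms (the common case above) and the documented lower bound of 2*tickTime alongside the 20*tickTime cap.

```java
public class SessionTimeoutSketch {
    // Hypothetical helper: clamps a requested session timeout (in ms) into
    // the range [2 * tickTime, 20 * tickTime]. Illustration only, not the
    // real ZooKeeper server implementation.
    public static long negotiatedTimeoutMs(long requestedMs, long tickTimeMs) {
        long min = 2 * tickTimeMs;
        long max = 20 * tickTimeMs;
        return Math.max(min, Math.min(requestedMs, max));
    }

    public static void main(String[] args) {
        long tickTimeMs = 2000; // assumed tickTime

        // The value from the question is capped at 20 * tickTime.
        System.out.println(negotiatedTimeoutMs(300000000L, tickTimeMs)); // 40000
        // A 30 second request is within bounds and is kept as-is.
        System.out.println(negotiatedTimeoutMs(30000L, tickTimeMs));     // 30000
        // A request below 2 * tickTime is raised to the minimum.
        System.out.println(negotiatedTimeoutMs(1000L, tickTimeMs));      // 4000
    }
}
```

So asking for 300000000 buys nothing over asking for 40000 in this setup; the client should read back the negotiated value rather than assume the requested one was granted.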

Please (re?)read this section on sessions, esp the paragraphs on how the timeout works:

http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions

It may also be that your client application is swapping or has long GC pauses, see this:

http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting

esp the section on "frequent client disconnects" and the section on "gc pressure"
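
One rough way to check whether a client process is stalling is to run a small watchdog alongside it: sleep for a short fixed interval and see how much longer than expected the sleep took. The sketch below is only an illustration; the class name, the 100 ms interval and the 1000 ms warning threshold are all assumptions, and it cannot tell a GC pause from swapping, only that the JVM was not running for a while.

```java
public class PauseDetectorSketch {
    // Returns how many ms longer than expected an interval took (never negative).
    public static long overrunMs(long expectedMs, long startNanos, long endNanos) {
        long elapsedMs = (endNanos - startNanos) / 1_000_000L;
        return Math.max(0L, elapsedMs - expectedMs);
    }

    // Sleeps `iterations` times and warns whenever a sleep overruns badly,
    // which suggests the whole process was stalled (GC, swap, CPU starvation).
    public static void watch(int iterations, long sleepMs, long warnMs)
            throws InterruptedException {
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Thread.sleep(sleepMs);
            long overrun = overrunMs(sleepMs, start, System.nanoTime());
            if (overrun > warnMs) {
                System.err.println("Process stalled for ~" + overrun + " ms");
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed values: check every 100 ms, warn on stalls over 1 second.
        watch(10, 100, 1000);
    }
}
```

In a real client this would run in a daemon thread for the life of the process. Stalls approaching the session timeout would explain disconnects and expirations even when the server itself is healthy.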

     In your opinion, which value is more suitable for the session
timeout?
     And in your experiments, how many clients can one server handle?
     What's more, I set the session timeout to 30000000, which is a long
time, but when I run about 300 threads as clients, I get the error info as
follows.

I have one team that has 10000 client sessions connected to a single ZK cluster, each session is using a 30 second timeout. It works fine with this load (group membership, master election, load balancing, sharding information, etc... all stored in zk)


Also see this document:
http://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview

you can see that the server is handling quite a load with minimal increase in latency (even with 1 cpu). I've pushed this to over 400 clients with 4 million znodes and 20 million watches and it worked fine (4 cpus in that case and 8gig of heap).

If I were you I'd look at swap and gc on clients and server, ensure that this is not an issue.


Good luck,

Patrick


********************************************************************************************************************************************************
   2010-04-02 10:23:59,437 - WARN  [main-SendThread:clientcnxn$sendthr...@967]
- Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@46604660
java.net.ConnectException: Connection refused: no further information
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:573)
  at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
  
**************************************************************************************************************************************************
   I already set maxClientCnxns=0.
    Thanks for your reply, and I am looking forward to a further answer.
      With best wishes!
