On 04/01/2010 07:27 PM, li li wrote:
Dear developer,
    I am doing research on using ZooKeeper as a load
balancer. Recently I planned to test the maximum load it can handle, but I have
some questions I would like to ask you.
     Right now I can handle about 300 clients with one server when I set the
session timeout to 300000000.

Whoa, that's way too large. Regardless, the server is going to cap the max timeout at 20*tickTime (so 40 seconds in the common case). The larger you set the timeout value, the longer it will take for your system to notice failures. Typically you want a timeout between 5 and 30 seconds. With 5 you notice failures quickly, but you are also more sensitive to transient network glitches. With 30 it takes longer to notice when a component has died (and therefore failover takes longer), but you are much less sensitive to network issues. The right setting depends on your particular situation/architecture.
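
A minimal sketch of that capping, in Python for illustration only. It assumes tickTime is 2000 ms and that the server also enforces a lower bound of 2*tickTime (the default minimum session timeout); the function name is made up and this is not ZooKeeper code.

```python
TICK_TIME_MS = 2000  # assumed zoo.cfg tickTime; adjust to your config


def negotiated_timeout_ms(requested_ms, tick_time_ms=TICK_TIME_MS):
    """Clamp a requested session timeout to the bounds the server allows."""
    min_timeout = 2 * tick_time_ms   # assumed default lower bound
    max_timeout = 20 * tick_time_ms  # upper bound: 20*tickTime
    return max(min_timeout, min(requested_ms, max_timeout))


if __name__ == "__main__":
    # Show what a few requested values would end up as.
    for requested in (1000, 5000, 30000, 300000000):
        print(requested, "->", negotiated_timeout_ms(requested))
```

Under these assumptions, asking for 300000000 still yields a 40 second session.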

Please (re?)read this section on sessions, esp the paragraphs on how the timeout works:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions

It may also be that your client application is swapping or has long GC pauses, see this:
http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting
esp the section on "frequent client disconnects" and the section on "gc pressure"
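
To illustrate what such a pause looks like from inside a client process, here is a rough sketch (Python, hypothetical helper name, not part of ZooKeeper). A loop sleeps in short steps and records any step that overran by more than a threshold. A stall longer than the session timeout means the client could not have sent its heartbeats in time. The same idea would apply inside a JVM client, though GC logging is the more direct tool there.

```python
import time


def find_pauses(interval_s=0.05, iterations=200, threshold_s=0.5,
                clock=time.monotonic, sleep=time.sleep):
    """Return the overrun (in seconds) of every sleep step that stalled.

    clock and sleep are injectable so the logic can be checked without
    real waiting; by default the real monotonic clock is used.
    """
    pauses = []
    last = clock()
    for _ in range(iterations):
        sleep(interval_s)
        now = clock()
        # Time spent beyond the requested sleep: swap, GC, scheduling, etc.
        overrun = now - last - interval_s
        if overrun > threshold_s:
            pauses.append(overrun)
        last = now
    return pauses


if __name__ == "__main__":
    stalls = find_pauses()
    print("stalls longer than 0.5s:", stalls)
```

The thresholds here are arbitrary; pick them relative to the session timeout you actually use.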

     In your opinion, what value is most suitable for the session
timeout?
     And in your experiments, how many clients can one server handle?
     What's more, I set the session timeout to 30000000, which is a long
time, but when I run about 300 threads as clients I get the error shown
below.

I have one team that has 10000 client sessions connected to a single ZK cluster, each session using a 30 second timeout. It works fine with this load (group membership, master election, load balancing, sharding information, etc., all stored in ZK).

Also see this document:
http://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview
You can see that the server is handling quite a load with minimal increase in latency (even with 1 CPU). I've pushed this to over 400 clients with 4 million znodes and 20 million watches and it worked fine (4 CPUs in that case and 8 GB of heap).

If I were you I'd look at swap and GC on both the clients and the server, to make sure this is not an issue.

Good luck,

Patrick


********************************************************************************************************************************************************
   2010-04-02 10:23:59,437 - WARN  [main-SendThread:clientcnxn$sendthr...@967]
- Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@46604660
java.net.ConnectException: Connection refused: no further information
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:573)
  at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
  
**************************************************************************************************************************************************
   I have already set maxClientCnxns=0.
    Thanks for your reply, and I am looking forward to your further answer.
      With best wishes!
