On 04/01/2010 07:27 PM, li li wrote:
> Dear developer,
> I am doing research on using ZooKeeper as a load balancer. Recently I
> planned to test the maximum load it can handle, but I have some
> questions I would like to ask you.
> Right now I can handle about 300 clients with one server when I set the
> session timeout to 300000000.
Whoa, that's way too large. Regardless, the server is going to cap the
max timeout at 20*tickTime (so 40 sec in the common case). The larger you
set the timeout value, the longer it will take for your system to notice
failures. Typically you want a timeout between 5 and 30 seconds. With 5
you detect failures faster, but you are also more sensitive to transient
network glitches. With 30 it takes longer to notice when a component has
died (and therefore failover takes longer), but you are much less
sensitive to network issues. The right setting depends on your
particular situation/architecture.
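
To make the cap concrete, here is a rough sketch of the arithmetic. It is
not ZooKeeper code: the function name is made up for illustration, it
assumes tickTime=2000 ms, and it assumes a lower bound of 2*tickTime in
addition to the 20*tickTime upper bound mentioned above.

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms=2000):
    """Clamp a requested session timeout to the server-side bounds.

    Hypothetical helper for illustration only, not a ZooKeeper API.
    """
    min_timeout = 2 * tick_time_ms    # assumed lower bound (2 ticks)
    max_timeout = 20 * tick_time_ms   # upper bound from the reply (20 ticks)
    return max(min_timeout, min(requested_ms, max_timeout))


if __name__ == "__main__":
    # Compare what a client asks for with what it would effectively get.
    for requested in (300000000, 30000, 5000, 1000):
        print(requested, "->", negotiated_timeout_ms(requested))
```

Under these assumptions, asking for 300000000 ms gets you the same 40
second session as asking for 40000 ms, so the huge value buys nothing.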
Please (re?)read this section on sessions, especially the paragraphs on
how the timeout works:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions
It may also be that your client application is swapping or has long GC
pauses; see this:
http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting
especially the sections on "frequent client disconnects" and "gc
pressure".
> In your opinion, what value is most suitable for the session timeout?
> And in your experiments, how many clients can one server handle?
> What's more, I set the session timeout to 30000000, which is a long
> time, but when I run about 300 threads as clients I get the following
> error.
I have one team that has 10000 client sessions connected to a single ZK
cluster, each session using a 30 second timeout. It works fine with
this load (group membership, master election, load balancing, sharding
information, etc., all stored in ZK).
Also see this document:
http://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview
You can see there that the server handles quite a load with minimal
increase in latency (even with 1 CPU). I've pushed this to over 400
clients with 4 million znodes and 20 million watches and it worked fine
(4 CPUs in that case and 8 GB of heap).
If I were you I'd look at swap and GC on both the clients and the server
to make sure that this is not the issue.
Good luck,
Patrick
> ************************************************************************
> 2010-04-02 10:23:59,437 - WARN [main-SendThread:clientcnxn$sendthr...@967]
> - Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@46604660
> java.net.ConnectException: Connection refused: no further information
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:573)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
> ************************************************************************
> I have already set maxClientCnxns=0.
> Thanks for your reply; I am looking forward to your further answer.
> With best wishes!