Yes. create/set/delete/... are really the issue, since those calls are non-idempotent.
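
For illustration, a minimal sketch of the hazard (safeDelete is a hypothetical helper, not a library call): a retry after ConnectionLoss can fail with NoNode precisely because the first attempt did succeed on the server.

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooKeeper;

    // Hypothetical helper: retry a delete across ConnectionLoss.
    // NoNode on a *retry* is ambiguous: either the node never existed,
    // or our own first attempt actually went through on the server.
    static void safeDelete(ZooKeeper zk, String path)
            throws InterruptedException, KeeperException {
        boolean retried = false;
        while (true) {
            try {
                zk.delete(path, -1);      // -1 matches any version
                return;
            } catch (KeeperException.ConnectionLossException e) {
                retried = true;           // outcome unknown; try again
            } catch (KeeperException.NoNodeException e) {
                if (retried) {
                    return;               // plausibly our earlier delete succeeded
                }
                throw e;                  // first attempt: node really wasn't there
            }
        }
    }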

Satish Bhatti wrote:
Well, a bunch of the ConnectionLosses were for zookeeper.exists() calls. I'm
pretty sure a dumb retry should suffice for those!
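
For the read-only calls that is true: exists() is idempotent, so a retry loop with a short backoff is safe. A minimal sketch, with an arbitrary retry limit and sleep:

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    // exists() is idempotent: re-running it cannot corrupt anything,
    // so blind retry across ConnectionLoss is safe.
    static Stat existsWithRetry(ZooKeeper zk, String path)
            throws InterruptedException, KeeperException {
        KeeperException lastError = null;
        for (int i = 0; i < 5; i++) {              // arbitrary retry limit
            try {
                return zk.exists(path, false);     // no watch set
            } catch (KeeperException.ConnectionLossException e) {
                lastError = e;
                Thread.sleep(1000L * (i + 1));     // crude linear backoff
            }
        }
        throw lastError;
    }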

On Tue, Sep 1, 2009 at 4:31 PM, Mahadev Konar <maha...@yahoo-inc.com> wrote:

Hi Satish,

 ConnectionLoss is a little trickier than just retrying blindly. Please
read the following sections:

http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling

and the programmer's guide:

http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperProgrammers.html

to learn more about how to handle CONNECTIONLOSS. The idea is that blindly
retrying can create problems: a CONNECTIONLOSS does NOT necessarily mean
that the ZooKeeper operation you were executing failed. The operation may
well have gone through on the servers.
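
One way to cope, sketched along the lines the wiki suggests (createWithCheck and the id payload are illustrative, not a library API): write data into the znode that identifies this client, so that after a CONNECTIONLOSS you can read the node back and tell whether your own create went through.

    import java.util.Arrays;

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    // Hypothetical helper: make create() retriable by tagging the node
    // with a client-unique payload and checking it after ConnectionLoss.
    static void createWithCheck(ZooKeeper zk, String path, byte[] myId)
            throws InterruptedException, KeeperException {
        while (true) {
            try {
                zk.create(path, myId, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                          CreateMode.PERSISTENT);
                return;
            } catch (KeeperException.ConnectionLossException e) {
                // Outcome unknown; fall through and inspect the node.
            } catch (KeeperException.NodeExistsException e) {
                // Someone created it -- possibly us, on the attempt that "failed".
            }
            try {
                byte[] data = zk.getData(path, false, null);
                if (Arrays.equals(data, myId)) {
                    return;                     // our create went through
                }
                throw new KeeperException.NodeExistsException(path);
            } catch (KeeperException.ConnectionLossException e) {
                // Lost the connection while checking; loop and re-check.
            } catch (KeeperException.NoNodeException e) {
                // Node is gone (or never existed); loop and retry the create.
            }
        }
    }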

Since this has been a constant source of confusion for everyone who starts
using ZooKeeper, we are working on a fix, ZOOKEEPER-22, which will take care
of this problem so that programmers will not have to worry about
CONNECTIONLOSS handling.

Thanks
mahadev




On 9/1/09 4:13 PM, "Satish Bhatti" <cthd2...@gmail.com> wrote:

I have recently started running on EC2 and am seeing quite a few
ConnectionLoss exceptions.  Should I just catch these and retry?  I assume
that eventually, if the shit truly hits the fan, I will get a
SessionExpired?
Satish
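
Roughly, yes: per-call ConnectionLoss is retriable, while session-level events arrive on the default watcher, and Expired is final; the only recovery is a brand-new ZooKeeper handle. A minimal sketch, where SessionMonitor and recreateHandleAndEphemerals are hypothetical stand-ins for application-specific recovery:

    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;

    // Session state changes are delivered to the watcher passed to the
    // ZooKeeper constructor. Disconnected is transient (per-call retries
    // cover it); Expired means the server has discarded the session.
    class SessionMonitor implements Watcher {
        public void process(WatchedEvent event) {
            switch (event.getState()) {
                case SyncConnected:
                    break;  // connected, or reconnected within the session timeout
                case Disconnected:
                    break;  // transient: the client library reconnects on its own
                case Expired:
                    recreateHandleAndEphemerals();  // hypothetical recovery hook
                    break;
                default:
                    break;
            }
        }

        private void recreateHandleAndEphemerals() {
            // Open a new ZooKeeper handle, then re-create ephemeral nodes
            // and re-register watches -- they died with the old session.
        }
    }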

On Mon, Jul 6, 2009 at 11:35 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
We have used EC2 quite a bit for ZK.

The basic lessons that I have learned include:

a) EC2's biggest advantage after scaling and elasticity was conformity of
configuration.  Since you are bringing machines up and down all the time,
they begin to act more like programs, and you wind up with boot scripts
that give you a very predictable environment.  Nice.

b) EC2 interconnect has a lot more going on than in a dedicated VLAN.  That
can make the ZK servers appear a bit less connected.  You have to plan for
ConnectionLoss events.

c) For highest reliability, I switched to large instances.  On reflection,
I think that was helpful, but less important than I thought at the time.

d) Increasing and decreasing cluster size is nearly painless and is easily
scriptable (see the zoo.cfg sketch just after this list).  To decrease, do
a rolling update on the survivors to update their configuration, then take
down the instance you want to lose.  To increase, do a rolling update,
starting with the new instances, to update the configuration to include all
of the machines.  The rolling update should bounce each ZK with several
seconds between each bounce.  Rescaling the cluster takes less than a
minute, which makes it comparable to EC2 instance boot time (about 30
seconds for the Alestic ubuntu instance that we used, plus about 20 seconds
for additional configuration).
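
For concreteness, the piece the rolling update rewrites is the server list in each node's zoo.cfg; the hostnames below are placeholders. Note that dataDir is where the snapshots and transaction logs mentioned further down live, which matters on EC2's non-persistent disks.

    # zoo.cfg -- kept identical on every server after the rolling update
    tickTime=2000
    initLimit=10
    syncLimit=5
    clientPort=2181
    dataDir=/var/zookeeper          # snapshots + transaction logs live here
    # one line per ensemble member; growing or shrinking the cluster means
    # editing this list everywhere and bouncing each server in turn
    server.1=zk1.example.com:2888:3888
    server.2=zk2.example.com:2888:3888
    server.3=zk3.example.com:2888:3888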

On Mon, Jul 6, 2009 at 4:45 AM, David Graf <david.g...@28msec.com> wrote:
Hello

I want to set up a ZooKeeper ensemble on Amazon's EC2 service. In my
system, ZooKeeper is used to run a locking service and to generate unique
IDs. Currently, for testing purposes, I am only running one instance. Now I
need to set up an ensemble to protect my system against crashes.
The EC2 service has some differences from a normal server farm. E.g., the
data saved on the file system of an EC2 instance is lost if the instance
crashes. In the ZooKeeper documentation, I have read that ZooKeeper saves
snapshots of the in-memory data to the file system. Are those needed for
recovery? It would be much easier for me if that were not the case.
Additionally, EC2 brings the advantage that servers can be switched on and
off dynamically, depending on load, traffic, etc. Can this advantage be
utilized for a ZooKeeper ensemble? Is it possible to add a ZooKeeper server
to an ensemble dynamically, e.g. depending on the in-memory load?

David


