For my initial testing I am running with a single ZooKeeper server, i.e. the
ensemble only has one server.  Not sure if this is exacerbating the problem?
I will check out the troubleshooting link you sent me.
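
On the catch-and-retry question in the quoted thread below: a minimal Python
sketch of one way to handle it, assuming the client surfaces connection loss
and session expiry as two distinct exceptions. The names ConnectionLoss,
SessionExpired, and retry_on_connection_loss are stand-ins defined here for
illustration, not any real client's API, and the attempt count and delays are
arbitrary. It retries only connection loss and lets session expiry propagate.

```python
import time


class ConnectionLoss(Exception):
    """Stand-in (hypothetical) for a client's recoverable connection-loss error."""


class SessionExpired(Exception):
    """Stand-in (hypothetical) for a client's fatal session-expired error."""


def retry_on_connection_loss(operation, max_attempts=5, base_delay=0.5,
                             sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on ConnectionLoss.

    SessionExpired is deliberately not caught: the caller has to build a new
    session rather than retry on the old one.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionLoss:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Because a request may already have been applied when the connection dropped,
a blind retry like this is only safe for operations that are idempotent or
whose outcome can be checked first (for example, whether a node already
exists).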

On Tue, Sep 1, 2009 at 5:01 PM, Patrick Hunt <ph...@apache.org> wrote:

> I'm not very familiar with the EC2 environment. Are you doing any
> monitoring, in particular of network connectivity between nodes? This
> sounds like networking issues between nodes. (I'm assuming you've also
> looked at http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting and
> verified that you are not swapping (see gc pressure), etc.)
>
> Patrick
>
>
> Satish Bhatti wrote:
>
>> Session timeout is 30 seconds.
>>
>> On Tue, Sep 1, 2009 at 4:26 PM, Patrick Hunt <ph...@apache.org> wrote:
>>
>>> What is your client timeout? It may be too low.
>>>
>>> also see this section on handling recoverable errors:
>>> http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling
>>>
>>> connection loss in particular needs special care since:
>>> "When a ZooKeeper client loses a connection to the ZooKeeper server there
>>> may be some requests in flight; we don't know where they were in their
>>> flight at the time of the connection loss."
>>>
>>> Patrick
>>>
>>>
>>> Satish Bhatti wrote:
>>>
>>>> I have recently started running on EC2 and am seeing quite a few
>>>> ConnectionLoss exceptions.  Should I just catch these and retry?  Since I
>>>> assume that eventually, if the shit truly hits the fan, I will get a
>>>> SessionExpired?
>>>>
>>>> Satish
>>>>
>>>> On Mon, Jul 6, 2009 at 11:35 AM, Ted Dunning <ted.dunn...@gmail.com>
>>>> wrote:
>>>>
>>>>> We have used EC2 quite a bit for ZK.
>>>>>
>>>>> The basic lessons that I have learned include:
>>>>>
>>>>> a) EC2's biggest advantage after scaling and elasticity was conformity
>>>>> of configuration.  Since you are bringing machines up and down all the
>>>>> time, they begin to act more like programs and you wind up with boot
>>>>> scripts that give you a very predictable environment.  Nice.
>>>>>
>>>>> b) EC2 interconnect has a lot more going on than in a dedicated VLAN.
>>>>> That can make the ZK servers appear a bit less connected.  You have to
>>>>> plan for ConnectionLoss events.
>>>>>
>>>>> c) for highest reliability, I switched to large instances.  On
>>>>> reflection, I think that was helpful, but less important than I thought
>>>>> at the time.
>>>>>
>>>>> d) increasing and decreasing cluster size is nearly painless and is
>>>>> easily scriptable.  To decrease, do a rolling update on the survivors to
>>>>> update their configuration.  Then take down the instance you want to
>>>>> lose.  To increase, do a rolling update starting with the new instances
>>>>> to update the configuration to include all of the machines.  The rolling
>>>>> update should bounce each ZK with several seconds between each bounce.
>>>>> Rescaling the cluster takes less than a minute which makes it comparable
>>>>> to EC2 instance boot time (about 30 seconds for the Alestic ubuntu
>>>>> instance that we used plus about 20 seconds for additional
>>>>> configuration).
>>>>>
>>>>> On Mon, Jul 6, 2009 at 4:45 AM, David Graf <david.g...@28msec.com>
>>>>> wrote:
>>>>>
>>>>>> Hello
>>>>>>
>>>>>> I wanna set up a zookeeper ensemble on amazon's ec2 service. In my
>>>>>> system, zookeeper is used to run a locking service and to generate
>>>>>> unique ids. Currently, for testing purposes, I am only running one
>>>>>> instance. Now, I need to set up an ensemble to protect my system
>>>>>> against crashes.
>>>>>>
>>>>>> The ec2 service has some differences from a normal server farm. E.g.
>>>>>> the data saved on the file system of an ec2 instance is lost if the
>>>>>> instance crashes. In the documentation of zookeeper, I have read that
>>>>>> zookeeper saves snapshots of the in-memory data in the file system. Is
>>>>>> that needed for recovery? Logically, it would be much easier for me if
>>>>>> this is not the case.
>>>>>>
>>>>>> Additionally, ec2 brings the advantage that servers can be switched on
>>>>>> and off dynamically depending on the load, traffic, etc. Can this
>>>>>> advantage be utilized for a zookeeper ensemble? Is it possible to add a
>>>>>> zookeeper server dynamically to an ensemble? E.g. dependent on the
>>>>>> in-memory load?
>>>>>>
>>>>>> David
>>
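
The rolling update Ted describes in the quoted message above could be scripted
roughly as follows. This is a sketch under assumptions: the host names and the
bounce callback are hypothetical, the five second pause stands in for "several
seconds", and 2888/3888 are the ports conventionally used in the
server.N=host:port:port lines of zoo.cfg. Server ids are passed in explicitly
so that a surviving host keeps its id across a resize.

```python
import time


def server_lines(ids_by_host, peer_port=2888, election_port=3888):
    """Build the server.N lines of zoo.cfg, ordered by server id."""
    return [
        f"server.{sid}={host}:{peer_port}:{election_port}"
        for host, sid in sorted(ids_by_host.items(), key=lambda kv: kv[1])
    ]


def rolling_order(old_hosts, new_hosts):
    """Order in which to bounce servers for a membership change.

    Newly added hosts go first (the "increase" case); hosts that are being
    removed are not bounced at all.
    """
    added = [h for h in new_hosts if h not in old_hosts]
    survivors = [h for h in new_hosts if h in old_hosts]
    return added + survivors


def rolling_update(old_hosts, ids_by_host, bounce, pause_seconds=5.0,
                   sleep=time.sleep):
    """Bounce each server in turn with the new config, pausing in between.

    bounce(host, lines) is a caller-supplied (hypothetical) function that
    writes the config lines to the host and restarts its ZooKeeper server.
    Returns the hosts that left the ensemble and can now be taken down.
    """
    lines = server_lines(ids_by_host)
    for i, host in enumerate(rolling_order(old_hosts, list(ids_by_host))):
        if i:
            sleep(pause_seconds)  # space the bounces out
        bounce(host, lines)
    return [h for h in old_hosts if h not in ids_by_host]
```

For a shrink, the returned list names the instances to shut down once the
survivors are running with the new configuration.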
