On Mon, 12 Apr 2010 16:48:24 -0700
Patrick Hunt <ph...@apache.org> wrote:
> Any idea why the connection is flapping so badly? Is this
> client->server connection remote or in colo? (not that that should
> affect the operations of the server...)
The clients are all over the world. I have three servers, one in the
US, one in Germany, and one in South Korea. Clients are connecting
from North/South America to the US server, from Europe to the German
server, and from Asia/Australia to the Korean server.
This is all happening on PlanetLab, which is sometimes heavily
oversubscribed. In short, any number of bad things could be happening
that cause us to lose connectivity.
From your previous message:
> What's the ping time btw colos? 2sec tickTime and esp the initLimit
> and syncLimit are pretty low. You are allowing for only 4 seconds to
> d/l the data repository to a remote server. Even in-colo we typically
> use a higher value... but you may not want to change until we can
> reproduce this. You probably want a 4 sec tickTime and 60/40sec (so
> settings of 15/10) for the init/sync limits (something like that,
> depending on latencies/bandwidth you see)
Interesting, I thought I was using the default config parameters with
only a modified data directory and my own hostnames, but I see now
that the defaults are larger. Those values should certainly be larger
for the environment I'm running in. I'll leave them as they are for
now to see if we can reproduce the problem, though I'll eventually need
to fix them as my deadline approaches. :)
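For reference, here is a rough sketch of the arithmetic behind those numbers, assuming initLimit and syncLimit are counted in ticks of tickTime milliseconds. The function name is made up for illustration, and the initLimit of 2 is only inferred from the "4 seconds" in the quoted message, not read from my config file.

```python
def limit_seconds(tick_time_ms: int, ticks: int) -> float:
    """Wall-clock window, in seconds, for a limit expressed in ticks."""
    return tick_time_ms * ticks / 1000.0


# Inferred from the quoted message: a 2 sec tickTime allowing only 4 seconds
# implies an initLimit of 2 ticks (an assumption, not a value from zoo.cfg).
current_init = limit_seconds(2000, 2)

# Suggested in the quoted message: 4 sec tickTime with settings of 15/10.
suggested_init = limit_seconds(4000, 15)
suggested_sync = limit_seconds(4000, 10)

print("current init window (s):", current_init)
print("suggested init window (s):", suggested_init)
print("suggested sync window (s):", suggested_sync)
```

With the suggested settings this works out to the 60/40 seconds mentioned in the quoted message.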
> Probably reaching for straws but could you print "path", just to
> confirm it's what you know it is?
Sure, I can do this. I only have a single top-level znode though, so I
don't think this is the problem, but it can't hurt to double check.