Hi Ted,

I haven't give it a serious thought yet, but I don't think it is neccessary
for the cluster to keep track of time.

A node can make its own decision. For the sake of argument, lets say that we
have a client and a server with following policy:
1. Client is supposed to send a ping to server every 1 sec.
2. If server does not hear from client for 5 seconds, then the server
declares that the client is dead.
3. Similary if the client cannot communicate with the server for 5 seconds
client declares that the server is dead.

If the client receives a timeout (say while doing some IO) because of a time
jump, it should check the number of pings that has failed with the server.
If the number is 5, then this is a true failure, If the number is less than
5, then this is because of a time drift.

At the server side, the server can attempt to reconnect (or send a ping to
the client) after it receives a timeout. Thus, if the timeout occured
because of time drift, the server will reconnect and continue. We should
ofcourse have an upper bound in number of retries, etc.

For ZK, it is important to handle time jumps on ZK leader.

>
> I believe that the pattern of these problems is a slow slippage behind and
> a
> sudden jump forward.
>


You won't see the slippage. You will mainly see a jump forward. Note with
large enough number of nodes, multiple nodes could see their time jumping
forward. Therefore, checking comparing time between two servers may not
help.


>
> On Thu, Aug 19, 2010 at 7:51 AM, Vishal K <vishalm...@gmail.com> wrote:
>
> > Hi,
> >
> > I remember Ben had opened a jira for clock jumps earlier:
> > https://issues.apache.org/jira/browse/ZOOKEEPER-366. It is not uncommon
> to
> > have clocks jump forward in virtualized environments.
> >
> > It is desirable to modify ZooKeeper to handle this situation (as much as
> > possible) internally. It would need to be done for both client - server
> > connections and server - server connections. One obvious solution is to
> > retry a few times (send ping) after getting a timeout. Another way is to
> > count the number of pings that have been sent after receiving the
> timeout.
> > If number of pings do not match the expected number (say 5 ping attempt
> > should be finished for a 5 sec timeout), then wait till all the pings are
> > finished. In effect do not completely rely on the clock. Any comments?
> >
> > -Vishal
> >
> > On Thu, Aug 19, 2010 at 3:52 AM, Qing Yan <qing...@gmail.com> wrote:
> >
> > > Oh.. our servers are also running in a virtualized environment.
> > >
> > > On Thu, Aug 19, 2010 at 2:58 PM, Martin Waite <waite....@gmail.com>
> > wrote:
> > >
> > > > Hi,
> > > >
> > > > I have tripped over similar problems testing Red Hat Cluster in
> > > virtualised
> > > > environments.  I don't know whether recent linux kernels have
> improved
> > > > their
> > > > interaction with VMWare, but in our environments clock drift caused
> by
> > > lost
> > > > ticks can be substantial, requiring NTP to sometimes jump the clock
> > > rather
> > > > than control acceleration.   In one of our internal production rigs,
> > the
> > > > local NTP servers themselves were virtualised - causing absolute
> mayhem
> > > > when
> > > > heavy loads hit the other guests on the same physical hosts.
> > > >
> > > > The effect on RHCS (v2.0) is quite dramatic.  A forward jump in time
> by
> > > 10
> > > > seconds always causes a member to prematurely time-out on a network
> > read,
> > > > causing the member to drop out and trigger a cluster reconfiguration.
> > > > Apparently NTP is integrated with RHCS version 3, but I don't know
> what
> > > is
> > > > meant by that.
> > > >
> > > > I guess this post is not entirely relevent to ZK, but I am just
> making
> > > the
> > > > point that virtualisation (of NTP servers and or clients) can cause
> > > > repeated
> > > > premature timeouts.  On Linux, I believe that there is a class of
> > timers
> > > > provided that is immune to this, but I doubt that there is a platform
> > > > independent way of coping with this.
> > > >
> > > > My 2p.
> > > >
> > > > regards,
> > > > Martin
> > > >
> > > > On 18 August 2010 16:53, Patrick Hunt <ph...@apache.org> wrote:
> > > >
> > > > > Do you expect the time to be "wrong" frequently? If ntp is running
> it
> > > > > should never get out of sync more than a small amount. As long as
> > this
> > > is
> > > > > less than ~your timeout you should be fine.
> > > > >
> > > > > Patrick
> > > > >
> > > > >
> > > > > On 08/18/2010 01:04 AM, Qing Yan wrote:
> > > > >
> > > > >> Hi,
> > > > >>
> > > > >>    The testcase is fairly simple. We have a client which connects
> to
> > > ZK,
> > > > >> registers an ephemeral node and watches on it. Now change the
> client
> > > > >> machine's time - session killed..
> > > > >>
> > > > >>    Here is the log:
> > > > >>
> > > > >> *2010-08-18 04:24:57,782 INFO
> > > > >> com.taobao.timetunnel2.cluster.service.AgentService: Host name
> > > > >> kgbtest1.corp.alimama.com
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:zookeeper.version=3.2.2-888565, built on 12/08/2009
> > 21:51
> > > > GMT
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:host.name=kgbtest1.corp.alimama.com
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:java.version=1.6.0_13
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:java.vendor=Sun Microsystems Inc.
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:java.home=/usr/java/jdk1.6.0_13/jre
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >>
> > > > >>
> > > >
> > >
> >
> environment:java.class.path=/home/admin/TimeTunnel2/cluster/bin/../conf/agent/:/home/admin/TimeTunnel2/cluster/bin/../lib/slf4j-log4j12-1.5.2.jar:/home/admin/TimeTunnel2/cluster/bin/../lib/slf4j-api-1.5.2.jar:/home/admin/TimeTunnel2/cluster/bin/../lib/timetunnel2-cluster-0.0.1-SNAPSHOT.jar:/home/admin/TimeTunnel2/cluster/bin/../lib/zookeeper-3.2.2.jar:/home/admin/TimeTunnel2/cluster/bin/../lib/log4j-1.2.14.jar:/home/admin/TimeTunnel2/cluster/bin/../lib/gson-1.4.jar:/home/admin/TimeTunnel2/cluster/bin/../lib/zk-recipes.jar
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >>
> > > > >>
> > > >
> > >
> >
> environment:java.library.path=/usr/java/jdk1.6.0_13/jre/lib/amd64/server:/usr/java/jdk1.6.0_13/jre/lib/amd64:/usr/java/jdk1.6.0_13/jre/../lib/amd64:/usr/java/packages/lib/amd64:/lib:/usr/lib
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:java.io.tmpdir=/tmp
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:java.compiler=<NA>
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:os.name=Linux
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:os.arch=amd64
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:os.version=2.6.18-164.el5
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:user.name=admin
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:user.home=/home/admin
> > > > >> 2010-08-18 04:24:57,789 INFO org.apache.zookeeper.ZooKeeper:
> Client
> > > > >> environment:user.dir=/home/admin/TimeTunnel2/cluster/log
> > > > >> 2010-08-18 04:24:57,790 INFO org.apache.zookeeper.ZooKeeper:
> > > Initiating
> > > > >> client connection, connectString=
> > xentest10-vm5.corp.alimama.com:2181,
> > > > >> xentest10-vm6.corp.alimama.com:2181,
> > > xentest10-vm9.corp.alimama.com:2181
> > > > >> sessionTimeout=600000
> > > > >>
> watcher=com.taobao.timetunnel2.cluster.service.agentserv...@48d6c16c
> > > > >> 2010-08-18 04:24:57,791 INFO org.apache.zookeeper.ClientCnxn:
> > > > >> zookeeper.disableAutoWatchReset is false
> > > > >> 2010-08-18 04:24:57,817 INFO org.apache.zookeeper.ClientCnxn:
> > > Attempting
> > > > >> connection to server
> > xentest10-vm5.corp.alimama.com/10.254.5.160:2181
> > > > >> 2010-08-18 04:24:57,818 INFO org.apache.zookeeper.ClientCnxn:
> > Priming
> > > > >> connection to java.nio.channels.SocketChannel[connected local=/
> > > > >> 10.254.2.160:48536 remote=
> > > > >> xentest10-vm5.corp.alimama.com/10.254.5.160:2181]
> > > > >> 2010-08-18 04:24:57,821 INFO org.apache.zookeeper.ClientCnxn:
> Server
> > > > >> connection successful
> > > > >> 2010-08-18 04:24:57,832 INFO
> > > > >> com.taobao.timetunnel2.cluster.service.AgentService: Register...
> > > > >> 2010-08-18 04:24:57,834 DEBUG org.apache.zookeeper.ClientCnxn:
> > Reading
> > > > >> reply
> > > > >> sessionid:0x12a8413f71a0003, packet:: clientPath:null
> > serverPath:null
> > > > >> finished:false header:: -8,101  replyHeader:: -8,-1,0  request::
> > > > >> 0,v{},v{},v{}  response:: null
> > > > >> 2010-08-18 04:24:57,846 DEBUG org.apache.zookeeper.ClientCnxn:
> > Reading
> > > > >> reply
> > > > >> sessionid:0x12a8413f71a0003, packet:: clientPath:null
> > serverPath:null
> > > > >> finished:false header:: 1,1  replyHeader:: 1,8589934649,-110
> > >  request::
> > > > >> '/agent,,v{s{31,s{'world,'anyone}}},0  response::
> > > > >> 2010-08-18 04:24:57,853 DEBUG
> > > > >> com.taobao.timetunnel2.cluster.zookeeper.operation.PathCreator:
> Path
> > > > >> /agent
> > > > >> already exist
> > > > >> 2010-08-18 04:24:57,882 DEBUG org.apache.zookeeper.ClientCnxn:
> > Reading
> > > > >> reply
> > > > >> sessionid:0x12a8413f71a0003, packet:: clientPath:null
> > serverPath:null
> > > > >> finished:false header:: 2,1  replyHeader:: 2,8589934650,-110
> > >  request::
> > > > >> '/agent/status,,v{s{31,s{'world,'anyone}}},0  response::
> > > > >> 2010-08-18 04:24:57,882 DEBUG
> > > > >> com.taobao.timetunnel2.cluster.zookeeper.operation.PathCreator:
> Path
> > > > >> /agent/status already exist
> > > > >> 2010-08-18 04:24:57,886 DEBUG org.apache.zookeeper.ClientCnxn:
> > Reading
> > > > >> reply
> > > > >> sessionid:0x12a8413f71a0003, packet:: clientPath:null
> > serverPath:null
> > > > >> finished:false header:: 3,1  replyHeader:: 3,8589934651,0
>  request::
> > > > >> '/agent/status/kgbtest1.corp.alimama.com
> > > ,,v{s{31,s{'world,'anyone}}},1
> > > > >> response:: '/agent/status/kgbtest1.corp.alimama.com
> > > > >> 2010-08-18 04:24:57,886 DEBUG
> > > > >> com.taobao.timetunnel2.cluster.zookeeper.operation.PathCreator:
> Path
> > > > >> /agent/status/kgbtest1.corp.alimama.com creation OK
> > > > >> 2010-08-18 04:24:57,886 INFO
> > > > >> com.taobao.timetunnel2.cluster.service.AgentService: Watch for ->
> > > > >> /agent/status/kgbtest1.corp.alimama.com
> > > > >> 2010-08-18 04:24:57,892 DEBUG org.apache.zookeeper.ClientCnxn:
> > Reading
> > > > >> reply
> > > > >> sessionid:0x12a8413f71a0003, packet:: clientPath:null
> > serverPath:null
> > > > >> finished:false header:: 4,4  replyHeader:: 4,-1,0  request::
> > > > >> '/agent/status/
> > > > >> kgbtest1.corp.alimama.com,T  response::
> > > > >>
> > > > >>
> > > >
> > >
> >
> ,s{8589934651,8589934651,1282116165777,1282116165777,0,0,0,84024764344696835,0,0,8589934651}
> > > > >>
> > > > >> 2010-08-18 04:24:57,892 INFO
> > > > >> com.taobao.timetunnel2.cluster.service.AgentService: Data changed
> > > > >> @/agent/status/kgbtest1.corp.alimama.com
> > > > >> 2010-08-18 04:25:11,226 DEBUG org.apache.zookeeper.ClientCnxn: Got
> > > ping
> > > > >> response for sessionid:0x12a8413f71a0003 after 1ms
> > > > >> 2010-08-18 04:25:24,560 DEBUG org.apache.zookeeper.ClientCnxn: Got
> > > ping
> > > > >> response for sessionid:0x12a8413f71a0003 after 2ms
> > > > >> 2010-08-18 04:25:37,893 DEBUG org.apache.zookeeper.ClientCnxn: Got
> > > ping
> > > > >> response for sessionid:0x12a8413f71a0003 after 1ms
> > > > >> 2010-08-18 04:19:44,853 DEBUG org.apache.zookeeper.ClientCnxn: Got
> > > > >> notification sessionid:0x12a8413f71a0003
> > > > >> 2010-08-18 04:19:44,854 DEBUG org.apache.zookeeper.ClientCnxn: Got
> > > > >> WatchedEvent: Znode change. Path:
> > > > >> /agent/status/kgbtest1.corp.alimama.comType: NodeDeleted for
> > sessionid
> > > > >> 0x12a8413f71a0003
> > > > >> 2010-08-18 04:19:44,854 DEBUG
> > > > >>
> com.taobao.timetunnel2.cluster.zookeeper.operation.PathDataWatcher:
> > > > >> getChildren event =>WatchedEvent: Znode change. Path:
> /agent/status/
> > > > >> kgbtest1.corp.alimama.com Type: NodeDeleted
> > > > >> 2010-08-18 04:19:44,855 WARN org.apache.zookeeper.ClientCnxn:
> > > Exception
> > > > >> closing session 0x12a8413f71a0003 to
> > > > sun.nio.ch.selectionkeyi...@7ca3d4cf
> > > > >> java.io.IOException: Read error rc = -1
> > > java.nio.DirectByteBuffer[pos=0
> > > > >> lim=4 cap=4]
> > > > >>     at
> > > > >>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.ClientCnxn$SendThread.doIO(zookeeper:ClientCnxn.java):701)
> > > > >>     at
> > > > >>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):945)
> > > > >> 2010-08-18 04:19:45,591 INFO org.apache.zookeeper.ClientCnxn:
> > > Attempting
> > > > >> connection to server
> > xentest10-vm6.corp.alimama.com/10.254.5.161:2181
> > > > >> 2010-08-18 04:19:45,591 INFO org.apache.zookeeper.ClientCnxn:
> > Priming
> > > > >> connection to java.nio.channels.SocketChannel[connected local=/
> > > > >> 10.254.2.160:45776 remote=
> > > > >> xentest10-vm6.corp.alimama.com/10.254.5.161:2181]
> > > > >> 2010-08-18 04:19:45,591 INFO org.apache.zookeeper.ClientCnxn:
> Server
> > > > >> connection successful
> > > > >> 2010-08-18 04:19:45,593 WARN org.apache.zookeeper.ClientCnxn:
> > > Exception
> > > > >> closing session 0x12a8413f71a0003 to
> > > > sun.nio.ch.selectionkeyi...@2fdb7df8
> > > > >> java.io.IOException: Session Expired
> > > > >>     at
> > > > >>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(zookeeper:ClientCnxn.java):589)
> > > > >>     at
> > > > >>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.ClientCnxn$SendThread.doIO(zookeeper:ClientCnxn.java):709)
> > > > >>     at
> > > > >>
> > > > >>
> > > >
> > >
> >
> org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):945)
> > > > >>
> > > > >> *   I guess ZK might rely on timestamp to  keep sessions alive,
> but
> > we
> > > > >> have
> > > > >> NTP daemon running so machine time can get changed
> > > > >> automatically, is there a conflict?
> > > > >>
> > > > >>
> > > >
> > >
> >
>

Reply via email to