yes, you are right. we could do this. it turns out that the expiration code is very simple:

            while (running) {
                currentTime = System.currentTimeMillis();
                if (nextExpirationTime > currentTime) {
                    this.wait(nextExpirationTime - currentTime);
                    continue;
                }
                SessionSet set;
                set = sessionSets.remove(nextExpirationTime);
                if (set != null) {
                    for (SessionImpl s : set.sessions) {
                        sessionsById.remove(s.sessionId);
                        expirer.expire(s);
                    }
                }
                nextExpirationTime += expirationInterval;
            }

so we can detect a jump very easily: if currentTime is well past nextExpirationTime when we come out of the wait (rather than just barely past it), we have jumped ahead in time.

now the question is, what do we do with this information?

option 1) we could figure out the jump (currentTime-nextExpirationTime is a good estimate) and move all of the sessions forward by that amount.

option 2) we could converge on the time by having a policy to always wait at least half a tick time.
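
A rough sketch of option 1, for illustration only. The class JumpShift and both of its methods are hypothetical and not part of the expiration code above. It takes the overshoot currentTime - nextExpirationTime as the jump estimate, and assumes (my assumption, not something stated in this thread) that an overshoot of up to one tick is ordinary lateness and that the shift is rounded down to whole ticks so the bucket keys stay aligned.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not taken from the real expiration code.
class JumpShift {
    // Estimate a forward clock jump in milliseconds.
    // Assumption: overshoot of at most one tick is normal scheduling delay.
    static long estimateJump(long nextExpirationTime, long currentTime,
                             long expirationInterval) {
        long overshoot = currentTime - nextExpirationTime;
        if (overshoot <= expirationInterval) {
            return 0;
        }
        // Round down to whole ticks so shifted bucket keys stay on tick boundaries.
        return (overshoot / expirationInterval) * expirationInterval;
    }

    // Re-key every expiration bucket forward by the estimated jump.
    static <V> TreeMap<Long, V> shift(Map<Long, V> buckets, long jump) {
        TreeMap<Long, V> shifted = new TreeMap<>();
        for (Map.Entry<Long, V> e : buckets.entrySet()) {
            shifted.put(e.getKey() + jump, e.getValue());
        }
        return shifted;
    }
}
```

In the loop above, nextExpirationTime would have to be advanced by the same amount so that it keeps matching the shifted keys.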

there probably are other options as well. i kind of like option 2. worst case is it will make the sessions expire in half the time that they should, but this shouldn't be too much of a problem since clients send a ping if they are idle for 1/3 of their session timeout.
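
A minimal sketch of option 2, assuming "tick" means expirationInterval. HalfTickWait and its methods are hypothetical names used only for illustration. The first method computes the wait with a floor of half a tick; the second simulates, with a fake clock, how many buckets get processed before nextExpirationTime is ahead of the clock again after a jump.

```java
// Hypothetical helper, not taken from the real expiration code.
class HalfTickWait {
    // How long to sleep before processing the next bucket:
    // the normal remaining time, but never less than half a tick.
    static long waitMillis(long nextExpirationTime, long currentTime,
                           long expirationInterval) {
        long minWait = expirationInterval / 2;
        return Math.max(nextExpirationTime - currentTime, minWait);
    }

    // Simulate catching up after a jump. Each pass sleeps at least half a
    // tick (simulated by advancing 'now') and then expires one bucket.
    static int bucketsUntilCaughtUp(long nextExpirationTime, long now,
                                    long expirationInterval) {
        int processed = 0;
        while (nextExpirationTime <= now) {
            now += waitMillis(nextExpirationTime, now, expirationInterval);
            nextExpirationTime += expirationInterval;
            processed++;
        }
        return processed;
    }
}
```

During catch-up the clock advances half a tick per bucket while nextExpirationTime advances a full tick, so the gap closes by half a tick per pass. That is where the worst case of sessions expiring in half their time comes from.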

ben

On 08/19/2010 08:39 AM, Ted Dunning wrote:
True.  But it knows that there has been a jump.

Quiet time can be distinguished from clock shift by assuming that members of
the cluster don't all jump at the same time.

I would imagine that a "recent clock jump" estimate could be kept and buckets
that would otherwise expire due to such a jump could be given a bit of a
second lease on life, delaying all of their expiration.  Since time-outs are
relatively short, the server would be able to forget about the bump very
shortly.
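
One possible reading of the "recent clock jump" idea, as a sketch only. RecentJump, record, effectiveExpiration and rememberMillis are all hypothetical names. It assumes that only buckets whose nominal time is at or before the moment the jump was noticed get the extra lease, and that the jump is forgotten after a fixed window.

```java
// Hypothetical helper, not taken from the real expiration code.
class RecentJump {
    private long noticedAt = 0;        // local time when the jump was noticed
    private long jumpMillis = 0;       // estimated size of the jump, 0 if none
    private final long rememberMillis; // how long the jump is remembered

    RecentJump(long rememberMillis) {
        this.rememberMillis = rememberMillis;
    }

    void record(long now, long jump) {
        noticedAt = now;
        jumpMillis = jump;
    }

    // Expiration time to use for a bucket. While a jump is remembered,
    // buckets that were already due when it was noticed are delayed by it.
    long effectiveExpiration(long bucketTime, long now) {
        boolean remembered = jumpMillis > 0 && now - noticedAt < rememberMillis;
        if (!remembered) {
            jumpMillis = 0; // forget the bump
            return bucketTime;
        }
        return bucketTime <= noticedAt ? bucketTime + jumpMillis : bucketTime;
    }
}
```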

On Thu, Aug 19, 2010 at 8:22 AM, Benjamin Reed<br...@yahoo-inc.com>  wrote:

if we try to use network messages to detect and correct the situation, it
seems like we would recreate the problem we are having with ntp, since that
is exactly what it does.

