Ah thanks, I forgot the "majority-commit" property because I also forgot that all servers know what the cluster should look like, rather than act adaptively (which wouldn't make sense after all).

.. Adam

On Wed, Aug 11, 2010 at 3:23 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> Can't happen.
>
> In a network partition, the side without a quorum can't update the file
> version.
>
> On Wed, Aug 11, 2010 at 3:11 PM, Adam Rosien <a...@rosien.net> wrote:
>
>> What happens during a network partition and different clients are
>> incrementing "different" counters, and then the partition goes away?
>> Won't (potentially) the same sequence value be given out to two
>> clients?
>>
>> .. Adam
>>
>> On Thu, Aug 5, 2010 at 5:38 PM, Jonathan Holloway
>> <jonathan.hollo...@gmail.com> wrote:
>> > Hi Ted,
>> >
>> > Thanks for the comments.
>> >
>> > I might have overlooked something here, but is it also possible to do
>> > the following:
>> >
>> > 1. Create a PERSISTENT node
>> > 2. Have multiple clients set the data on the node, e.g. Stat stat =
>> > zookeeper.setData(SEQUENCE, ArrayUtils.EMPTY_BYTE_ARRAY, -1);
>> > 3. Use the version number from stat.getVersion() as the sequence
>> > (obviously I'm limited to Integer.MAX_VALUE)
>> >
>> > Are there any weird race conditions involved here which would mean
>> > that a client would receive the wrong Stat object back?
>> >
>> > Many thanks again,
>> > Jon.
>> >
>> > On 5 August 2010 16:09, Ted Dunning <ted.dunn...@gmail.com> wrote:
>> >
>> >> (b)
>> >>
>> >> BUT:
>> >>
>> >> Sequential numbering is a special case of "now". In large diameters,
>> >> now gets very expensive. This is a special case of that assertion.
>> >> If there is a way to get away from this presumption of the need for
>> >> sequential numbering, you will be miles better off.
>> >>
>> >> HOWEVER:
>> >>
>> >> ZK can do better than you suggest. Incrementing a counter does
>> >> involve potential contention, but you will very likely be able to
>> >> get to pretty high rates before the optimistic locking begins to
>> >> fail. If you code your update with a few tries at full speed
>> >> followed by some form of retry back-off, you should get pretty close
>> >> to the best possible performance.
>> >>
>> >> You might also try building a lock with an ephemeral file before
>> >> updating the counter. I would expect that this will be slower than
>> >> the back-off option if only because involves more transactions in
>> >> ZK. IF you wanted to get too complicated for your own good, you
>> >> could have a secondary strategy flag that is only sampled by all
>> >> clients every few seconds and is updated whenever a client needs to
>> >> back-off more than say 5 steps. If this flag has been updated
>> >> recently, then clients should switch to the locking protocol. You
>> >> might even have several locks so that you don't exclude all other
>> >> updaters, merely thin them out a bit. This flagged strategy would
>> >> run as fast as optimistic locking as long as optimistic locking is
>> >> fast and then would limit the total number of transactions needed
>> >> under very high load.
>> >>
>> >> On Thu, Aug 5, 2010 at 3:31 PM, Jonathan Holloway <
>> >> jonathan.hollo...@gmail.com> wrote:
>> >>
>> >> > My so far involve:
>> >> > a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it -
>> >> > this gives me the monotonically increasing number, but the
>> >> > sequence number isn't contiguous
>> >> > b) Storing the sequence number in the data portion of a persistent
>> >> > node - then updating this (using the version number - aka
>> >> > optimistic locking). The problem with this is that under high load
>> >> > I'm assuming there'll be a lot of contention and hence failures
>> >> > with regards to updates.
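
For reference, the update loop Ted describes above (a few tries at full speed, then retry back-off) might look roughly like the sketch below. It is only an illustration under stated assumptions: `VersionedNode`, `Snapshot`, `writeIfVersion` and `nextSequence` are hypothetical names, and the in-memory node stands in for a persistent znode, storing the counter as a long rather than as bytes. Against a real ensemble the read would be `zookeeper.getData(path, false, stat)` and the conditional write `zookeeper.setData(path, data, stat.getVersion())`, with a `KeeperException.BadVersionException` signalling that another client got there first. The back-off limits (doubling up to 64 ms, with random jitter) are arbitrary choices, not values from this thread.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

class SequenceCounterSketch {

    /** Value and version as seen by one read. */
    static final class Snapshot {
        final long value;
        final int version;
        Snapshot(long value, int version) { this.value = value; this.version = version; }
    }

    /** Hypothetical in-memory stand-in for a persistent znode holding the counter. */
    static final class VersionedNode {
        private long value = 0;
        private int version = 0;

        synchronized Snapshot read() { return new Snapshot(value, version); }

        /** Writes only if nobody else has written since expectedVersion was read. */
        synchronized boolean writeIfVersion(long newValue, int expectedVersion) {
            if (expectedVersion != version) {
                return false; // real ZooKeeper would throw BadVersionException here
            }
            value = newValue;
            version++;
            return true;
        }
    }

    /** Optimistic increment: the first fastTries attempts run back to back, later ones back off. */
    static long nextSequence(VersionedNode node, int fastTries, int maxTries) {
        long backoffMs = 1;
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            Snapshot seen = node.read();
            long next = seen.value + 1;
            if (node.writeIfVersion(next, seen.version)) {
                return next; // our conditional write won, so 'next' belongs to this caller only
            }
            if (attempt >= fastTries) {
                try {
                    // Random sleep in [0, backoffMs] so colliding clients spread out.
                    Thread.sleep(ThreadLocalRandom.current().nextLong(backoffMs + 1));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while backing off", e);
                }
                backoffMs = Math.min(backoffMs * 2, 64); // arbitrary cap for this sketch
            }
        }
        throw new IllegalStateException("gave up after " + maxTries + " attempts");
    }

    public static void main(String[] args) throws InterruptedException {
        VersionedNode node = new VersionedNode();
        Set<Long> issued = ConcurrentHashMap.newKeySet();
        Thread[] clients = new Thread[4];
        for (int i = 0; i < clients.length; i++) {
            clients[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    issued.add(nextSequence(node, 3, 1000));
                }
            });
            clients[i].start();
        }
        for (Thread t : clients) {
            t.join();
        }
        System.out.println("distinct values issued: " + issued.size());
    }
}
```

The point of the conditional write is the same one made about partitions earlier in the thread: a value is handed out only when the versioned update is accepted, so two clients that read the same version cannot both succeed.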