Ted, it is generous of you to do this analysis. Thanks a lot.
We are in situation 2: it is important to keep a disk in only one
diskPair. Your suggested flow would fit, but it is best to keep the state
with the disk rather than with the diskPair. When one disk goes bad, we pick
a disk in the HOTSPARE state, change its state to DUMPING, and begin copying
data from the other disk of the pair. When the data is ready, we update the
state to ONLINE so the disk can serve clients.
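
A minimal sketch of the replacement flow described above, assuming an in-memory model. The HOTSPARE, DUMPING and ONLINE states come from this message; the BAD state, the `Disk` class and the `copy_data` callback are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class DiskState(Enum):
    HOTSPARE = "HOTSPARE"
    DUMPING = "DUMPING"
    ONLINE = "ONLINE"
    BAD = "BAD"  # hypothetical name for a failed disk


@dataclass
class Disk:
    name: str
    state: DiskState


def replace_failed_disk(pair, failed, disks, copy_data):
    """Swap `failed` out of the two-element list `pair` for a hot spare.

    Returns the spare that was brought online, or None if no spare exists.
    """
    spare = next((d for d in disks if d.state is DiskState.HOTSPARE), None)
    if spare is None:
        return None  # no spare available; the pair is left untouched
    idx = 0 if pair[0] is failed else 1
    survivor = pair[1 - idx]
    failed.state = DiskState.BAD
    spare.state = DiskState.DUMPING   # reserved, but not yet serving clients
    pair[idx] = spare
    copy_data(survivor, spare)        # copy from the other disk of the pair
    spare.state = DiskState.ONLINE    # data ready: serve clients
    return spare
```

Because the state lives on the disk, a reader of the spare sees DUMPING for the whole duration of the copy and can refuse to serve from it.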
From: Ted Dunning <ted.dunn...@gmail.com>
Sent: 2010-03-30 13:51
Subject: Re: Re: How to ensure trasaction create-and-update
Do the DISK objects contain a reference to a DISK-PAIR? What about the
Can DISK's be in more than one DISK-PAIR?
I will assume some answers to these so I can give a design.
Suppose that DISK's just contain other information, but do not refer to
DISK-PAIR's and can only exist in a single pair.
If so, then the state really should be in the DISK-PAIR and the update
should proceed this way:
create DISK-PAIR referring to the two disks and with state = ONLINE
No update to the DISK is necessary.
If, on the contrary, the DISK's should have a reference to the DISK-PAIR to
ensure that they are never in more than one DISK-PAIR, the update would
proceed this way:
1. Create the DISK-PAIR referring to the two disks with state = OFFLINE.
2. Read the first disk state. If it has been assigned to a PAIR, abort;
   otherwise set it to refer to the new DISK-PAIR. If the update fails,
   delete the DISK-PAIR and abort.
3. Read the second disk. If it has been assigned to a PAIR, unassign the
   first disk and abort; otherwise set it to refer to the new DISK-PAIR.
   If the update fails, unwind the update to the first disk, delete the
   DISK-PAIR and abort.
4. Update the DISK-PAIR to have state = ONLINE.
If desired, you can update each disk at this point to have state = ONLINE.
This is really just to speed up checking if a disk is on-line. The logic
would be:
if a disk is not a member of a pair, it is off-line
else if a disk has state = ONLINE, it is on-line
else if the pair for the disk has state = ONLINE, the disk is on-line
(this check will only happen very rarely for very newly paired disks)
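
The lookup order above could be sketched as follows. This is an in-memory illustration only; the `Disk` and `DiskPair` classes and their field names are assumptions, not part of any ZooKeeper API.

```python
from dataclasses import dataclass
from typing import Optional

ONLINE = "ONLINE"
OFFLINE = "OFFLINE"


@dataclass
class DiskPair:
    state: str = OFFLINE


@dataclass
class Disk:
    pair: Optional[DiskPair] = None  # reference to the owning pair, if any
    state: str = OFFLINE             # optional cached copy of the pair state


def is_online(disk: Disk) -> bool:
    if disk.pair is None:
        return False                 # not a member of a pair: off-line
    if disk.state == ONLINE:
        return True                  # fast path: cached state on the disk
    return disk.pair.state == ONLINE  # slow path: newly paired disks only
```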
This update sequence is guaranteed to succeed or fail completely. Moreover,
a disk can only be in a single pair and the online status of a disk can be
determined by checking the disk to see if it is in a pair and if that pair
has state ONLINE.
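
A hedged sketch of the second update sequence (disks refer to their pair). The `Store` class is an in-memory stand-in for znodes with version-checked writes, not the ZooKeeper client API; the paths, field names and helper functions are hypothetical. One small deviation from the steps as written: the sketch also deletes the new DISK-PAIR when the first disk turns out to be already assigned, so that no OFFLINE pair is left behind.

```python
class Store:
    """In-memory stand-in for versioned znodes (not the ZooKeeper API)."""

    def __init__(self):
        self.nodes = {}  # path -> (data, version)

    def create(self, path, data):
        if path in self.nodes:
            raise KeyError(path)
        self.nodes[path] = (data, 0)

    def get(self, path):
        return self.nodes[path]

    def set(self, path, data, expected_version):
        """Conditional write: succeeds only if the version is unchanged."""
        _, version = self.nodes[path]
        if version != expected_version:
            return False
        self.nodes[path] = (data, version + 1)
        return True

    def delete(self, path):
        del self.nodes[path]


def claim(store, disk, pair):
    """Point `disk` at `pair` unless it already belongs to a pair."""
    data, version = store.get(disk)
    if data["pair"] is not None:
        return False
    return store.set(disk, {**data, "pair": pair}, version)


def release(store, disk):
    """Unwind a claim. Assumes nobody else writes to a claimed disk."""
    data, version = store.get(disk)
    store.set(disk, {**data, "pair": None}, version)


def create_pair(store, pair, disk_a, disk_b):
    store.create(pair, {"disks": [disk_a, disk_b], "state": "OFFLINE"})
    if not claim(store, disk_a, pair):
        store.delete(pair)
        return False
    if not claim(store, disk_b, pair):
        release(store, disk_a)   # unwind the first disk
        store.delete(pair)
        return False
    data, version = store.get(pair)
    store.set(pair, {**data, "state": "ONLINE"}, version)
    return True
```

The pair only becomes ONLINE as the last step, so a reader that finds a disk pointing at an OFFLINE pair treats the disk as off-line. A process crash in the middle can still leave an OFFLINE pair or a claimed disk behind, which a cleanup pass would have to unwind.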
Can you move the ONLINE state to the DISK-PAIR?
On Mon, Mar 29, 2010 at 7:24 PM, zd.wbh <zd....@163.com> wrote:
> Thanks for your quick reply, Ted.
> We are implementing a distributed system, using ZooKeeper for master
> metadata persistence. There is a DISK object and a DISK-PAIR object. When
> creating a DISK-PAIR, we need to first create a znode representing the
> DISK-PAIR object and then update the corresponding two DISKs' state from
> DISK_OFFLINE to DISK_ONLINE. These operations need to be done as a whole.
> From: Ted Dunning <ted.dunn...@gmail.com>
> Sent: 2010-03-30 10:11
> Subject: Re: How to ensure trasaction create-and-update
> To: email@example.com
> This is not a good thing. ZK gains lots of its power and reliability by
> not trying to do atomic updates to multiple znodes at once.
> Can you say more about the update that you want to do? It is common for
> updates like this to be such that you can order the updates and do without a
> truly atomic transaction. For instance, if one file is a list of other files
> (say for a queue) and you need to create a file and add a reference in the
> list of files, you can generally be safe creating the new file first and
> then doing an atomic update on the list of files second. If your process
> fails between the two operations, then you may generate a small number of
> garbage files (this number can be substantially decreased by careful use of
> try/finally) which might require a cleanup process to run occasionally to
> find unreferenced and old files.
> On Mon, Mar 29, 2010 at 6:54 PM, zd.wbh <zd....@163.com> wrote:
> > We'd like to store some metadata in ZooKeeper in our upcoming project.
> > Here is a special but common case: we need to create a new znode and, in
> > the meantime, update another znode's data. These operations (a create and
> > an update) need to be done atomically. We don't want to see a successful
> > create and a failed update. Is there a convenient way to ensure this?
> > Can you give me some tips?
> > I've looked into the source code, and there is a tedious way to do it:
> > extend the ZooKeeper instruction set, construct a "createAndUpdate"
> > interface and a txn request, and let DataTree ensure the integrity. Will
> > this work, and is it the only way?
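
For reference, the ordering Ted describes for the queue example (create the new file first, then atomically update the list of files) could be sketched like this. The `Store` class is an in-memory stand-in for version-checked writes, not the ZooKeeper client API, and the paths and function names are hypothetical. The cleanup helper only looks for unreferenced items; it does not check their age.

```python
class Store:
    """In-memory stand-in for versioned znodes (not the ZooKeeper API)."""

    def __init__(self):
        self.nodes = {}  # path -> (data, version)

    def create(self, path, data):
        if path in self.nodes:
            raise KeyError(path)
        self.nodes[path] = (data, 0)

    def get(self, path):
        return self.nodes[path]

    def set(self, path, data, expected_version):
        """Conditional write: succeeds only if the version is unchanged."""
        _, version = self.nodes[path]
        if version != expected_version:
            return False
        self.nodes[path] = (data, version + 1)
        return True


def enqueue(store, list_path, item_path, payload, retries=10):
    # Step 1: create the new node. Nothing refers to it yet, so a crash
    # after this line leaves only an unreferenced (garbage) node.
    store.create(item_path, payload)
    # Step 2: add the reference with a single version-checked write,
    # retrying if another writer changed the list in between.
    for _ in range(retries):
        items, version = store.get(list_path)
        if store.set(list_path, items + [item_path], version):
            return True
    return False  # the item stays unreferenced until cleanup


def find_garbage(store, list_path, prefix):
    """Return item paths under `prefix` that the list does not reference."""
    referenced = set(store.get(list_path)[0])
    return sorted(p for p in store.nodes
                  if p.startswith(prefix) and p not in referenced)
```

Readers only ever follow references from the list, so they never observe a reference to a node that does not exist.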