>The underlying problem with ext4 is that some kde executables do
>something like this:
>1a) open and read data from file x, close file x
>1b) open and truncate file x
>1c) write data to file x
>1d) close file x
>
>or
>
>2a) open and read data from file x, close file x
>2b) open and truncate file x.new
>2c) write data to file x.new
>2d) close file x.new
>2e) rename file x.new to file x
>
>Concerning case 1) I think ZFS may lose data if power is lost right
>after 1b) and open(xxx,O_WRONLY|O_TRUNC|O_CREAT) is issued in a
>transaction group separately from the one containing 1c/1d.

Yes, I would assume that is possible, but the chance of it happening
is small. Other filesystems prefer to write the metadata promptly, but
ZFS will easily wait until the file is completely written.

And UFS has the extra problem that it can change the file size, so that
reading will show garbage in the file. (It changes the inode, possibly
because it's in the log, but it hasn't written the data.)

We've seen that problem with UFS and the /etc/*_* driver files,
precisely because we didn't flush/fsync. (And in some cases we used
fsync(fileno(file)), but the new content was still in the stdio buffer.)

Only "versioned" filesystems can make the first sequence work.

>Concerning case 2) I cannot see ZFS losing any data, because of
>copy-on-write and transaction grouping.
>
>Theodore Ts'o (ext4 developer) commented that both cases are flawed and
>cannot be supported correctly, because of a lacking fsync() before
>close. Is this correct? His comment is over here:
>https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/54

Perhaps we should tell Theodore Ts'o that ZFS gets this right :-)

I'm assuming that all transactions in group N happened before those in
group N+1, at least when it comes to the partial order in which the
transactions happen.

Casper

_______________________________________________
zfs-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
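
For reference, sequence 2 with the fsync() before close that the quoted
comment asks for could look roughly like the sketch below. It also shows
the stdio pitfall mentioned above: the userspace buffer has to be flushed
before fsync() is called on the descriptor, otherwise the new content may
not have reached the kernel yet. This is only an illustration and not code
from any of the programs discussed; the function name
replace_file_durably and the ".new" suffix are assumptions made for the
example, and Python is used merely as a thin wrapper around the same
POSIX calls.

```python
import os


def replace_file_durably(path: str, data: bytes) -> None:
    """Replace `path` with `data` using sequence 2 plus flush and fsync.

    Hypothetical helper for illustration only.
    """
    tmp_path = path + ".new"  # assumed naming, mirroring "x.new" in the thread

    # 2b) open and truncate x.new
    with open(tmp_path, "wb") as f:
        # 2c) write data; this lands in the userspace buffer first
        f.write(data)
        # Push the userspace buffer to the kernel. Without this step,
        # fsync() on the descriptor may not cover the new content.
        f.flush()
        # Ask the kernel to write the file's data to stable storage.
        os.fsync(f.fileno())
    # 2d) the file is closed when the "with" block ends

    # 2e) rename x.new to x
    os.replace(tmp_path, path)
```

Calling replace_file_durably("x", b"new contents") is meant to leave
either the old or the new content in x after a crash, never a truncated
file. A stricter variant would also fsync the containing directory after
the rename; that step is left out here because the thread does not
discuss it.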
