> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Schweiss, Chip
>
> . The ZIL can have any number of SSDs attached either mirror or
> individually. ZFS will stripe across these in a raid0 or raid10 fashion
> depending on how you configure.
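For reference, attaching log devices either individually or as a mirror looks roughly like this -- pool and device names here are hypothetical, so treat it as a sketch, not a recipe:

```shell
# Attach two SSDs as two separate log devices (hypothetical names):
zpool add tank log c4t0d0 c4t1d0

# Or attach the same two SSDs as one mirrored log device instead:
zpool add tank log mirror c4t0d0 c4t1d0

# Verify how the log vdevs ended up laid out:
zpool status tank
```

With two separate log devices you get both devices' throughput; with the mirror form you trade that for redundancy.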
I'm regurgitating something somebody else said -- but I don't know where. I believe multiple ZIL devices don't get striped; they get round-robined. This means your ZIL can absolutely become a bottleneck if you're doing sustained high-throughput (not high-IOPS) sync writes. The way to prevent that bottleneck is by tuning the ... I don't know the names of the parameters. Some parameters that say, in effect, "a sync write larger than X should skip the ZIL and go directly to the pool."

> . To determine the true maximum streaming performance of the ZIL setting
> sync=disabled will only use the in RAM ZIL. This gives up power protection
> to synchronous writes.

There is no RAM ZIL. The basic idea behind the ZIL is this: some applications simply tell the system to "write," the system buffers those writes in memory, and the application continues processing. But some applications do not want the OS to buffer their writes, so they issue writes in "sync" mode. These applications issue the write command and then block until the OS says the data is on nonvolatile storage. In ZFS, that means the transaction gets written to the ZIL, and then it goes into the memory buffer just like any other write. Upon reboot, while the filesystem is mounting, ZFS always checks the ZIL for transactions that have not yet been played out to the pool.

So when you set sync=disabled, you're just bypassing that step. You're lying to the applications: when they say "I want to know when this is written to disk," you immediately answer "yup, it's done," unconditionally. This is the highest-performance thing you could possibly do -- but depending on your system workload, it can put you at risk of data loss.

> . Mirroring SSDs is only helpful if one SSD fails at the time of a power
> failure. This leaves several unanswered questions. How good is ZFS at
> detecting that an SSD is no longer a reliable write target?
> The chance of silent data corruption is well documented for spinning disks.
> What chance of data corruption does this introduce, with up to 10 seconds of
> data written on SSD? Does ZFS read the ZIL during a scrub to determine if
> our SSD is returning what we write to it?

Not just power loss -- any ungraceful crash. ZFS has no way to scrub ZIL devices, so it's not very good at detecting failed ZIL devices. There is definitely the possibility for an SSD to enter a failure mode where you write to it, it doesn't complain, but you wouldn't be able to read the data back if you tried. Worse, after an ungraceful crash, even if ZFS tries to read that data and fails to get it back, there's no way to know that anything should have been there -- so the failure still goes undetected.

If you want to maintain your SSD periodically, you should do something like this: remove it as a ZIL device, create a new pool with just this disk in it, write a bunch of random data to the new junk pool, scrub the pool, then destroy the junk pool and return the disk to the main pool as a ZIL device. This does not guarantee anything -- but then, nothing anywhere guarantees anything. It's good practice, and it puts you in better reliability territory than the competing alternatives.

> . Zpool versions 19 and higher should be able to survive a ZIL failure, only
> losing the uncommitted data. However, I haven't seen good enough
> information that I would necessarily trust this yet.

That was a very long time ago. (What, 2-3 years?) It's very solid now.

> . Several threads seem to suggest a ZIL throughput limit of 1Gb/s with
> SSDs. I'm not sure if that is current, but I can't find any reports of
> better performance. I would suspect that DDRdrive or ZeusRAM as ZIL would
> push past this.

Whenever I measure the sustainable throughput of an SSD, HDD, DDRdrive, or anything else, very few devices can actually sustain more than 1Gb/s -- for use as a ZIL or anything else.
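The periodic exercise procedure described above, sketched as commands (pool and device names are hypothetical, and this assumes the SSD is a standalone, removable log device):

```shell
# 1. Remove the SSD from the main pool; log vdevs can be removed live:
zpool remove tank c4t0d0

# 2. Build a throwaway pool on just that disk:
zpool create junk c4t0d0

# 3. Fill it with random data, then verify every block by scrubbing:
dd if=/dev/urandom of=/junk/testfile bs=1M count=1024
zpool scrub junk
zpool status junk        # wait for the scrub to finish; look for errors here

# 4. Tear down the junk pool and return the disk as a log device:
zpool destroy junk
zpool add tank log c4t0d0
```

If the scrub in step 3 turns up checksum errors, that's your cue to retire the SSD before trusting it with sync writes again.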
Published specs are often higher, but not realistic. If you are ZIL bandwidth limited, you should consider tuning the size of the writes that go to the ZIL.

_______________________________________________
zfs-discuss mailing list
email@example.com
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss