On Mar 28, 2010, at 6:21 PM, Daniel Carosone wrote:
> There's been some talk about alignment lately, both for flash and WD disks.
>
> What's missing, at least from my perspective, is a clear and
> unambiguous test so users can verify that their zfs pools are aligned
> correctly. This should be a test that sees through all the layers of
> BIOS and SMI/EFI and zfs labels and their accumulated offsets, and
> lets us ascertain where things land in terms of addresses that matter
> to the storage.
>
> I can think of two methods to achieve this, but lack information to
> complete either. I'd appreciate some help - or a better way.
>
> #1. Use xxd (or similar) to examine the contents of the raw disk
> device that ignores partitioning (e.g. c0d0p0). Search for a known
> label magic number or similar content, and determine its address.
> Apply arithmetic as necessary.
>
> This relies on knowing what to look for, and how that is aligned to
> the start of the partition and to metaslab addresses and offsets
> that determine the writes we actually care about.
>
> #2. Use dtrace to watch the actual sector addresses being accessed
> when examining a pool (e.g. zdb -l). Apply arithmetic as necessary.
>
> This relies on dtrace clue.
>
> As for the arithmetic.. I'm not certain I've seen, for example, a
> definitive statement of what the alignment offset is between
> start-of-partition and zfs data blocks, once various preamble header
> sectors are allowed for.

This is documented in the ZFS on disk format doc. Uberblocks are aligned
to 256KB offsets from the beginning of the slice. The first metaslab
starts 4MB from the start of the device. Use prtvtoc or format to see the
beginning of the slice relative to the beginning of the partition. I dunno
how you tell the start of the partition relative to the physical device.

To get an idea of the objects, "zdb -dd poolname" will show object
properties. Do not be surprised that the metadata is compressed and small.
Other zdb options will show allocations, DVAs, and pretty much anything
else.

The attached dtrace script is a basic analysis of 4KB aligned starts
(relative to the start of the slice) and 4KB size alignment. Don't be
surprised to see very little alignment due to compression, metadata, and
other oddities.

 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com

align4k.d
Description: Binary data
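
The arithmetic described in the reply (partition start plus slice start
plus the 4MB offset to the first metaslab, checked against a 4KB boundary)
can be sketched roughly as follows. This is not the attached align4k.d
script, only an illustration in Python. The 512-byte sector size, the
function names, and the example start sectors are assumptions made for the
sketch, and the 4MB offset is taken here relative to the start of the
slice.

```python
SECTOR_BYTES = 512                       # assumed sector size used in prtvtoc/fdisk output
BOUNDARY_BYTES = 4096                    # 4KB physical sector or flash page being tested
FIRST_METASLAB_OFFSET = 4 * 1024 * 1024  # 4MB into the slice, per the reply above


def first_metaslab_byte(partition_start_sector, slice_start_sector):
    """Byte address, on the physical device, of the first metaslab.

    partition_start_sector: start of the partition relative to the device
    slice_start_sector:     start of the slice relative to the partition
    Both names are hypothetical; read the values from your own tools.
    """
    slice_start = (partition_start_sector + slice_start_sector) * SECTOR_BYTES
    return slice_start + FIRST_METASLAB_OFFSET


def misalignment(byte_address, boundary=BOUNDARY_BYTES):
    """Bytes past the previous boundary; 0 means the address is aligned."""
    return byte_address % boundary


if __name__ == "__main__":
    # Made-up (partition start, slice start) pairs, in sectors.
    for part, sl in [(63, 0), (64, 0), (2048, 34), (2048, 256)]:
        addr = first_metaslab_byte(part, sl)
        off = misalignment(addr)
        state = "aligned" if off == 0 else "off by %d bytes" % off
        print("partition@%d slice@%d -> byte %d: %s" % (part, sl, addr, state))
```

Since 4MB is itself a multiple of 4KB, the result depends only on where
the slice starts on the physical device. Individual blocks inside the pool
can still start or end off a 4KB boundary because of compression and
metadata, as noted above.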
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss