2012-03-22 20:52, Richard Elling wrote:
Yes, but it is a rare case for 512b sectors.
> It could be more common for 4KB sector disks when ashift=12.
Has there been any research or testing on storing many small
files (one sector in size, or close to that) on different vdev layouts?

It is not a common case, so why bother?

I think that a certain Bob F. would disagree, especially
when larger native sectors and ashift=12 come into play.
Namely, one scenario where this is important is automated
storage of thumbnails for websites, or some similar small
objects in vast amounts.

I agree that hordes of 512b files would be rare; 4kb-sized
files (or a bit larger - 2-3 userdata sectors) are a lot
more probable ;)

I believe that such files would use a single-sector-sized set of
indirect blocks (dittoed at least twice), so one single-sector
sized file would use at least 9 sectors in raidz2.
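The arithmetic behind that 9-sector estimate can be sketched as follows. This is only a restatement of the assumptions made in the paragraph above (one data sector, one single-sector indirect block stored as two ditto copies, two parity sectors added to every block on raidz2), not a model of how ZFS actually allocates space. The function and parameter names are hypothetical.

```python
def sectors_for_small_file(data_sectors=1, parity=2,
                           metadata_copies=2, metadata_sectors=1):
    """Estimate on-disk sectors for one tiny file under the post's assumptions.

    Assumption: every block written (data or metadata copy) carries its
    own `parity` sectors, and metadata is not shared with other files.
    """
    file_block = data_sectors + parity                      # user data + parity
    metadata = metadata_copies * (metadata_sectors + parity)  # ditto copies + parity
    return file_block + metadata


total = sectors_for_small_file()
print(total)          # sectors used on raidz2 under these assumptions
print(total * 4096)   # the same figure in bytes when ashift=12 (4 KB sectors)
```

With ashift=12 the same count of sectors costs eight times as many bytes as with 512-byte sectors, which is why the estimate matters more for 4 KB disks.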

No. You can't account for the metadata that way. Metadata space is not
1:1 with data space. Metadata tends to get written in 16KB chunks,
compressed.

I purposely made an example of single-sector-sized files.
The way I get it (maybe wrong though), the tree of indirect
blocks (dnode?) for a file is stored separately from other
similar objects. While different L0 blkptr_t objects (BPs)
"parented" by the same L1 object are stored as a single
block on disk (128 BPs sized 128 bytes each = 16 KB), further
protected by redundancy and ditto copies, I believe that L0 BPs from
different files are stored in separate blocks - as well
as L0 BPs parented by different L1 BPs from different
byterange stretches of the same file. Likewise for other
layers of L(N+1) pointers if the file is sufficiently
large (in amount of blocks used to write it).
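To make the fan-out concrete, here is a small sketch of the model described above: 128 block pointers of 128 bytes each fill one 16 KB block, and pointer blocks are never shared between files. The per-file, unshared layout is this post's assumption, and the names below are hypothetical.

```python
BLKPTR_SIZE = 128                 # bytes per block pointer, as stated above
POINTER_BLOCK_SIZE = 16 * 1024    # bytes per block of pointers, as stated above
BPS_PER_BLOCK = POINTER_BLOCK_SIZE // BLKPTR_SIZE


def pointer_blocks_needed(data_blocks):
    """Blocks of L0 pointers needed for one file of `data_blocks` blocks.

    Assumption: pointer blocks are not shared between files, so even a
    one-block file occupies a whole pointer block of its own.
    """
    return -(-data_blocks // BPS_PER_BLOCK)   # ceiling division


for n in (1, 128, 129):
    print(n, pointer_blocks_needed(n))
```

Under this model a one-block file and a 128-block file cost the same amount of pointer metadata, so the relative overhead is worst for the smallest files.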

The BP tree for a file is itself an object for a ZFS dataset,
individually referenced (as inode number) and there's a
pointer to its root from the DMU dnode of the dataset.

If the above rant is true, then the single-block file should
have a single L0 blkptr serving as its whole indirect tree
of block pointers, and that L0 would be stored in a dedicated
block (not shared with other files' BPs), inflated by ditto
copies=2 and raidz/mirror redundancy.


zfs-discuss mailing list
