On Thu, May 12, 2011 at 12:23:55PM +1000, Daniel Carosone wrote:
> They were also sent from an ashift=9 to an ashift=12 pool

This reminded me to post a note describing how I make pools with a
different ashift.  I do this both for pools on USB flash sticks, and
on disks with an underlying 4k block size, such as my 2TB WD EARS
drives.  If I had pools on SATA flash SSDs, I'd do it for those too.

The trick comes from noting that stmfadm create-lu has a blk option
for the block size of the iSCSI volume to be presented.  Creating a
pool with at least one disk (per top-level vdev) on an iSCSI initiator
pointing at such a target will cause zpool to set ashift for the vdev
accordingly.
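
For reference, ashift is the base-2 logarithm of the sector size the
vdev assumes, so blk=512 corresponds to ashift=9 and blk=4096 to
ashift=12.  A minimal sketch of that arithmetic (the function name is
made up for illustration; it is not part of any zfs tool):

```python
def ashift_for_block_size(block_size):
    """Return log2(block_size) for a power-of-two block size in bytes."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a positive power of two")
    # For a power of two, bit_length() - 1 is exactly the exponent.
    return block_size.bit_length() - 1


if __name__ == "__main__":
    for blk in (512, 4096, 8192):
        print("blk=%d -> ashift=%d" % (blk, ashift_for_block_size(blk)))
```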

This works even when the initiator and target are the same host, over
the loopback interface.  Oddly, however, it does not work if the host
is Solaris Express b151 - it does work on OI b148.  Something has
changed in zpool creation in the interim.
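
One way to check which ashift a pool actually ended up with is to look
at the ashift lines in the output of zdb -C <pool>.  The sketch below
only parses such text; it assumes each top-level vdev shows up as a
line of the form "ashift: 12", and the sample text is illustrative,
not captured from a real system.

```python
import re


def ashift_values(zdb_text):
    """Return every ashift value found in zdb-style config text, in order."""
    pattern = r"^\s*ashift:\s*(\d+)\s*$"
    return [int(v) for v in re.findall(pattern, zdb_text, flags=re.MULTILINE)]


if __name__ == "__main__":
    # Illustrative sample only; real zdb output has many more fields.
    sample = """
        children[0]:
            type: 'raidz'
            ashift: 12
    """
    print(ashift_values(sample))
```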

Anyway, my recipe is to:

 * boot OI b148 in a vm.
 * make a zfs dataset to house the working files (reason will be clear
   below).
 * In that dataset, make sparse files corresponding in size and
   number to the disks that will eventually hold the pool (this makes
   a pool with the same size and number of metaslabs as it would have
   had natively).
 * Also make a sparse zvol of the same size.
 * stmfadm create-lu -p blk=4096 (or whatever, as desired) on the
   zvol, and make it available.
 * get the iSCSI initiator to connect the LU as a new disk device.
 * zpool create, using all bar 1 of the files, and the iSCSI disk, in
   the shape you want your pool (raidz2, etc).
 * zpool replace the iSCSI disk with the last unused file (now you can
   tear down the LU and zvol).
 * zpool export the pool-on-files.
 * zfs send the dataset housing these files to the machine that has
   the actual disks (much faster than rsync even with the sparse files
   option, since it doesn't have to scan for holes).
 * zpool import the pool from the files
 * zpool upgrade, if you want newer pool features, like crypto.
 * zpool set autoexpand=on, if you didn't actually use files of the
   same size.
 * zpool replace a file at a time onto the real disks.

Hmm.. when written out like that, it looks a lot more complex than it
really is.. :-)
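
The recipe above can be sketched as a dry-run plan that only builds
the command lines and executes nothing.  Everything specific here is
an assumption for illustration: the pool name tank, the working
dataset rpool/ashift (assumed to mount at /rpool/ashift), the 2000g
size, mkfile -n as the way to make sparse files, stmfadm add-view as
the way to make the LU available, and the <...> placeholders for
values only known at run time.  The initiator setup is left as a
comment because it is site-specific.

```python
def build_plan(n_disks, disk_size="2000g", blk=4096, shape="raidz2",
               work="rpool/ashift", pool="tank"):
    """Return the pool-on-files part of the recipe as command strings."""
    if n_disks < 2:
        raise ValueError("need at least two devices")
    files = ["/%s/disk%d" % (work, i) for i in range(n_disks)]
    zvol = "%s/lu0" % work
    iscsi_disk = "<iscsi-disk>"  # placeholder: name the initiator assigns
    lu_guid = "<lu-guid>"        # placeholder: GUID printed by create-lu
    plan = ["zfs create %s" % work]
    plan += ["mkfile -n %s %s" % (disk_size, f) for f in files]
    plan.append("zfs create -s -V %s %s" % (disk_size, zvol))
    plan.append("stmfadm create-lu -p blk=%d /dev/zvol/rdsk/%s" % (blk, zvol))
    plan.append("stmfadm add-view %s" % lu_guid)
    plan.append("# connect the iSCSI initiator to the LU here")
    # All bar one of the files, plus the iSCSI disk, form the pool.
    plan.append("zpool create %s %s %s %s"
                % (pool, shape, " ".join(files[:-1]), iscsi_disk))
    # Swap the iSCSI disk for the last unused file, then export.
    plan.append("zpool replace %s %s %s" % (pool, iscsi_disk, files[-1]))
    plan.append("zpool export %s" % pool)
    return plan


def build_migration_plan(files, real_disks, work="rpool/ashift",
                         pool="tank", remote="<remote-host>"):
    """Return the send/import/replace part of the recipe as command strings."""
    if len(files) != len(real_disks):
        raise ValueError("need one real disk per file")
    plan = [
        "zfs snapshot %s@move" % work,
        "zfs send %s@move | ssh %s zfs receive %s" % (work, remote, work),
        "zpool import -d /%s %s" % (work, pool),
        "zpool upgrade %s" % pool,
        "zpool set autoexpand=on %s" % pool,
    ]
    # One replace per file, onto the real disks.
    plan += ["zpool replace %s %s %s" % (pool, f, d)
             for f, d in zip(files, real_disks)]
    return plan


if __name__ == "__main__":
    for line in build_plan(4):
        print(line)
```

In practice each zpool replace needs to finish resilvering before the
next one starts, which a flat list of commands like this does not
capture.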

Note that if you want lots of mirrors, you'll need an iSCSI device per
mirror top-level vdev.
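
As a sketch of what that means for the zpool create line: each mirror
pairs one file with its own iSCSI disk, and each iSCSI disk is later
replaced by another file.  The pool name, file paths and <iscsiN>
placeholders are hypothetical.

```python
def mirror_create_command(pool, files, iscsi_disks):
    """Build a zpool create line with one iSCSI disk per mirror vdev."""
    if not files or len(files) != len(iscsi_disks):
        raise ValueError("need exactly one iSCSI disk per mirror")
    vdevs = " ".join("mirror %s %s" % (f, d)
                     for f, d in zip(files, iscsi_disks))
    return "zpool create %s %s" % (pool, vdevs)


if __name__ == "__main__":
    print(mirror_create_command(
        "tank",
        ["/rpool/ashift/m0a", "/rpool/ashift/m1a"],
        ["<iscsi0>", "<iscsi1>"],
    ))
```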

Note also that the image created inside the iSCSI device is not
identical to what you want on a device with 512-byte sector emulation,
since the label is constructed for a 4k logical sector size.  zpool
replace takes care of this when labelling the replacement disk/file.

I also played around with another method, using mdb to overwrite the
disk model table to match my disks and make the pool directly on them
with the right ashift.

  http://fxr.watson.org/fxr/ident?v=OPENSOLARIS;im=10;i=sd_flash_dev_table

This also no longer works on b151 (though the table still exists), so I
need the vm anyway, and the iSCSI method is easier.

Finally, because this doesn't work on b151, it's also only good for
creating new pools; I don't know how to expand a pool with new vdevs
to have the right ashift in those vdevs. 

--
Dan.


_______________________________________________
zfs-crypto-discuss mailing list
zfs-crypto-disc...@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-crypto-discuss
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
