Hello Richard,

Wednesday, April 18, 2007, 7:35:24 AM, you wrote:

RLH> Well, no; his quote did say "software or hardware".

Right, I missed that.


RLH>   The theory is apparently
RLH> that ZFS can do better at detecting (and with redundancy, correcting) errors
RLH> if it's dealing with raw hardware, or as nearly so as possible.  Most SANs
RLH> _can_ hand out raw LUNs as well as RAID LUNs, the folks that run them are
RLH> just not used to doing it.


Error detection in ZFS is the same regardless of where the redundancy is
done - by default checksums are verified for every block in every pool
configuration.  Correction is another story: basically you need to
create a redundant pool (the exception is metadata, and with the
introduction of user ditto blocks, to some extent user data as well).
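
Just to illustrate what I mean (device names below are only placeholders):

  # checksums are verified for every block in any pool, even a plain stripe
  zpool create tank c1t0d0 c1t1d0

  # metadata already gets ditto copies automatically; with the new
  # copies property user data can get them too, to some extent
  zfs create tank/home
  zfs set copies=2 tank/home

  # for real self-healing of everything, make the pool itself redundant
  zpool create tank2 mirror c2t0d0 c2t1d0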


Now when it comes to HW RAID, I wouldn't actually recommend doing the
RAID in ZFS - at least not always, and not right now.
The first reason is the current state of hot spare support in ZFS.
In many scenarios, when one disk goes wild ZFS won't really notice, so
your hot spare won't kick in, and you end up doing silly things by hand
while your pool is not serving data.
As I understand it, the hot spare problem is being worked on.
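
For reference, this is the kind of configuration I mean (device names
are only placeholders):

  # mirrored pool with a hot spare attached
  zpool create tank mirror c1t0d0 c1t1d0 spare c1t2d0

  # the spare is only pulled in once ZFS itself marks the disk FAULTED -
  # which is exactly what doesn't happen when the disk just goes wild
  zpool status tank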


Then, if you want RAID-5 (yes, people want RAID-5), ZFS can give you
much lower performance for some workloads, and for those workloads HW
RAID-5 (or even SVM or VxVM RAID-5) with ZFS on top purely as a file
system - optionally with dynamic striping across the HW RAID-5 LUNs -
actually makes sense.
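
For example, a sketch only (metadevice and disk names are made up):

  # RAID-5 volume in SVM...
  metainit d10 -r c1t0d0s0 c1t1d0s0 c1t2d0s0

  # ...with ZFS on top purely as a file system
  zpool create tank /dev/md/dsk/d10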

If you need RAID-10 and you need a lot of bandwidth for sequential
writes (or even non-sequential ones in the case of ZFS), doing the
mirroring in software will roughly halve your effective throughput in
most cases, because the host has to push out twice as much data when
it writes both halves of each mirror itself.
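
To put rough numbers on it: over, say, a single 4Gb FC link (about
400MB/s usable), host-side mirroring means every application write goes
over the wire twice, so you top out somewhere around 200MB/s of useful
sequential write bandwidth; with the mirroring done in the array the
host sends each block only once.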

Also, exposing individual disks as LUNs is not that well tested on
arrays, simply because it's been used much less. For example, I had a
problem with an EMC CX3-40 where each disk was exposed as a LUN and a
RAID-10 ZFS pool was created on top - when I pulled out a disk, neither
the array (at the LUN level) nor the host caught it. I/Os just kept
queuing up; I waited for about an hour (this was a test system) and
nothing happened. Of course the hot spare didn't kick in.

SVM, VxVM or HW hot spares just work.
ZFS hot spare support is far behind right now - it has more potential,
but it just doesn't work properly in many failure scenarios yet.

RLH> Another issue that may come up with SANs and/or hardware RAID:
RLH> supposedly, storage systems with large non-volatile caches will tend to have
RLH> poor performance with ZFS, because ZFS issues cache flush commands as
RLH> part of committing every transaction group; this is worse if the filesystem
RLH> is also being used for NFS service.  Most such hardware can be
RLH> configured to ignore cache flushing commands, which is safe as long as
RLH> the cache is non-volatile.

In most arrays (if not all), exposing physical disks as LUNs won't
solve the above, so this isn't really a factor in the choice between
doing RAID in HW or in ZFS.


As always, if you really care about performance and availability you
have to know what you are doing. And while ZFS does work miracles in
some environments, in others it actually makes sense to do the RAID in
HW or in another volume manager and use ZFS solely as a file system.
When doing the RAID in HW and using ZFS only as a file system, I would
recommend always exposing three LUNs or more (or at least two) and then
doing dynamic striping on the ZFS side. Of course those LUNs should be
on different physical disks. That way you get better protection for
your metadata.
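
Something like this, for example (LUN names are just placeholders; each
one would be a HW RAID volume built on separate physical disks):

  # dynamic stripe across three HW RAID LUNs; ZFS keeps redundant ditto
  # copies of metadata, which can then land on different LUNs
  zpool create tank c5t0d0 c5t1d0 c5t2d0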



-- 
Best regards,
 Robert Milkowski                      mailto:[EMAIL PROTECTED]
                                       http://milek.blogspot.com

