I’m wondering if any of the ZIL gurus could examine the following and point out
anywhere my logic is going wrong.
For small backend systems (e.g. 24x 10k SAS in RAID 10) I’m expecting an absolute
maximum backend write throughput of 10000 sequential IOPS** and more realistically
2000-5000. With small (4kB) blocksizes*, 10000 IOPS is only ~400MB over 10s, so we
don’t need much ZIL space or throughput. What we do need is the ability to absorb
the IOPS at low latency and keep absorbing them at least as fast as the backend
storage can commit them.
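The sizing arithmetic above can be sketched as follows (figures taken from the text; the 10s retention window is the assumption stated above, not a ZFS constant):

```python
# Back-of-envelope ZIL sizing for the worst case described above.
iops = 10_000      # absolute-maximum backend write IOPS (from the text)
block = 4 * 1024   # 4kB block size, in bytes
window_s = 10      # seconds of log to retain (figure assumed in the text)

bytes_needed = iops * block * window_s
print(bytes_needed / 10**6)  # → 409.6 (MB), i.e. roughly 400MB of log
```

Even at the unrealistic 10k IOPS ceiling, a few GB of log device is far more space than the workload can ever use; latency, not capacity, is the constraint.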
ZIL OPTIONS: Obviously a DDRDrive is the ideal (36k 4k random IOPS***), but
for the same budget I can get 2x Vertex 2 EX 50GB drives and put each behind
its own P410 512MB BBWC controller. Assuming the SSDs can do 6300 4k random
IOPS*** and that the controller cache acknowledges those writes with the same
latency as the DDRDrive (both PCIe-attached RAM?****), then we should have
DDRDrive-type latency up to 6300 sustained IOPS. Also, under bursty traffic,
we should be able to absorb up to 512MB of data (~3.5s of 36000 4k IOPS) at
much higher IOPS / lower latency, as long as the average stays at or below
6300 (i.e. the SSD can empty the cache before it fills).
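A quick sketch of that burst arithmetic, assuming the SSD drains the cache at its sustained 6300 IOPS while the burst arrives (and treating the 512MB cache as decimal megabytes):

```python
cache = 512 * 10**6    # P410 BBWC capacity in bytes (decimal MB assumed)
block = 4 * 1024       # 4kB writes
burst_iops = 36_000    # DDRDrive-class inflow (figure from the presentation)
drain_iops = 6_300     # sustained SSD rate behind the cache

fill_rate = burst_iops * block                # bytes/s arriving at the cache
net_rate = (burst_iops - drain_iops) * block  # bytes/s of net cache growth

print(cache / fill_rate)  # → ~3.47s to fill if the SSD absorbed nothing
print(cache / net_rate)   # → ~4.21s with the SSD draining concurrently
```

So the cache buys roughly 3.5-4s of full-rate burst before write latency falls back to what the SSD alone can sustain.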
So what are the issues with using this approach for low-budget builds that want
mirrored ZILs and don’t require >6300 sustained write IOPS (due to backend disk
limitations)? Obviously there are a lot of assumptions here, but I wanted to
get my theory straight before I start ordering things to test.
* For NTFS 4kB clusters on VMware / NFS, I believe a 4kB zfs recordsize will
provide the best performance (avoiding partial writes). Thoughts welcome on that too.
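The partial-write concern can be made concrete. With a recordsize larger than the guest’s 4kB cluster, each sub-record write forces ZFS to read, modify, and rewrite the whole record. A simplified sketch (assumes writes exactly match NTFS cluster size and ignores caching and compression):

```python
cluster = 4 * 1024  # NTFS cluster size in bytes

def write_amplification(recordsize, write_size=cluster):
    # A write smaller than the record means a read-modify-write of the
    # full record: recordsize bytes read, recordsize bytes written back.
    if write_size >= recordsize:
        return 1.0
    return recordsize / write_size

print(write_amplification(128 * 1024))  # → 32.0 at the default 128k recordsize
print(write_amplification(4 * 1024))    # → 1.0 with a matched 4k recordsize
```

This is why matching recordsize to the guest cluster size is attractive for this workload, at the cost of worse sequential throughput and metadata overhead for large files.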
** Assumes each 10k SAS disk can do a maximum of 900 sequential writes, striped
across 12 mirrors and rounded down (900 based on Tom’s Hardware HDD streaming
write benchmarks). Also assumes ZFS can take completely random writes and turn
them into completely sequential write IOPS on the underlying disks, and that no
reads, >32k writes, etc. are hitting the disks at the same time. Realistically,
2000-5000 is probably a more likely maximum.
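The backend ceiling above works out as (per-disk figure from the Tom’s Hardware benchmark cited; the parallelism model is the assumption stated above):

```python
per_disk_seq_writes = 900  # 10k SAS streaming-write IOPS (benchmark figure)
mirrors = 12               # 24 disks arranged as 12 RAID-10 mirror pairs

# Writes are striped across all mirror pairs in parallel; each pair
# commits at the speed of one disk (both members write the same data).
print(per_disk_seq_writes * mirrors)  # → 10800, rounded down to ~10000 above
```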
*** Figures from the excellent DDRDrive presentation. NB: if the BBWC can
sequentialise writes to the SSD, it may get closer to 10000 IOPS.
**** I’m assuming the P410 BBWC and the DDRDrive have a similar IOPS/latency
profile – the DDRDrive may do something fancy with striping across RAM to
improve IO?
http://opensolaris.org/jive/thread.jspa?messageID=460871 - except with normal
disks instead of an SSD behind the cache (so the cache would fill).
http://firstname.lastname@example.org/msg39729.html - same
This message posted from opensolaris.org
zfs-discuss mailing list