Hardware: Supermicro server with an Adaptec 5405 SAS controller, LSI expander ->
24 drives. Currently using 2x 1TB SAS drives striped as one pool and 1x 750GB
SATA drive as another pool. I don't think the hardware is related, though: if I
turn off ZFS compression the problem goes away, and I see the same behavior on
either pool. The ONLY distinctive thing I can think of is that I use a USB flash
drive for the root pool; performance on the root pool is horrible, but the
system otherwise works fine.
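For reference, the only thing I'm changing between the good and bad runs is the
dataset's compression property (the pool/dataset names below are placeholders
for my real ones):

  zfs set compression=gzip-9 tank/test   # copy hangs the box
  zfs set compression=off tank/test      # copy behaves normally
  zfs get compression tank/test          # confirm the current setting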
If I do a copy with ZFS compression=gzip-9, Solaris hangs for several seconds
at a time. I have iostat -xcnCXTdz 5 running, so it SHOULD be printing stats
every 5 seconds. Instead, the timestamps below jump from 06:01:20 straight to
06:02:04, a 44-second gap:
Thu May 22 06:01:20 2008
cpu
us sy wt id
0 13 0 86
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
253.5 0.0 16524.7 0.0 0.0 14.1 0.0 55.6 0 55 c4
121.0 0.0 8140.8 0.0 0.0 8.5 0.0 70.2 0 30 c4t0d0
132.6 0.0 8383.9 0.0 0.0 5.6 0.0 42.2 0 25 c4t1d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t2d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t3d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t4d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c5
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c5t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c6
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c6t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 fd0
Thu May 22 06:02:04 2008
cpu
us sy wt id
0 98 0 2
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
42.4 38.7 2590.2 2752.5 0.0 1.9 0.0 24.0 0 8 c4
21.5 19.1 1313.4 1353.3 0.0 1.1 0.0 26.2 0 4 c4t0d0
20.8 19.5 1276.9 1399.2 0.0 0.9 0.0 21.7 0 4 c4t1d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t2d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t3d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t4d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c5
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c5t0d0
0.0 0.0 0.9 0.0 0.0 0.0 0.1 11.8 0 0 c6
0.0 0.0 0.9 0.0 0.0 0.0 0.1 11.8 0 0 c6t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 fd0
Thu May 22 06:02:09 2008
cpu
us sy wt id
0 6 0 94
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
27.4 249.4 2164.0 14078.1 0.0 68.9 0.0 249.1 0 200 c4
15.0 128.8 1238.5 7252.9 0.0 34.3 0.0 238.8 0 100 c4t0d0
12.4 120.6 925.5 6825.2 0.0 34.6 0.0 260.1 0 100 c4t1d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t2d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t3d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t4d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c5
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c5t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c6
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c6t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 fd0
Thu May 22 06:02:16 2008
cpu
us sy wt id
0 82 0 18
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
54.4 14.8 3907.3 558.2 0.0 9.0 0.0 129.7 0 41 c4
26.0 7.2 1891.3 282.6 0.0 4.2 0.0 126.7 0 18 c4t0d0
28.3 7.6 2016.0 275.6 0.0 4.8 0.0 132.5 0 22 c4t1d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t2d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t3d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c4t4d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c5
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c5t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c6
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c6t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 fd0
I notice the copy is still going, but the system is back to semi-responsive,
possibly when the second file starts (the iostat intervals are now 7 seconds
instead of 5). This looks to me like the compression thread(s) run at too high
a priority.
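One way I could try to confirm the priority theory (a sketch; prstat and mpstat
are standard Solaris tools, the interpretation is my guess): watch per-LWP
microstates while the copy runs and see whether everything else is just waiting
for a CPU:

  prstat -mL 5   # the LAT column is time spent waiting for a CPU
  mpstat 5       # per-CPU sys time should be pegged during the stall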
The files I'm copying for my test are:
-rw-r--r-- 1 root root 2240902488 2008-05-21 19:32 it-20080106.zfs
-rw-r--r-- 1 root root 1381914720 2008-05-21 19:40 it-20080131.zfs
They are zfs send streams, so pretty large. They are also already compressed.
What concerns me about this isn't that I've successfully overloaded the CPU -
that's to be expected - but that NOTHING else seems to run at that point. The
scheduler IMHO should be servicing other requests instead of giving ZFS
compression all the CPU: if I try to ssh to the box while this runs, I can't
log in for almost a minute - it's just unresponsive. I didn't test many other
things, but I assume the entire system is hung.
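If it really is just gzip-9 monopolizing the CPUs, a lighter codec would
presumably leave enough headroom to keep the box responsive (an untested guess
on my part; same placeholder dataset name as above):

  zfs set compression=lzjb tank/test     # much cheaper than gzip-9
  zfs set compression=gzip-1 tank/test   # or the lightest gzip level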
I also noticed (perhaps by design) that with compression off, the cp returns
almost instantly, but the writes continue LONG after the cp process claims to
be done. Is this normal? Wouldn't closing the file ensure it was written to
disk? Is that tunable somewhere?
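(If I understand the semantics correctly, close(2) by itself doesn't guarantee
the data is on stable storage; only fsync(2) or opening with O_DSYNC does, and
ZFS batches asynchronous writes into transaction groups that are flushed every
few seconds, which would explain the write-behind. A quick sanity check,
assuming the same test file and a scratch dataset mounted at /tank/test:

  truss -t open,write,close,fsync cp it-20080106.zfs /tank/test/

cp should show write() and close() calls but no fsync(), so it never asks for
durability and ZFS is free to finish the writes after the process exits.)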