Sorry to insist, but still no real answer...
On Mon, 16 Jul 2012, Bob Friesenhahn wrote:
On Tue, 17 Jul 2012, Michael Hase wrote:
So only one thing left: mirror should read 2x
I don't think that mirror should necessarily read 2x faster even though the
potential is there to do so. Last I heard, zfs did not include a special
read scheduler for sequential reads from a mirrored pair. As a result, 50%
of the time, a read will be scheduled for a device which already has a read
scheduled. If this is indeed true, the typical performance would be 150%.
There may be some other scheduling factor (e.g. estimate of busyness) which
might still allow zfs to select the right side and do better than that.
If you were to add a second vdev (i.e. stripe) then you should see very close
to 200% due to the default round-robin scheduling of the writes.
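Just to sanity-check that 150% figure: if each of two concurrent reads independently
picks one side of the mirror, both land on the same disk about half the time (roughly
1x aggregate) and on different disks the other half (roughly 2x), so the expectation is
0.5*1 + 0.5*2 = 1.5x. A quick throwaway simulation of that coin flip (my own sketch,
obviously not what zfs itself does):

nawk 'BEGIN {
        srand(); n = 100000; same = 0
        for (i = 0; i < n; i++) {
                # pick a random mirror side for each of two concurrent reads
                if (int(2 * rand()) == int(2 * rand())) same++
        }
        # same side: both streams share one disk (~1x); different sides: ~2x
        printf("same-side fraction: %.3f, expected aggregate: %.2fx\n", same / n, 2 - same / n)
}'

which prints something very close to 0.5 and 1.50x.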
My expectation would be > 200%, as four disks are involved. It may not be perfect 4x
scaling, but imho it should be more than half of the theoretical throughput (and it is
on a scsi system). This is solaris or a solaris derivative, not linux ;-)
It is really difficult to measure zfs read performance due to caching effects. One way
to do it is to write a large file (containing random data such as returned from
/dev/urandom) to a zfs filesystem, unmount the filesystem, remount the filesystem, and
then time how long it takes to read the file once. The reason this works is that
remounting the filesystem resets the filesystem cache.
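In command form that would be roughly the following (assuming the pool's root dataset
ptest is mounted at /ptest; names and sizes are just placeholders, and in my test below
I used a zpool export/import cycle instead, which has the same effect):

# write a large file of random data so nothing can be served from the cache later
/usr/gnu/bin/dd if=/dev/urandom of=/ptest/bigfile bs=1024k count=16384

# remounting the filesystem drops its cached data
zfs unmount ptest
zfs mount ptest

# time an uncached sequential read
time /usr/gnu/bin/dd if=/ptest/bigfile of=/dev/null bs=1024k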
Ok, I did a zpool export/import cycle between the dd write and read tests. This really
empties the arc; I checked with arc_summary.pl. The test even uses two processes in
parallel (which doesn't make a difference). The result is still the same:
dd write: 2x 58 MB/sec --> perfect, each disk does > 110 MB/sec
dd read: 2x 68 MB/sec --> imho too slow, about 68 MB/sec per disk
For writes each disk gets about 900 128k io requests/sec with asvc_t in the 8-9 msec
range. For reads each disk only gets about 500 io requests/sec with asvc_t of 18-20
msec at the default zfs_vdev_maxpending=10. When reducing zfs_vdev_maxpending the
asvc_t drops accordingly, but the i/o rate stays at 500/sec per disk and the throughput
stays the same. I think the iostat values should be reliable here. These high iops
numbers make sense because the pools are nearly empty, so seek times are short.
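For completeness, the usual knobs for that tunable on solaris-ish systems (it is
normally spelled zfs_vdev_max_pending; check your build) are a live change with mdb or
a persistent entry in /etc/system, roughly:

# change the vdev queue depth on the fly (not persistent across reboots)
echo zfs_vdev_max_pending/W0t4 | mdb -kw

# or add this line to /etc/system (takes effect after a reboot):
#   set zfs:zfs_vdev_max_pending = 4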
All benchmarks (dd, bonnie, and I will still try iozone, along the lines sketched
below) lead to the same result: on the sata mirror pair, read performance is in the
range of a single disk. For the sas disks (only two were available for testing) and for
the scsi system there is quite good throughput scaling.
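The iozone run I have in mind would be something like this (flags from memory, file
names are placeholders; two 16 GB streams with 128k records so the working set is well
beyond the 16 GB of RAM):

iozone -i 0 -i 1 -r 128k -s 16g -t 2 -F /ptest/ioz1 /ptest/ioz2

i.e. sequential write and read in throughput mode with two parallel processes, roughly
matching the parallel dd test.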
Here, for comparison, is a table for 1-4 36gb 15k u320 scsi disks on an old sxde box
(nevada b130):
             seq write   factor   seq read   factor
              MB/sec               MB/sec
single           82        1          78       1
mirror           79        1         137       1.75
2x mirror       120        1.5       251       3.2
This is exactly what imho should be expected from mirrors and striped mirrors. It just
doesn't happen for my sata pool. I still have no reference numbers for other sata
pools, just one with the 4k/512 byte sector problem, which is even slower than mine. It
seems the zfs performance people just use sas disks and are done with it.
Michael
Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
old ibm dual opteron intellistation with external hp msa30, 36gb 15k u320 scsi disks
####################
pool: scsi1
state: ONLINE
scrub: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        scsi1       ONLINE       0     0     0
          c3t4d0    ONLINE       0     0     0
errors: No known data errors
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
zfssingle       16G   137  99 82739  20 39453   9   314  99 78251   7 856.9   8
Latency               160ms    4799ms    5292ms   43210us    3274ms    2069ms
Version  1.96       ------Sequential Create------ --------Random Create--------
zfssingle           -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  8819  34 +++++ +++ 26318  68 20390  73 +++++ +++ 26846  72
Latency             16413us     108us     231us   12206us      46us     124us
1.96,1.96,zfssingle,1,1342514790,16G,,137,99,82739,20,39453,9,314,99,78251,7,856.9,8,16,,,,,8819,34,+++++,+++,26318,68,20390,73,+++++,+++,26846,72,160ms,4799ms,5292ms,43210us,3274ms,2069ms,16413us,108us,231us,12206us,46us,124us
######################
pool: scsi1
state: ONLINE
scrub: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        scsi1       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
errors: No known data errors
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
zfsmirror       16G   110  99 79137  19 50591  12   305  99 137244  13  1065  16
Latency               199ms    4932ms    5101ms   50429us    3885ms    1303ms
Version  1.96       ------Sequential Create------ --------Random Create--------
zfsmirror           -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 11337  41 +++++ +++ 26398  66 19797  70 +++++ +++ 26299  68
Latency             14297us     139us     136us   10732us      48us     116us
1.96,1.96,zfsmirror,1,1342515696,16G,,110,99,79137,19,50591,12,305,99,137244,13,1065,16,16,,,,,11337,41,+++++,+++,26398,66,19797,70,+++++,+++,26299,68,199ms,4932ms,5101ms,50429us,3885ms,1303ms,14297us,139us,136us,10732us,48us,116us
########################
pool: scsi1
state: ONLINE
scrub: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        scsi1       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
            c3t5d0  ONLINE       0     0     0
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
zfsraid10       16G   127  99 120319  30 86902  23   300  99 251493  26  1747  27
Latency               105ms    3078ms    5083ms   43082us    3657ms     360ms
Version  1.96       ------Sequential Create------ --------Random Create--------
zfsraid10           -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 12031  46 +++++ +++ 25764  64 21220  75 +++++ +++ 27288  69
Latency             14091us     123us     136us   10823us      49us     117us
1.96,1.96,zfsraid10,1,1342515541,16G,,127,99,120319,30,86902,23,300,99,251493,26,1747,27,16,,,,,12031,46,+++++,+++,25764,64,21220,75,+++++,+++,27288,69,105ms,3078ms,5083ms,43082us,3657ms,360ms,14091us,123us,136us,10823us,49us,117us
####################
dd write
--------
for FILE in bigfile1 bigfile2
do
time /usr/gnu/bin/dd if=/dev/zero of=$FILE bs=1024k count=8192 &
done
8589934592 bytes (8.6 GB) copied, 108.421 s, 79.2 MB/s
8589934592 bytes (8.6 GB) copied, 112.788 s, 76.2 MB/s
              capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
scsi1       14.1G  53.4G      0  1.11K      0   140M
  mirror    7.03G  26.7G      0    571      0  70.2M
    c1t4d0      -      -      0    571      0  70.2M
    c3t4d0      -      -      0    674      0  82.8M
  mirror    7.02G  26.7G      0    567      0  69.9M
    c1t5d0      -      -      0    567      0  69.9M
    c3t5d0      -      -      0    669      0  82.4M
----------  -----  -----  -----  -----  -----  -----
dd read
-------
for FILE in bigfile1 bigfile2
do
time /usr/gnu/bin/dd if=$FILE of=/dev/null bs=1024k count=8192 &
done
8589934592 bytes (8.6 GB) copied, 62.2953 s, 138 MB/s
8589934592 bytes (8.6 GB) copied, 62.8319 s, 137 MB/s
              capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
scsi1       16.0G  51.5G  2.08K      0   261M      0
  mirror    7.99G  25.8G  1.06K      0   133M      0
    c1t4d0      -      -    535      0  66.6M      0
    c3t4d0      -      -    544      0  67.6M      0
  mirror    8.01G  25.7G  1.02K      0   128M      0
    c1t5d0      -      -    518      0  64.3M      0
    c3t5d0      -      -    516      0  64.4M      0
----------  -----  -----  -----  -----  -----  -----
dd write
--------
for FILE in bigfile1 bigfile2
do
time /usr/gnu/bin/dd if=/dev/zero of=$FILE bs=1024k count=16384 &
done
17179869184 bytes (17 GB) copied, 294.442 s, 58.3 MB/s
17179869184 bytes (17 GB) copied, 294.28 s, 58.4 MB/s
               capacity     operations    bandwidth
pool         alloc   free   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
ptest        40.6G   887G      0   1000      0   113M
  mirror     40.6G   887G      0   1000      0   113M
    c5t9d0       -      -      0    935      0   111M
    c5t10d0      -      -      0    946      0   113M
-----------  -----  -----  -----  -----  -----  -----
                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  907.0    0.0  106.3  0.0  7.7    0.0    8.5   1  79 c5t9d0
    0.0  914.0    0.0  107.3  0.0  7.7    0.0    8.5   1  80 c5t10d0
############
zpool export ptest
zpool import ptest
arc_summary.pl
System Memory:
         Physical RAM:  16375 MB
         Free Memory :  2490 MB
         LotsFree:      255 MB
ZFS Tunables (/etc/system):
ARC Size:
         Current Size:             85 MB (arcsize)
         Target Size (Adaptive):   12690 MB (c)
         Min Size (Hard Limit):    1918 MB (zfs_arc_min)
         Max Size (Hard Limit):    15351 MB (zfs_arc_max)
############
dd read
-------
for FILE in bigfile1 bigfile2
do
time /usr/gnu/bin/dd if=$FILE of=/dev/null bs=1024k count=16384 &
done
17179869184 bytes (17 GB) copied, 253.017 s, 67.9 MB/s
17179869184 bytes (17 GB) copied, 253.567 s, 67.8 MB/s
               capacity     operations    bandwidth
pool         alloc   free   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
ptest        71.1G   857G   1008      0   125M      0
  mirror     71.1G   857G   1008      0   125M      0
    c5t9d0       -      -    517      0  64.2M      0
    c5t10d0      -      -    491      0  61.0M      0
-----------  -----  -----  -----  -----  -----  -----
                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
  519.0    1.0   64.0    0.0  0.0 10.0    0.0   19.2   1 100 c5t9d0
  521.5    0.5   64.8    0.0  0.0 10.0    0.0   19.1   1 100 c5t10d0