I originally started testing a prototype for an enterprise file service
implementation on our campus using S10U4. Scalability in terms of file
system count was pretty bad: with anything over a couple of thousand file
systems, operations started taking way too long.
I had thought there were a number of improvements/enhancements that had
been made since then to improve performance and scalability when a large
number of file systems exist. I've been testing with SXCE (b97) which
presumably has all of the enhancements (and potentially then some) that
will be available in U6, and I'm still seeing very poor scalability once
more than a few thousand filesystems are created.
I have a test install on an x4500 with two TB disks as a ZFS root pool, 44
TB disks configured as mirror pairs belonging to one zpool, and the last
two TB disks as hot spares.
At about 5000 filesystems, it starts taking over 30 seconds to
create/delete additional filesystems.
At 7848, over a minute:
# time zfs create export/user/test
real 1m22.950s
user 1m12.268s
sys 0m10.184s
I did a little experiment with truss:
# truss -c zfs create export/user/test2
syscall          seconds     calls errors
_exit               .000         1
read                .004       892
open                .023        67      2
close               .001        80
brk                 .006       653
getpid              .037      8598
mount               .006         1
sysi86              .000         1
ioctl            115.534  31303678   7920
execve              .000         1
fcntl               .000        18
openat              .000         2
mkdir               .000         1
getppriv            .000         1
getprivimplinfo     .000         1
issetugid           .000         4
sigaction           .000         1
sigfillset          .000         1
getcontext          .000         1
setustack           .000         1
mmap                .000        78
munmap              .000        28
xstat               .000        65     21
lxstat              .000         1      1
getrlimit           .000         1
memcntl             .000        16
sysconfig           .000         5
lwp_sigmask         .000         2
lwp_private         .000         1
llseek              .084     15819
door_info           .000        13
door_call           .103      8391
schedctl            .000         1
resolvepath         .000        19
getdents64          .000         4
stat64              .000         3
fstat64             .000        98
zone_getattr        .000         1
zone_lookup         .000         2
                --------    ------   ----
sys totals:      115.804  31338551   7944
usr time:        107.174
elapsed:         897.670
and it seems the majority of time is spent in ioctl calls, specifically:
ioctl(16, MNTIOC_GETMNTENT, 0x08045A60) = 0
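To confirm how many of the ioctl calls are MNTIOC_GETMNTENT lookups, the
full truss trace can be filtered and counted. This is only a minimal sketch:
the helper name count_getmntent and the file system name test3 are
hypothetical, and it assumes truss is writing its trace to stderr (its
default when -o is not given).

```shell
#!/bin/sh
# Hypothetical helper: count the MNTIOC_GETMNTENT lines in a truss trace
# read from standard input.
count_getmntent() {
    grep -c 'MNTIOC_GETMNTENT'
}

# Assumed usage (not run here): trace only ioctl calls and merge stderr
# into the pipe, since that is where truss writes its trace by default.
#   truss -t ioctl zfs create export/user/test3 2>&1 | count_getmntent
```

Comparing that count against the ioctl total reported by truss -c would show
whether essentially all of the calls are mount table lookups.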
Interestingly, I tested creating six filesystems simultaneously, which took
a total of only three minutes, rather than the nine minutes they would have
taken had they been created sequentially. I'm not sure how parallelizable I
can make an identity management provisioning system though.
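The simultaneous-creation experiment could be scripted roughly as follows.
This is a sketch under assumptions, not a tested provisioning tool: the
function name create_parallel and the ZFS_CMD and BATCH variables are
hypothetical, and the batch size of 6 simply mirrors the experiment above.

```shell
#!/bin/sh
# Hypothetical sketch: create file systems in parallel batches.
ZFS_CMD=${ZFS_CMD:-zfs}   # set ZFS_CMD=echo for a dry run
BATCH=${BATCH:-6}         # number of concurrent creates per batch

# create_parallel PARENT NAME...
# Starts one "zfs create PARENT/NAME" per name in the background and
# waits after every BATCH jobs before starting more.
create_parallel() {
    parent=$1
    shift
    n=0
    for name in "$@"; do
        $ZFS_CMD create "$parent/$name" &
        n=$((n + 1))
        if [ "$n" -ge "$BATCH" ]; then
            wait    # block until the current batch has finished
            n=0
        fi
    done
    wait            # collect any jobs left in the final partial batch
}

# Assumed usage (not run here):
#   create_parallel export/user alice bob carol
```

A bare wait does not report which create failed, so a real provisioning job
would need to check afterwards that each file system exists.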
Was I mistaken about the increased scalability that was going to be
available? Is there anything I could configure differently to improve this
performance? We are going to need about 30,000 filesystems to cover our
faculty, staff, students, and group project directories. We do have 5
x4500's which will be allocated to the task, so about 6000 filesystems per.
Depending on what time of the quarter it is, our identity management system
can create hundreds up to thousands of accounts, and when we purge accounts
quarterly we typically delete 10,000 or so. Currently those jobs only take
2-6 hours; with this level of performance from ZFS they would take days if
not over a week :(.
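As a back-of-the-envelope check on that estimate, here is a small sketch.
The function name estimate is hypothetical, and it assumes every operation
costs a constant 83 seconds (the create time measured above at 7848
filesystems) and that a delete costs about the same as a create.

```shell
#!/bin/sh
# Hypothetical helper: estimate total job time for COUNT operations at
# SECS seconds each. Integer arithmetic, so results are truncated.
estimate() {
    total=$(( $1 * $2 ))
    echo "$(( total / 3600 )) hours ($(( total / 86400 )) days)"
}

# 10,000 sequential deletes at ~83 seconds each.
estimate 10000 83    # prints "230 hours (9 days)"

# Same job at ~30 seconds per operation, the effective rate seen when
# six creates ran concurrently (three minutes for six).
estimate 10000 30    # prints "83 hours (3 days)"
```

The real figure would differ, since per-operation time changes as the file
system count changes during the job.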
Thanks for any suggestions. What is the internal recommendation on maximum
number of file systems per server?
--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | [EMAIL PROTECTED]
California State Polytechnic University | Pomona CA 91768