Hi guys, when ZFS starts it ends up hanging the system.

We have a raidz over 5 x 2TB disks with 5 ZFS filesystems. (The
root filesystem is on separate disks).

# uname -a
FreeBSD fortinbras.XXX 8.2-STABLE FreeBSD 8.2-STABLE #0: Wed Oct 19 09:20:04 
SAST 2011     r...@storage.xxx:/usr/obj/usr/src/sys/GENERIC  amd64

ZFS filesystem version 5
ZFS storage pool version 28

The setup was working great until we decided to try out deduplication.
After testing it on a few files I set deduplication on 1 of the
filesystems and moved around 1.5TB of data into it. There was basically
no space saving, just a big performance hit, so we decided to take it off.
While deleting a directory on this filesystem in preparation the system
hung. After an abnormal half an hour long bootup everything was fine,
and a scrub was clean. We then decided to rather just destroy this
filesystem as it happened to have disposable data on and would be 100
times quicker(?) I ran the zfs destroy command which sat there for about
40 hours while the free space on the pool gradually increased, at which
point the system hung again. Rebooting took hours, stuck at ZFS
initialisation, after which the console returned:

pid 37 (zfs), uid 0, was killed: out of swap space
pid 38 (sh), uid 0, was killed: out of swap space

This's now the state I'm stuck in. I can naturally boot up without ZFS
but once I start it manually the disks in the pool start flashing for
an hour or 2 and the system hangs before they finish doing whatever
they're doing.

The system has 6GB of RAM and a 10GB swap partition. I added a 30GB
swap file but this hasn't helped.

# sysctl hw.physmem
hw.physmem: 6363394048

# sysctl vfs.zfs.arc_max
vfs.zfs.arc_max: 5045088256

(I lowered arc_max to 1GB but hasn't helped)

I've included the output when starting ZFS after setting vfs.zfs.debug=1
at the bottom. There's no more ZFS output for the next few hours while
the disks are flashing and the system is responsive.

A series of top outputs after starting ZFS:

last pid:  1536;  load averages:  0.00,  0.02,  0.08
21 processes:  1 running, 20 sleeping
CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 12M Active, 7344K Inact, 138M Wired, 44K Cache, 10M Buf, 5678M Free
Swap: 39G Total, 39G Free

last pid:  1567;  load averages:  0.13,  0.05,  0.08
25 processes:  1 running, 24 sleeping
CPU:  0.0% user,  0.0% nice,  2.3% system,  2.1% interrupt, 95.6% idle
Mem: 14M Active, 7880K Inact, 328M Wired, 44K Cache, 13M Buf, 5485M Free
Swap: 39G Total, 39G Free

last pid:  1632;  load averages:  0.06,  0.04,  0.06
25 processes:  1 running, 24 sleeping
CPU:  0.0% user,  0.0% nice,  0.5% system,  0.1% interrupt, 99.4% idle
Mem: 14M Active, 8040K Inact, 2421M Wired, 40K Cache, 13M Buf, 3392M Free
Swap: 39G Total, 39G Free

last pid:  1693;  load averages:  0.11,  0.10,  0.08
25 processes:  1 running, 24 sleeping
CPU:  0.0% user,  0.0% nice,  0.3% system,  0.1% interrupt, 99.5% idle
Mem: 14M Active, 8220K Inact, 4263M Wired, 40K Cache, 13M Buf, 1550M Free
Swap: 39G Total, 39G Free

last pid:  1767;  load averages:  0.00,  0.00,  0.00
25 processes:  1 running, 24 sleeping
CPU:  0.0% user,  0.0% nice, 27.6% system,  0.0% interrupt, 72.4% idle
Mem: 14M Active, 8212K Inact, 4380M Wired, 40K Cache, 13M Buf, 1433M Free
Swap: 39G Total, 39G Free

*sudden system freeze*

Whether it ends up utilising the swap space and/or thrashing I don't know -
I only know that the zpool disks' fancy LED's have stopped as well as the
general disk access LED. At this point I can still ping the host, and get
a response from telnetting to port 22, but I can't ssh in or access the
console.

Any suggestions would be appreciated, we're rather fond of the data on the
other 4 innocent filesystems ;)


The debug output:

Dec  1 15:56:07 fortinbras kernel: ZFS filesystem version 5
Dec  1 15:56:07 fortinbras kernel: zvol_init:1700[1]: ZVOL Initialized.
Dec  1 15:56:07 fortinbras kernel: ZFS storage pool version 28
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada2.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada2.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:116[1]: Created geom and 
consumer for ada2.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid 
from ada2...
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada2 is 
12202424374202010419
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match 
for provider /dev/ada2.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada1.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada1.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer 
for ada1.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid 
from ada1...
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada1 is 
1791271857894800867
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match 
for provider /dev/ada1.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada3.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada3.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer 
for ada3.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid 
from ada3...
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada3 is 
2075211652913351343
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match 
for provider /dev/ada3.
Dec  1 15:56:07 fortinbras kernel: 
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada4.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada4.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer 
for ada4.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid 
from ada4...
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada4 is 
6076447326626437945
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match 
for provider /dev/ada4.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada5.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada5.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer 
for ada5.
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid 
from ada5...
Dec  1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada5 is 
16505750022304911009
Dec  1 15:56:07 fortinbras kernel: 
Dec  1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match 
for provider /dev/ada5.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_detach:156[1]: Closing access to 
ada2.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_detach:160[1]: Destroyed consumer 
to ada2.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_detach:156[1]: Closing access to 
ada1.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_detach:160[1]: Destroyed consumer 
to ada1.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_detach:156[1]: Closing access to 
ada3.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_detach:160[1]: Destroyed consumer 
to ada3.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_detach:156[1]: Closing access to 
ada4.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_detach:160[1]: Destroyed consumer 
to ada4.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_detach:156[1]: Closing access to 
ada5.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_detach:160[1]: Destroyed consumer 
to ada5.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_detach:168[1]: Destroyed geom 
zfs::vdev.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada2.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada2.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_attach:116[1]: Created geom and 
consumer for ada2.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_read_guid:239[1]: 
Dec  1 15:56:08 fortinbras kernel: Reading guid from ada2...
Dec  1 15:56:08 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada2 is 
12202424374202010419
Dec  1 15:56:08 fortinbras kernel: 
Dec  1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match 
for provider /dev/ada2.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada1.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada1.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer 
for ada1.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid 
from ada1...
Dec  1 15:56:08 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada1 is 
1791271857894800867
Dec  1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match 
for provider /dev/ada1.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada3.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada3.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer 
for ada3.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid 
from ada3...
Dec  1 15:56:08 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada3 is 
2075211652913351343
Dec  1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match 
for provider /dev/ada3.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada4.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada4.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer 
for ada4.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid 
from ada4...
Dec  1 15:56:08 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada4 is 
6076447326626437945
Dec  1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match 
for provider /dev/ada4.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found 
provider by name /dev/ada5.
Dec  1 15:56:08 fortinbras kernel: 
Dec  1 15:56:08 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada5.
Dec  1 15:56:08 fortinbras kernel: 
Dec  1 15:56:08 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer 
for ada5.
Dec  1 15:56:08 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid 
from ada5...
Dec  1 15:56:08 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada5 is 
16505750022304911009
Dec  1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match 
for provider /dev/ada5.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Reply via email to