Hi.

  snv_39, SPARC - NFS server with local ZFS filesystems.
Under heavy load, traffic to all filesystems in one pool ceased; the other
pools were fine.
By "ceased" I mean that 'zpool iostat 1' showed no traffic to that pool
(nfs-s5-p0).

Commands like 'df' or 'zfs list' hung.
I issued 'reboot -k' but it didn't work, and neither did the 'halt' command.
So I issued sync from OBP. After the restart the server came up OK and has
been working properly so far.
I have a crash dump (the hung 'zfs list' and 'df' commands are in it).

From a crashdump:

> ::ps
S    PID   PPID   PGID    SID    UID      FLAGS             ADDR NAME
R      0      0      0      0      0 0x00000001 0000000001836cc0 sched
R      3      0      0      0      0 0x00020001 0000060000dedb90 fsflush
R      2      0      0      0      0 0x00020001 0000060000dee778 pageout
R      1      0      0      0      0 0x4a004000 0000060000def360 init
R   3054      1   3054   3048      0 0x4a014000 00000600127f7008 bash
R   3070   3054   3070   3048      0 0x4a004000 000006000284dba8 reboot
R   3013      1   3013   3007      0 0x4a014000 0000060002b5fbb8 bash
R   3038   3013   3038   3007      0 0x4a004000 00000600127f4c50 sync
R   3015   3013   3015   3007      0 0x4a004000 0000060002b5cc18 sync
R   2995      1   2995   2989      0 0x4a014000 0000060002a32798 bash
R   2997   2995   2997   2989      0 0x4a004000 0000060002b5c030 zfs
R    367      1    367    361      0 0x4a014000 0000060002a2ec10 bash
R   2970    367   2970    361      0 0x4a004000 00000600127f5838 df
R   2143      1   2143   2143      1 0x42300002 00000600127f93c0 nfsd
R    357      1    356    356      0 0x42000000 0000060000f0abf8 snmpd
R    296      1    296    296      0 0x42000000 00000600025eac00 mdmonitord
R    228      1    228    228      0 0x42000000 0000060000f0cfb0 inetd
Z    311    228    228    228      0 0x4a004002 0000060002a30fc8 rpc.metad
R      7      1      7      7      0 0x42000000 0000060000dec3c0 svc.startd
R    237      7    237    237      0 0x4a004000 000006000284e790 sh
R   3077    237   3077    237      0 0x4a014000 000006000284b7f0 bash
R   3084   3077   3084    237      0 0x4a004000 0000060002b5e3e8 halt
R   3083   3077   3083    237      0 0x4a004000 0000060009355000 sync
Z    221      7    221    221      0 0x4a014002 00000600025eb7e8 sac
> 00000600127f5838::walk thread|::findstack -v
stack pointer for thread 300a12b5020: 2a104880841
[ 000002a104880841 cv_wait+0x40() ]
  000002a1048808f1 zio_wait+0x30(300bbe45900, 300bbe45900, 300bbe45b68, 300bbe45b60, 0, 11)
  000002a1048809a1 dmu_buf_hold+0x84(0, 0, 5, 0, 2a104881318, 0)
  000002a104880a61 zap_lockdir+0x18(60003127468, 3, 0, 1, 1, 2a104881638)
  000002a104880b21 zap_cursor_retrieve+0x44(2a104881630, 2a104881518, 3, 0, 2a104881630, 2)
  000002a104880c41 dsl_prop_get_all+0xf4(3002bc6ef70, 2a104881820, 1, 60002a3f8c0, 6001bb77540, 7b244c2c)
  000002a104880f61 zfs_ioc_objset_stats+0x84(60003def000, 0, 0, 60003defb60, 198, 7007ef08)
  000002a104881031 zfsdev_ioctl+0x158(7007ec00, 33, ffbfdc00, 11, 44, 60003def000)
  000002a1048810e1 fop_ioctl+0x20(300adb49ec0, 5a11, ffbfdc00, 100003, 60000c02798, 120c888)
  000002a104881191 ioctl+0x184(3, 6001e2b5118, ffbfdc00, ffffffff, 40490, 5a11)
  000002a1048812e1 syscall_trap32+0xcc(3, 5a11, ffbfdc00, ffffffff, 40490, 80808080)
>

> 0000060002b5c030::walk thread|::findstack -v
stack pointer for thread 30045016380: 2a102e9a841
[ 000002a102e9a841 cv_wait+0x40() ]
  000002a102e9a8f1 dbuf_read+0x1ac(3000f8e1dc0, 2, 3000f8e1e38, 3000f8e1dc0, 0, 2)
  000002a102e9a9a1 dmu_buf_hold+0x84(0, 0, 5, 0, 2a102e9b318, 0)
  000002a102e9aa61 zap_lockdir+0x18(60003127468, 3, 0, 1, 1, 2a102e9b638)
  000002a102e9ab21 zap_cursor_retrieve+0x44(2a102e9b630, 2a102e9b518, 3, 0, 2a102e9b630, 2)
  000002a102e9ac41 dsl_prop_get_all+0xf4(3002bc6fdc0, 2a102e9b820, 1, 60002a3f8c0, 6001bb77540, 7b244c2c)
  000002a102e9af61 zfs_ioc_objset_stats+0x84(30052b4e000, 0, 0, 30052b4eb60, 198, 7007ef08)
  000002a102e9b031 zfsdev_ioctl+0x158(7007ec00, 33, ffbfde20, 11, 44, 30052b4e000)
  000002a102e9b0e1 fop_ioctl+0x20(300adb49ec0, 5a11, ffbfde20, 100003, 300a7c3ca60, 120c888)
  000002a102e9b191 ioctl+0x184(4, 60000db84a0, ffbfde20, 4, 40490, 5a11)
  000002a102e9b2e1 syscall_trap32+0xcc(4, 5a11, ffbfde20, 4, 40490, 80808080)
>


This looks like some kind of deadlock?
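
For anyone who wants to compare the two stacks mechanically, here is a
minimal sketch in Python. It assumes the '::findstack -v' output has been
saved as plain text; the helper names frames() and shared_callers() are
made up for this illustration and are not part of mdb. The sample input is
only the top few frames of each trace above. It shows where the two call
paths coincide; it does not by itself prove a deadlock.

```python
import re

# One frame line from `::findstack -v` looks like:
#   000002a1048808f1 zio_wait+0x30(300bbe45900, ...
# The innermost frame is wrapped in "[ ... ]". Wrapped argument
# continuation lines do not match and are skipped.
FRAME_RE = re.compile(
    r"^\[?\s*[0-9a-f]{16}\s+([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-f]+\("
)

def frames(findstack_text):
    """Return the function names in a trace, innermost frame first."""
    names = []
    for line in findstack_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            names.append(m.group(1))
    return names

def shared_callers(a, b):
    """Frames both traces have in common, counted from the outermost end."""
    shared = []
    for fa, fb in zip(reversed(a), reversed(b)):
        if fa != fb:
            break
        shared.append(fa)
    return list(reversed(shared))

# Excerpts (top frames only) of the df and zfs traces from the crash dump.
DF_STACK = """\
stack pointer for thread 300a12b5020: 2a104880841
[ 000002a104880841 cv_wait+0x40() ]
  000002a1048808f1 zio_wait+0x30(300bbe45900, 300bbe45900, 300bbe45b68,
300bbe45b60, 0, 11)
  000002a1048809a1 dmu_buf_hold+0x84(0, 0, 5, 0, 2a104881318, 0)
  000002a104880a61 zap_lockdir+0x18(60003127468, 3, 0, 1, 1, 2a104881638)
"""

ZFS_STACK = """\
stack pointer for thread 30045016380: 2a102e9a841
[ 000002a102e9a841 cv_wait+0x40() ]
  000002a102e9a8f1 dbuf_read+0x1ac(3000f8e1dc0, 2, 3000f8e1e38, 3000f8e1dc0, 0,
2)
  000002a102e9a9a1 dmu_buf_hold+0x84(0, 0, 5, 0, 2a102e9b318, 0)
  000002a102e9aa61 zap_lockdir+0x18(60003127468, 3, 0, 1, 1, 2a102e9b638)
"""

df_frames = frames(DF_STACK)
zfs_frames = frames(ZFS_STACK)

print("df  :", " <- ".join(df_frames))
print("zfs :", " <- ".join(zfs_frames))
print("shared callers:", shared_callers(df_frames, zfs_frames))
```

On these excerpts both threads are parked in cv_wait and share the
dmu_buf_hold and zap_lockdir callers; they differ only in the frame just
below cv_wait (zio_wait for df, dbuf_read for zfs).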

If the crash dump is needed I can provide it, but off the list and not for
public eyes.
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
