NFSD hang
Kirill Yelizarov
ykirill at yahoo.com
Mon Sep 26 10:56:26 UTC 2011
# uname -a
FreeBSD brat.faberlic.com 8.2-STABLE FreeBSD 8.2-STABLE #0: Thu Jun 9 11:22:38 MSD 2011 root@**:/usr/obj/usr/src/sys/BRAT amd64
The sources were taken at that time.
There are a lot of these. Should I paste them all here, or is part of the output enough?
brat# procstat -k -k 1666
PID TID COMM TDNAME KSTACK
1666 100323 nfsd nfsd: master mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_run+0x8b nfssvc_nfsd+0x97 nfssvc_nfsserver+0x53 nfssvc+0x44 syscallenter+0x186 syscall+0x40 Xfast_syscall+0xe2
1666 100391 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100392 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100393 nfsd nfsd: service <running>
1666 100394 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100395 nfsd nfsd: service <running>
1666 100396 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100397 nfsd nfsd: service <running>
1666 100398 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100399 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100400 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100401 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100402 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100403 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100404 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100405 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100406 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100407 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100408 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100409 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100410 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100411 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100412 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100413 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100414 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100415 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100416 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100417 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100418 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100419 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100420 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100421 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100422 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100423 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100424 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100425 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100426 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100427 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100428 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100429 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100430 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100431 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100432 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100433 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100434 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100435 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100436 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100437 nfsd nfsd: service <running>
1666 100438 nfsd nfsd: service <running>
1666 100439 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100440 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100441 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100442 nfsd nfsd: service <running>
1666 100443 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100444 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100445 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100446 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100447 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
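For anyone reading a long dump like this one, a small script can collapse it to one line per distinct kernel stack, which makes it easier to see how many threads are idle in svc_run_internal and how many are on CPU. This is only a sketch: `summarize_kstacks` and `SAMPLE` are names made up here, not part of procstat, and it assumes the column layout shown above (PID, TID, COMM, TDNAME, KSTACK). The sample holds three lines copied from the output above.

```python
from collections import Counter

# Three lines copied from the procstat output above, plus the header row.
SAMPLE = """\
PID TID COMM TDNAME KSTACK
1666 100392 nfsd nfsd: service mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d svc_run_internal+0x939 svc_thread_start+0xb fork_exit+0x114 fork_trampoline+0xe
1666 100393 nfsd nfsd: service <running>
1666 100395 nfsd nfsd: service <running>
"""


def summarize_kstacks(text):
    """Count threads per distinct kernel stack in `procstat -k -k` output."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything that does not start with PID and TID.
        if len(fields) < 4 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # Frames look like name+0xoffset; a thread on CPU shows <running>.
        # Thread-name words such as "nfsd:" and "service" match neither.
        stack = [f for f in fields[3:] if "+0x" in f or f == "<running>"]
        # Drop the offsets so identical call chains group together.
        key = " > ".join(f.split("+0x")[0] for f in stack) or "<no stack>"
        counts[key] += 1
    return counts


if __name__ == "__main__":
    for stack, n in summarize_kstacks(SAMPLE).most_common():
        print(n, stack)
```

To run it on a full dump, save the procstat output to a file and pass the file's contents to `summarize_kstacks` in place of `SAMPLE`.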
________________________________
From: Jeremy Chadwick <freebsd at jdc.parodius.com>
To: Kirill Yelizarov <ykirill at yahoo.com>
Cc: "freebsd-stable at freebsd.org" <freebsd-stable at freebsd.org>
Sent: Monday, September 26, 2011 10:32 AM
Subject: Re: NFSD hang
On Sun, Sep 25, 2011 at 11:14:30PM -0700, Kirill Yelizarov wrote:
> Good Day!
> I've got a problem with an NFS share on a ZFS volume. Everything worked fine for a few months and now it hangs. This share stores logs from 9 servers at night, about 1-2 GB from each server. ZFS is filled to 26% and it is v28.
>
> last pid: 46573;  load averages: 195.82, 199.86, 200.12    up 108+21:56:50  10:05:06
> 432 processes: 208 running, 224 sleeping
> CPU:  0.0% user,  0.0% nice,  100% system,  0.0% interrupt,  0.0% idle
> Mem: 280M Active, 1469M Inact, 9584M Wired, 161M Cache, 1232M Buf, 311M Free
> Swap: 16G Total, 16G Free
>
>   PID USERNAME      THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>  1666 root          256  76    0  5788K  5120K RUN    14 476.8H 1508.64% nfsd
>
> # zpool list
> NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
> data  3.62T   954G  2.69T    25%  1.00x  ONLINE  -
>
> # zfs list
> NAME   USED  AVAIL  REFER  MOUNTPOINT
> data   954G  2.64T   954G  /data
>
> # zfs mount
> data                            /data
>
> What should I look for to resolve it?
What version of FreeBSD exactly, and what build date?
Please provide output from "procstat -k -k 1666" (yes, two -k's).
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |