ZFS deadlock?

Mon Aug 18 08:20:55 UTC 2014

Bengt Ahlgren <bengta at sics.se> writes:

> During a copy (zfs send/recv) of a ~1TB dataset from one zpool to
> another, my system seems to run into some issues.  A simultaneous "find"
> on the source data set deadlocks.  This is the kernel stack:
>
> $ procstat -kk 1786
>   PID    TID COMM             TDNAME           KSTACK                       
>  1786 101344 find             -                mi_switch+0x194 sleepq_wait+0x42 _cv_wait+0x112 zio_wait+0x61 dbuf_read+0x619 dmu_buf_hold+0xe0 zap_get_leaf_byblk+0x4a zap_deref_leaf+0x68 fzap_cursor_retrieve+0xe7 zap_cursor_retrieve+0x155 zfs_freebsd_readdir+0x2d8 VOP_READDIR_APV+0x78 kern_getdirentries+0x212 sys_getdirentries+0x23 amd64_syscall+0x5ea Xfast_syscall+0xf7 
>
> The zfs send/recv has become very slow, although it still seems to make
> some progress (the copy is from p0 to p2):
>
> p0          15.9T  2.20T    318      0  10.2M      0
> p1          11.1T  7.00T      0      0      0      0
> p2          2.55T  41.0T      0      0      0      0
> ----------  -----  -----  -----  -----  -----  -----
> p0          15.9T  2.20T    294      0  9.29M      0
> p1          11.1T  7.00T      0      0      0      0
> p2          2.55T  41.0T      0      0      0      0
> ----------  -----  -----  -----  -----  -----  -----
> p0          15.9T  2.20T    307      0  9.12M      0
> p1          11.1T  7.00T      0      0      0      0
> p2          2.55T  41.0T      0      0      0      0
> ----------  -----  -----  -----  -----  -----  -----
> p0          15.9T  2.20T    293      0  8.69M      0
> p1          11.1T  7.00T      0      0      0      0
> p2          2.55T  41.0T      0     58      0  1.61M
> ----------  -----  -----  -----  -----  -----  -----
> p0          15.9T  2.20T    301      0  10.9M      0
> p1          11.1T  7.00T      0      0      0      0
> p2          2.55T  41.0T      0  1.62K      0  49.6M
> ----------  -----  -----  -----  -----  -----  -----
>
> The machine is otherwise quite idle.  When the copy started, I got
> around 200MB/s, now it's around 10MB/s.
>
> The ARC has gotten large, but that is likely normal:
>
> last pid:  1863;  load averages:  0.20,  0.33,  0.63    up 0+02:27:44  16:31:52
> 50 processes:  1 running, 49 sleeping
> CPU:  0.0% user,  0.0% nice,  0.2% system,  0.0% interrupt, 99.8% idle
> Mem: 1688M Active, 61M Inact, 107G Wired, 3288K Cache, 126M Buf, 15G Free
> ARC: 99G Total, 2483M MFU, 89G MRU, 33M Anon, 888M Header, 7427M Other
> Swap: 128G Total, 128G Free
>
>   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
>  1229 root          1  20    0 39700K  3292K piperd  7  24:27   1.07% zfs
>  1228 root          2  20    0 39832K  3420K nanslp  5  17:02   0.39% zfs
> ...
>
> The source pool is fairly full; could that be an issue?
>
> $ zpool list
> NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
> p0    18.1T  15.9T  2.20T    87%  1.00x  ONLINE  -
> p1    18.1T  11.1T  7.00T    61%  1.00x  ONLINE  -
> p2    43.5T  2.53T  41.0T     5%  1.00x  ONLINE  -
>
> The machine is running 9.3-REL and has two mps controllers.
>
> Any ideas?

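Just for the record: there was no deadlock after all.  The apparent hang
turned out to be caused by a directory with ~4.5M entries.  The "find"
was presumably just spending a very long time in readdir on that
directory, which matches the zfs_freebsd_readdir frame in the stack above.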
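
For anyone who hits something similar, here is a minimal sketch of one way
to look for oversized directories under a mount point.  It is not what I
ran at the time; the function name large_directories and the default
threshold are made up for illustration.  It streams entries with
os.scandir instead of building a full listing, so memory use stays small
even on a directory with millions of entries, though the walk itself will
still be slow on such a directory.

```python
import os


def large_directories(root, threshold=100_000):
    """Yield (path, entry_count) for every directory under root that
    holds at least `threshold` entries.

    Hypothetical helper: the name and the default threshold are
    illustrative assumptions, not taken from any existing tool.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        count = 0
        try:
            # os.scandir returns an iterator, so entries are counted
            # one at a time rather than loaded into a list first.
            with os.scandir(current) as entries:
                for entry in entries:
                    count += 1
                    # Descend into real subdirectories only; symlinks
                    # are not followed, to avoid loops.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Unreadable or vanished directory: skip it.
            continue
        if count >= threshold:
            yield current, count
```

To use it, iterate over large_directories("/some/mountpoint") and print
each (path, count) pair; the mount point here is a placeholder.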

Bengt