Re: ZFS operations hanging, but no visible errors?
- In reply to: Andriy Gapon : "Re: ZFS operations hanging, but no visible errors?"
Date: Thu, 11 Nov 2021 13:27:42 UTC
Following up on a new hang this same system had (see yesterday's freebsd-fs mail with the subject "swap_pager: cannot allocate bio"), I think the same problem might have occurred again. Certainly the system got stuck again.

Based on the below, my executing that dtrace command caused the system to report "ACPI Error: AE_NO_MEMORY". In what way is the system out of memory here? And does that failure running dtrace suggest that the "out of memory" problem is the core problem causing the ZFS hang in the first place?

My system has 128GB, which is nothing to sneeze at. Are there parameters that I should change because the normal parameters just don't work well with a pool or fs this large?

And, from earlier in this thread from last week:

Now that I have the system running again, I can provide the "zpool status" for information. Let me know if I've just tried something crazy here; this is the largest ZFS filesystem I've attempted. I have a 30T pool on another system without issue, and with less RAM. (The largest fs on that pool is about 18T.)

% zpool status
  pool: tank
 state: ONLINE
  scan: scrub repaired 0B in 05:05:55 with 0 errors on Sat Oct 23 04:38:36 2021
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da1     ONLINE       0     0     0
          raidz1-1  ONLINE       0     0     0
            da4     ONLINE       0     0     0
            da5     ONLINE       0     0     0
            da6     ONLINE       0     0     0
          raidz1-2  ONLINE       0     0     0
            da7     ONLINE       0     0     0
            da8     ONLINE       0     0     0
            da9     ONLINE       0     0     0

errors: No known data errors

% zfs list tank
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank  14.2T  35.0T  14.2T  /tank

- Chris

> On Nov 7, 2021, at 03:35, Andriy Gapon <avg@freebsd.org> wrote:
>
> On 05/11/2021 18:59, Chris Ross wrote:
>> Running procstat -kk on the rsync that was hung, then killed, then SIGKILL'd shows:
>> procstat -kk 35220
>>   PID    TID COMM   TDNAME  KSTACK
>> 35220 102499 rsync  -       mi_switch+0xc1 _sleep+0x1cb vm_wait_doms+0xe2 vm_wait_domain+0x51 vm_domain_alloc_fail+0x86 vm_page_alloc_domain_after+0x7e uma_small_alloc+0x58 keg_alloc_slab+0xba zone_import+0xee zone_alloc_item+0x6f abd_alloc_chunks+0x61 abd_alloc+0x102 arc_hdr_alloc_abd+0xb0 arc_hdr_alloc+0x11e arc_read+0x4f4 dbuf_issue_final_prefetch+0x108 dbuf_prefetch_impl+0x3d0 dmu_zfetch+0x558
>
> Looks like the system is out of memory.
> It seems that you already established that.
>
> --
> Andriy Gapon