Re: ZFS deadlock in 14
Date: Tue, 22 Aug 2023 18:24:00 UTC
Alexander Motin <mav_at_FreeBSD.org> wrote on
Date: Tue, 22 Aug 2023 16:18:12 UTC :

> I am waiting for final test results from George Wilson and then will
> request quick merge of both to zfs-2.2-release branch. Unfortunately
> there are still not many reviewers for the PR, since the code is not
> trivial, but at least with the test reports Brian Behlendorf and Mark
> Maybee seem to be OK to merge the two PRs into 2.2. If somebody else
> have tested and/or reviewed the PR, you may comment on it.

I had written to the list that when I tried to test the system
doing poudriere builds (initially with your patches) using
USE_TMPFS=no, so that zfs had to deal with all the file I/O,
only one builder ended up active. The others never reached
"Builder started":

[00:01:34] [01] [00:00:00] Builder starting
[00:01:57] [01] [00:00:23] Builder started
[00:01:57] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.4
[00:03:09] [01] [00:01:12] Finished ports-mgmt/pkg | pkg-1.20.4: Success
[00:03:21] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
[00:03:21] [02] [00:00:00] Builder starting
[00:03:21] [03] [00:00:00] Builder starting
[00:03:21] [04] [00:00:00] Builder starting
[00:03:21] [05] [00:00:00] Builder starting
[00:03:21] [06] [00:00:00] Builder starting
[00:03:21] [07] [00:00:00] Builder starting
[00:03:22] [08] [00:00:00] Builder starting
[00:03:22] [09] [00:00:00] Builder starting
[00:03:22] [10] [00:00:00] Builder starting
[00:03:22] [11] [00:00:00] Builder starting
[00:03:22] [12] [00:00:00] Builder starting
[00:03:22] [13] [00:00:00] Builder starting
[00:03:22] [14] [00:00:00] Builder starting
[00:03:22] [15] [00:00:00] Builder starting
[00:03:22] [16] [00:00:00] Builder starting
[00:03:22] [17] [00:00:00] Builder starting
[00:03:22] [18] [00:00:00] Builder starting
[00:03:22] [19] [00:00:00] Builder starting
[00:03:22] [20] [00:00:00] Builder starting
[00:03:22] [21] [00:00:00] Builder starting
[00:03:22] [22] [00:00:00] Builder starting
[00:03:22] [23] [00:00:00] Builder starting
[00:03:22] [24] [00:00:00] Builder starting
[00:03:22] [25] [00:00:00] Builder starting
[00:03:22] [26] [00:00:00] Builder starting
[00:03:22] [27] [00:00:00] Builder starting
[00:03:22] [28] [00:00:00] Builder starting
[00:03:22] [29] [00:00:00] Builder starting
[00:03:22] [30] [00:00:00] Builder starting
[00:03:22] [31] [00:00:00] Builder starting
[00:03:22] [32] [00:00:00] Builder starting
[00:03:30] [01] [00:00:09] Finished print/indexinfo | indexinfo-0.3.1: Success
[00:03:31] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22
. . .

Top was showing lots of "vlruwk" for the cpdup's. For example:

. . .
  362  0 root  40  0  27076Ki  13776Ki CPU19  19  4:23  0.00% cpdup -i0 -o ref 32
  349  0 root  53  0  27076Ki  13776Ki vlruwk 22  4:20  0.01% cpdup -i0 -o ref 31
  328  0 root  68  0  27076Ki  13804Ki vlruwk  8  4:30  0.01% cpdup -i0 -o ref 30
  304  0 root  37  0  27076Ki  13792Ki vlruwk  6  4:18  0.01% cpdup -i0 -o ref 29
  282  0 root  42  0  33220Ki  13956Ki vlruwk  8  4:33  0.01% cpdup -i0 -o ref 28
  242  0 root  56  0  27076Ki  13796Ki vlruwk  4  4:28  0.00% cpdup -i0 -o ref 27
. . .

But those processes did show CPU?? on occasion, as well as
*vnode less often. None of the cpdup's was stuck in

Removing your patches did not change the behavior.

So far I've not seen any reports similar to these results,
which I got on the ThreadRipper 1950X that I have access to.
I normally use USE_TMPFS=all, but that hides the problem, which
is why I've no clue when the behavior would have started if I'd
been using USE_TMPFS=no instead.

I never got so far as testing for the kinds of reports I've
seen about the deadlock issue. No one has commented on what I
reported or said whether they have done any USE_TMPFS=no style
of testing. (I also use ALLOW_MAKE_JOBS=yes .)

The ZFS context is a simple single partition context. I use
ZFS in order to use bectl BE's, not for other reasons.

===
Mark Millard
marklmi at yahoo.com