livelock in vfs_bio_getpages with vn_io_fault_uiomove
Date: Thu, 30 Sep 2021 21:11:02 UTC
I'm trying to adapt fusefs to use vn_io_fault_uiomove to fix the deadlock described in the comments above vn_io_fault_doio [^1]. I can reproduce the deadlock readily enough, and the fix seems simple enough in other filesystems [^2][^3]. But when I try to apply the same fix to fusefs, the deadlock changes into a livelock.

vfs_bio_getpages loops infinitely because it reaches the "redo = true" state. But on looping, it never attempts to read from fusefs again. Instead, breadn_flags returns 0 without ever calling bufstrategy, from which I infer that getblkx returned a block with B_CACHE set. Despite that, at least one of the requested pages in vfs_bio_getpages fails the vm_page_all_valid(ma[i]) check.

Debugging further is wandering outside my areas of expertise. Could somebody please give me a tip? What is supposed to mark those pages as valid? Are there any other undocumented conditions needed to use vn_io_fault_uiomove that msdosfs and nfscl just happened to already meet?

Grateful for any help,
-Alan

[^1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=238340
[^2] https://github.com/freebsd/freebsd-src/commit/2aa3944510b50cbe6999344985a5a9c3208063b2
[^3] https://github.com/freebsd/freebsd-src/commit/ddfc47fdc98460b757c6d1dbe4562a0a339f228b
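
P.S. To make the symptom concrete, the loop described above can be reduced to a user-space toy model. This is only a sketch of the reported behaviour, not the actual kernel code: the toy_* types and functions are hypothetical stand-ins, "cached" plays the role of B_CACHE, "all_valid" plays the role of the vm_page_all_valid(ma[i]) check, and the pass limit exists only so the demo terminates. It assumes, as inferred above, that a cached buffer is returned without any I/O and that nothing else marks the page valid.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a buffer and a VM page; not kernel types. */
struct toy_buf {
	bool cached;		/* models B_CACHE */
};

struct toy_page {
	bool all_valid;		/* models the vm_page_all_valid() result */
};

struct toy_stats {
	int passes;		/* trips around the retry loop */
	int strategy_calls;	/* times the "filesystem" was asked to read */
};

/*
 * Models the read path as observed: a cached buffer is returned with
 * status 0 and no I/O. In this model only a real read marks the page valid.
 */
static int
toy_bread(struct toy_buf *bp, struct toy_page *m, struct toy_stats *st)
{
	if (bp->cached)
		return (0);
	st->strategy_calls++;
	bp->cached = true;
	m->all_valid = true;
	return (0);
}

/*
 * Models the retry loop: go around again while the page is not fully
 * valid. Returns true if the page became valid, false if max_passes
 * was reached first (the livelock case, which would spin forever
 * without the limit).
 */
static bool
toy_getpages(struct toy_buf *bp, struct toy_page *m, struct toy_stats *st,
    int max_passes)
{
	bool redo;

	assert(max_passes > 0);
	do {
		st->passes++;
		if (!m->all_valid)
			(void)toy_bread(bp, m, st);
		redo = !m->all_valid;	/* the "redo = true" state */
	} while (redo && st->passes < max_passes);
	return (!redo);
}
```

Starting from a cached buffer and an invalid page, toy_getpages hits the pass limit without ever calling the read path, which matches what I see. Starting from an uncached buffer, it finishes in one pass. So the question is what, in the real code, is supposed to break that first combination.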