[Bug 260453] ZFS truncated write to O_APPEND file from mmap'ed memory
Date: Wed, 15 Dec 2021 22:09:55 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260453

Mark Johnston <markj@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |Open
                 CC|                            |markj@FreeBSD.org

--- Comment #1 from Mark Johnston <markj@FreeBSD.org> ---
I dug into this a little bit.  The program creates a file and writes some
data via write(2) (so data does not appear in the page cache, only in the
DMU).  Immediately after, the file is mapped for reading and data from the
mapping is written to a different file, and I see:

  1988 aardwarc CALL  lseek(0x6,0,SEEK_END)
  1988 aardwarc RET   lseek 2378/0x94a
  1988 aardwarc CALL  writev(0x6,0x7fffffffe360,0x2)
  1988 aardwarc PFLT  0x8002634cb 0x1<VM_PROT_READ>
  1988 aardwarc PRET  KERN_PROTECTION_FAILURE
  1988 aardwarc GIO   fd 6 wrote 479 bytes
       <file data>
  1988 aardwarc RET   writev 479/0x1df
  1988 aardwarc CALL  lseek(0x6,0,SEEK_END)
  1988 aardwarc RET   lseek 2570/0xa0a

So we get a page fault while reading from the mapping, which is expected,
and dmu_write_uio_dbuf() returns EFAULT, which is the magic signal for
vn_io_fault1() to retry after wiring the mapping.

Some tracing indicates that dmu_write_uio_dbuf() does manage to write some
data to the file before hitting EFAULT.  In fact, the amount of data written
in the first try is exactly 479 - (2570 - 2378) bytes.  I think the bug is
that the EFAULT causes this bit of code in zfs_write() to be skipped:

        /*
         * Update the file size (zp_size) if it has changed;
         * account for possible concurrent updates.
         */
        while ((end_size = zp->z_size) < zfs_uio_offset(uio)) {
                (void) atomic_cas_64(&zp->z_size, end_size,
                    zfs_uio_offset(uio));
                ASSERT(error == 0);
        }

z_size contains the file size returned by VOP_GETATTR(), used to provide the
return value for lseek(SEEK_END).

But... this bug appears to exist in main too.  So how does this work at all?
Note, part of the weirdness here comes from the fact that some of the input
file data is written in the first try.  I would expect dmu_write_uio_dbuf()
to return EFAULT without having written anything.

-- 
You are receiving this mail because:
You are the assignee for the bug.
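
For anyone wanting to check the symptom outside of aardwarc, the sequence described in the comment can be sketched roughly as below. This is only an illustration, not the reporter's program: the helper name append_from_mapping(), the file paths, the fill byte and the way the mapping is split across two iovecs are all assumptions. It writes a source file with write(2), maps it read-only, appends from the untouched mapping with writev(2), and compares the growth seen by lseek(SEEK_END) with writev's return value. In the trace above those two numbers disagree: writev reports 479 bytes, but the size grows by only 2570 - 2378 = 192, which leaves 479 - 192 = 287 bytes for the first try.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from the bug report).
 * Returns 0 if the size reported by lseek(SEEK_END) grew by exactly the
 * byte count writev() returned, 1 if the two disagree (the symptom in the
 * trace), and -1 on a setup or I/O error.
 */
int
append_from_mapping(const char *src_path, const char *dst_path, size_t len)
{
	char buf[4096];
	struct iovec iov[2];
	void *map = MAP_FAILED;
	off_t before, after;
	ssize_t n;
	int src = -1, dst = -1, result = -1;

	if (len < 2 || len > sizeof(buf))
		return (-1);
	memset(buf, 'x', len);

	/* Populate the source with write(2), not through a mapping. */
	src = open(src_path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (src == -1 || write(src, buf, len) != (ssize_t)len)
		goto out;

	/* Map it for reading; the pages are deliberately not touched yet. */
	map = mmap(NULL, len, PROT_READ, MAP_SHARED, src, 0);
	if (map == MAP_FAILED)
		goto out;

	dst = open(dst_path, O_WRONLY | O_CREAT | O_APPEND, 0600);
	if (dst == -1)
		goto out;
	before = lseek(dst, 0, SEEK_END);

	/* Two iovecs, as in the trace; the split point is an assumption. */
	iov[0].iov_base = map;
	iov[0].iov_len = len / 2;
	iov[1].iov_base = (char *)map + len / 2;
	iov[1].iov_len = len - len / 2;
	n = writev(dst, iov, 2);
	after = lseek(dst, 0, SEEK_END);

	if (before != -1 && after != -1 && n >= 0)
		result = (after - before == (off_t)n) ? 0 : 1;
out:
	if (map != MAP_FAILED)
		munmap(map, len);
	if (dst != -1)
		close(dst);
	if (src != -1)
		close(src);
	return (result);
}
```

Calling append_from_mapping("src.bin", "dst.bin", 479) with both paths on the dataset under test should return 0 on an unaffected filesystem. A return value of 1 would match the mismatch in the trace, although whether this sketch actually triggers the fault path on a given system has not been verified here.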