From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 276002] nfscl: data corruption using both copy_file_range and mmap'd I/O
Date: Mon, 01 Jan 2024 21:55:10 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 15.0-CURRENT
X-Bugzilla-Who: rmacklem@FreeBSD.org
X-Bugzilla-Status: New
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276002

Rick Macklem changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |geoffrey@dommett.com

--- Comment #34 from Rick Macklem ---
Ok, here is my understanding of what can currently happen.
Hopefully Kostik will correct me if I have this wrong.

#1 - File is open(2)'d.
#2 - A byte range (let's say the first 100Mbytes) is mmap(2)'d into
     the address space.
#3 - Some addresses within this address space are modified by the
     process, dirtying the corresponding pages.
#4 - File is read(2) sequentially.

Now, when #4 happens, there will be read-aheads done by the nfsiod
threads. These simply do Read RPCs against the NFS server to read
the byte ranges of the file into the buffer cache blocks. They are
done asynchronously and without any vnode lock.
--> At this time, I do not see anything that stops these read-aheads
    from filling the buffer cache blocks/pages with the NFS server's
    now stale data.

Now, I thought adding a msync(2) with MS_SYNC between #3 and #4 would
be sufficient to cause the pages dirtied by #3 to be written to the
NFS server (via VOP_PUTPAGES(), which is ncl_putpages()).
I believe that an fsync(2) between #3 and #4 will also write the
dirtied pages to the NFS server.

Without either a msync(2) or fsync(2) between #3 and #4, what could
be done to make this work?
- Don't do read-ahead. This would be a major performance hit and is
  imho a non-starter.
- Don't do read-ahead when a file is mmap(2)'d. This sounds better,
  since it will be a rare case that a file is both mmap(2)'d and
  read via read(2) syscalls.
  --> To do this, the NFS client needs to know if the file has been
      mmap(2)'d. A flag could be set on the vnode when the file is
      mmap(2)'d and that flag can be checked by the NFS client.
  --> The problem is when can the flag be cleared?
      My recollection from a previous round of discussing this
      is...not until all the process(es) that mmap(2)'d the file
      exit. (I cannot recall if the vnode's v_usecount going to 0
      is sufficient.)
- Have some way for the nfsiod threads to check whether there are
  dirty pages related to the buffer cache block and write those back
  to the NFS server before doing the read. (Recall that the buffer
  cache block will be quite a few pages, typically 128K to 1Mbyte in
  size.)
  --> This could be done by having the nfsiod thread LK_EXCLUSIVE
      lock the vnode, but that would be a major performance hit, as
      well.

That's as far as I've gotten in previous discussions about this.

Note that this PR started with a specific problem related to
copy_file_range(2) and that has been fixed (or kib@'s patch will fix
it when committed).
The more general case as above, well??
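
For reference, the userland ordering for steps #1 to #4 with a msync(2) using MS_SYNC placed between #3 and #4 would look roughly like the sketch below. The helper name mmap_write_then_read() and its arguments are made up for illustration and are not from the PR. It only shows the call sequence; it does not demonstrate or verify how the NFS client behaves.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the PR): store len bytes from src at the
 * start of the file through a shared mapping, flush them with
 * msync(MS_SYNC), then read the same range back with read(2) into out.
 * Returns 0 on success, -1 on any failure.
 */
int
mmap_write_then_read(const char *path, const void *src, size_t len, void *out)
{
	char *p;
	size_t done;
	ssize_t n;
	int fd;

	if (len == 0)
		return (-1);

	/* #1 - open the file and make sure it covers the range. */
	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd == -1)
		return (-1);
	if (ftruncate(fd, (off_t)len) == -1) {
		close(fd);
		return (-1);
	}

	/* #2 - map the byte range shared, so stores reach the file. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return (-1);
	}

	/* #3 - modify the mapping, dirtying the corresponding pages. */
	memcpy(p, src, len);

	/* Between #3 and #4: synchronously write the dirty pages back. */
	if (msync(p, len, MS_SYNC) == -1) {
		munmap(p, len);
		close(fd);
		return (-1);
	}

	/* #4 - read the file sequentially with read(2), from offset 0. */
	for (done = 0; done < len; done += (size_t)n) {
		n = read(fd, (char *)out + done, len - done);
		if (n <= 0) {
			munmap(p, len);
			close(fd);
			return (-1);
		}
	}

	munmap(p, len);
	close(fd);
	return (0);
}
```

Replacing the msync(2) call with fsync(fd) gives the other variant mentioned above. To try it against the case discussed here, path would need to name a file on an NFS mount.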