Date: Sun, 17 Nov 2024 04:10:52 +0200
From: Konstantin Belousov
To: Andriy Gapon
Cc: freebsd-fs
Subject: Re: tmpfs loses (sub-page chunks of) data?
In-Reply-To: <140eb994-ae19-41b4-8f0e-fc4290603ce0@freebsd.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

On Sat, Nov 16, 2024 at 02:31:25PM +0200, Andriy Gapon wrote:
> On 16/11/2024 00:24, Konstantin Belousov wrote:
> > On Fri, Nov 15, 2024 at 02:43:22PM +0200, Andriy Gapon wrote:
> > >
> > > We have a number of servers based on FreeBSD 13.3 that initially write some
> > > data to files on tmpfs and then the files are dispatched elsewhere. The
> > > writes are done by appending variable sized records to a file. There are no
> > > seeks or overwrites.
> > >
> > > I observe that occasionally (very rarely indeed given the amount of data
> > > produced) we get a corrupted file.
> > >
> > > In all cases so far the corruption follows the same pattern: data range from
> > > the end of a record until the next page-aligned boundary is zeroed out.
> > > That is, good data always continues from an offset which is multiple of 4096
> > > and the zeroed area never crosses such offsets.
> > >
> > > Because of the page boundary, I have a suspicion that either tmpfs or,
> > > perhaps, the broader VM subsystem might have a race where writing to a page
> > > does not mark it dirty. Maybe this is related to paging out of a tmpfs page
> > > to the swap.
> > >
> > > The problem is that I have never been able to observe this happening, the
> > > corruption gets detected after the fact, hours after it occurs.
> > >
> > > If anyone could suggest any areas / changes / techniques to explore the
> > > problem, I would be much obliged.
> >
> > Do you have swap enabled on the problematic machines?
>
> Yes.
>
> > Are the files mapped, do you write or read through map?
>
> I think that mmap is not involved at all.
> Files are written from kernel using kern_writev().
Why?

> After they are complete, they are read from userland using whatever libssl
> uses to read input files (when encrypting). Looks like that's fread(3) /
> read(2).
Try to reproduce it on the HEAD kernel. Then, turn off swap and see if
it is still there. On HEAD, if reproducible without swap, try adding the
'pgread' option to the tmpfs mount.

I am interested in whether the issue persists after all the measures listed
above.
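
To catch the described pattern closer to the time it occurs while running
these experiments, completed files can be scanned for zero runs that end
exactly on a page boundary. A minimal sketch, assuming a 4096-byte page size
and that valid records do not themselves end in long zero runs; the name
find_zero_tails and the min_run threshold are hypothetical, chosen only for
illustration:

```python
PAGE_SIZE = 4096  # assumed page size, matching the 4096-byte offsets in the report


def find_zero_tails(data: bytes, page_size: int = PAGE_SIZE, min_run: int = 16):
    """Return (start, end) byte ranges of zeros that end on a page boundary.

    Each reported range lies within a single page and stops at the next
    page-aligned offset. Runs shorter than min_run bytes are ignored to
    reduce false positives from legitimate trailing zeros in a record.
    """
    hits = []
    for end in range(page_size, len(data) + 1, page_size):
        page_start = end - page_size
        page = data[page_start:end]
        # Length of the page once trailing zero bytes are stripped.
        kept = len(page.rstrip(b"\0"))
        if page_size - kept >= min_run:
            hits.append((page_start + kept, end))
    return hits
```

Running it over open(path, "rb").read() for each file before it is dispatched
gives the offsets to compare against record boundaries. A page that is zero
from start to end is also reported, so hits still need to be checked against
the record format.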