From: Konstantin Belousov
Date: Sat, 29 Jan 2022 06:37:35 +0200
To: peterj@freebsd.org
Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org
Subject: Re: bio re-ordering
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

On Sat, Jan 29, 2022 at 03:29:39PM +1100, peterj@freebsd.org wrote:
> I'm working on a GEOM Gate network client to better handle high-latency
> connections and have some questions regarding bio ordering assumptions
> (alternatively, how much should I be able to re-order bio requests without
> breaking things).  Within geom_gate, an incoming bio request is retrieved
> from the kernel using a G_GATE_CMD_START ioctl, processed in userland
> (typically by forwarding it to a remote system) and then returned via a
> G_GATE_CMD_DONE ioctl.  My GEOM Gate client can reorder requests quite
> aggressively and I suspect it's breaking some kernel assumptions regarding
> bio behaviour.  The following questions assume that BIO_READ, BIO_WRITE and
> BIO_FLUSH are valid but BIO_DELETE isn't supported.
>
> a) In the absence of BIO_FLUSH operations, what (if any) are the limits on
> reordering operations?
> Given a block that initially contains A, followed
> by a write B, read and write C, is there any constraint on which content
> the read returns?

There are no limits.  Either other software entities, or the hardware
itself, can process requests in arbitrary order.  This is why things are
typically done in the completion handler, and it is part of the reason why
the complexity of UFS SU exists.

>
> b) Are individual BIO_READ and BIO_WRITE operations expected to be atomic
> with respect to other BIO_WRITE operations?  Given 2 adjacent blocks that
> initially contain AB, and successive write CD, read and write EF
> operations to those blocks, is it expected that the read would return CD
> (or maybe AD or EF, assuming that's valid from the previous question) or
> could the write operations partially complete in different orders,
> resulting in something like AD, CF, EB etc?

No.  At the very least, underlying entities can split a request into
several, each of which is ordered individually.

Typically, it is higher-level code that ensures that there are no concurrent
modifications of the same block.  For instance, we exclusively lock vnodes
and buffers around metadata updates.  Similarly, we lock buffers until the
data is written to the device.

>
> c) I assume that a BIO_FLUSH should not return DONE until all preceding
> write operations have completed.  Is it required that write
> operations issued after the BIO_FLUSH must not complete before the
> BIO_FLUSH completes?

UFS SU relies on BIO_FLUSH being a full barrier.
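
To make the barrier rule concrete, here is a minimal sketch of a scheduler
that reorders requests freely between flushes but never lets a request cross
a BIO_FLUSH in either direction.  It is a userland model only: the Req
record, schedule() and respects_barriers() are hypothetical names made up
for illustration, not part of geom_gate or the kernel bio API, and it reads
"full barrier" conservatively as applying to reads as well as writes.

```python
import random
from collections import namedtuple

# Hypothetical request record for illustration; this is not struct bio.
# seq is the order in which the request was issued.
Req = namedtuple("Req", ["seq", "cmd", "block", "data"])


def schedule(requests, rng):
    """Return a completion order for requests given in issue order.

    Requests between two flushes may complete in any order; a FLUSH
    completes after everything issued before it and before everything
    issued after it.
    """
    completed, epoch = [], []
    for req in requests:
        if req.cmd == "FLUSH":
            rng.shuffle(epoch)        # arbitrary order inside one epoch
            completed.extend(epoch)
            completed.append(req)     # the flush closes the epoch
            epoch = []
        else:
            epoch.append(req)
    rng.shuffle(epoch)                # trailing requests after the last flush
    completed.extend(epoch)
    return completed


def respects_barriers(completed):
    """Check that no request crossed a FLUSH in the completion order."""
    for i, req in enumerate(completed):
        if req.cmd != "FLUSH":
            continue
        if any(r.seq > req.seq for r in completed[:i]):
            return False              # a later request completed too early
        if any(r.seq < req.seq for r in completed[i + 1:]):
            return False              # an earlier request completed too late
    return True


if __name__ == "__main__":
    issued = [
        Req(0, "WRITE", 7, "B"),
        Req(1, "READ", 7, None),
        Req(2, "FLUSH", None, None),
        Req(3, "WRITE", 7, "C"),
        Req(4, "READ", 7, None),
    ]
    done = schedule(issued, random.Random())
    print([r.seq for r in done], respects_barriers(done))
```

Within one epoch the read of block 7 may complete before or after the write
to the same block, which matches the answer to (a): without a flush there is
no ordering to rely on, so the issuer has to serialize conflicting requests
itself.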