From: Rick Macklem
Date: Tue, 14 Nov 2023 13:30:25 -0800
Subject: Re: crash zfs_clone_range()
To: Konstantin Belousov
Cc: Mateusz Guzik, Alexander Motin, Ronald Klop, current@freebsd.org
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
References: <349700057.3452.1699611152405@localhost> <1900239445.5968.1699966796547@localhost>

On Tue, Nov 14, 2023 at 1:20 PM Konstantin Belousov wrote:
>
> On Tue, Nov 14, 2023 at 06:47:46PM +0100, Mateusz Guzik wrote:
> > On 11/14/23, Alexander Motin wrote:
> > > On 14.11.2023 12:39, Mateusz Guzik wrote:
> > >> One of the vnodes is probably not zfs, I suspect this will do it
> > >> (untested):
> > >>
> > >> diff --git a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> > >> index 107cd69c756c..e799a7091b8e 100644
> > >> --- a/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> > >> +++ b/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c
> > >> @@ -6270,6 +6270,11 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
> > >>  			goto bad_write_fallback;
> > >>  		}
> > >>  	}
> > >> +
> > >> +	if (invp->v_mount->mnt_vfc != outvp->v_mount->mnt_vfc) {
> > >> +		goto bad_write_fallback;
> > >> +	}
> > >> +
> > >>  	if (invp == outvp) {
> > >>  		if (vn_lock(outvp, LK_EXCLUSIVE) != 0) {
> > >>  			goto bad_write_fallback;
> > >>
> > >
> > > vn_copy_file_range() verifies for that:
> > >
> > >	/*
> > >	 * If the two vnodes are for the same file system type, call
> > >	 * VOP_COPY_FILE_RANGE(), otherwise call vn_generic_copy_file_range()
> > >	 * which can handle copies across multiple file system types.
> > >	 */
> > >	*lenp = len;
> > >	if (inmp == outmp || strcmp(inmp->mnt_vfc->vfc_name,
> > >	    outmp->mnt_vfc->vfc_name) == 0)
> > >		error = VOP_COPY_FILE_RANGE(invp, inoffp, outvp, outoffp,
> > >		    lenp, flags, incred, outcred, fsize_td);
> > >	else
> > >		error = vn_generic_copy_file_range(invp, inoffp, outvp,
> > >		    outoffp, lenp, flags, incred, outcred, fsize_td);
> > >
> > >
> >
> > The crash at hand comes from nullfs. If "outward" vnodes are both
> > nullfs, but only one underlying vnode is zfs, you get the above.
>
> If this is the reason, the check must be done by nullfs bypass for
> vop_copy_file_range().

I suppose this is a reasonable alternative, although it means that all
stacked file systems will need the check. It just seems easier to do it
in the actual VOPs, but it is up to others.
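
To illustrate why a check done only on the outer vnodes is not enough,
here is a small userland model, not kernel code. Every name in it
(model_mount, model_vnode, lowest_vnode(), model_vop_copy(),
model_copy_file_range()) is made up for the sketch, and it assumes a
nullfs-like layer simply hands its lower vnodes to the lower file
system's VOP. It only shows the shape of the argument, not how the
real bypass is implemented.

```c
/*
 * Userland model only: these structs and functions are hypothetical
 * stand-ins, not the kernel's struct vnode/mount or the real VOPs.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct model_mount {
	const char *fstype;		/* stands in for mnt_vfc->vfc_name */
};

struct model_vnode {
	struct model_mount *mp;		/* stands in for v_mount */
	struct model_vnode *lower;	/* non-NULL only for a stacked vnode */
};

enum copy_path { COPY_FS_SPECIFIC, COPY_GENERIC_FALLBACK };

/* The same test vn_copy_file_range() applies to the vnodes it is given. */
static int
same_fstype(const struct model_vnode *a, const struct model_vnode *b)
{
	return (a->mp == b->mp || strcmp(a->mp->fstype, b->mp->fstype) == 0);
}

/* Strip stacking layers until the vnode that really holds the data. */
static struct model_vnode *
lowest_vnode(struct model_vnode *vp)
{
	while (vp->lower != NULL)
		vp = vp->lower;
	return (vp);
}

/*
 * Model of a file-system-specific VOP that repeats the check itself,
 * so it stays safe when a stacked layer passes down a foreign vnode.
 */
static enum copy_path
model_vop_copy(struct model_vnode *invp, struct model_vnode *outvp)
{
	assert(invp != NULL && outvp != NULL);
	if (!same_fstype(invp, outvp))
		return (COPY_GENERIC_FALLBACK);
	return (COPY_FS_SPECIFIC);
}

/* Check the outer vnodes first, then bypass down to the lower ones. */
static enum copy_path
model_copy_file_range(struct model_vnode *invp, struct model_vnode *outvp)
{
	if (!same_fstype(invp, outvp))
		return (COPY_GENERIC_FALLBACK);
	return (model_vop_copy(lowest_vnode(invp), lowest_vnode(outvp)));
}
```

In this model, two stacked vnodes whose mounts are both "nullfs" pass
the outer test even when one lower vnode is "zfs" and the other is
"ufs". Only the second check inside model_vop_copy() sends that case
to the generic fallback.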

Btw, the code above the VOP_COPY_FILE_RANGE() call in
vn_copy_file_range() that busies the mounts and checks that mnt_vfc is
the same could be dropped, if the VOP_COPY_FILE_RANGE() implementations
(NFS, for example) were careful to lock the vnodes before doing a
"same fs type or same mount" check.
(I suppose that would be a subtle change in VOP semantics that is
arguably not allowed for a minor version.)

Anyhow, I am happy with whatever others decide.

rick