From: Rick Macklem
Date: Thu, 4 Apr 2024 13:56:31 -0700
Subject: Re: SEEK_HOLE at EOF
To: Alan Somers
Cc: FreeBSD Hackers
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

On Thu, Apr 4, 2024 at 11:15 AM Alan Somers wrote:
>
> tldr; there are two problems:
> 1) tmpfs handles SEEK_HOLE differently than other file systems
> 2) everything else handles SEEK_HOLE at EOF poorly, IMHO
>
> Details:
>
> According to lseek(2), SEEK_HOLE should return the start of the next
> hole greater than or equal to the supplied offset.
> Also, each file
> has a zero-sized virtual hole at the very end of the file. So I would
> expect that calling SEEK_HOLE at EOF would return the file's size.
> However, the man page also says that SEEK_HOLE will return ENXIO when
> the offset points to EOF. Those two statements seem contradictory to
> me. The first behavior seems more logical. I would expect SEEK_HOLE
> to work the same way both at EOF and at any other file offset.
>
> What does the spec say?
>
> There is no POSIX standard for this. It was invented by Solaris.
> Illumos's man page does not clearly say what should happen at EOF.
> Linux's man page is clear about when ENXIO applies: "whence is
> SEEK_DATA or SEEK_HOLE, and offset is beyond the end of the file".
> That would seem to indicate behavior 1: SEEK_HOLE should return the
> file's size at EOF. Only beyond EOF should it return ENXIO.

Well, there is the Austin Group stuff (never ratified by POSIX as I
understand it).
Here's what it says about SEEK_HOLE and offset:

If whence is SEEK_HOLE, the file offset shall be set to the smallest
location of a byte within a hole and not less than offset, except that
if offset falls within the last hole, then the file offset may be set
to the file size instead. It shall be an error if offset is greater or
equal to the size of the file.

I'd suggest we follow this, since it is the closest to a standard that
there is.

rick

>
> But what do other implementations do?
>
> Contrary to its man page, Linux behaves mostly like FreeBSD. SEEK_HOLE
> returns ENXIO at EOF on most file systems. I tested a number of file
> systems on both FreeBSD and Linux. Most of them return ENXIO. The
> only two outliers are FreeBSD's tmpfs and Linux's NFS client.
>
>            FreeBSD      Linux
> =======    =========    =====
> UFS        ENXIO
> ZFS        ENXIO
> tmpfs      file size    ENXIO
> msdosfs    ENXIO        ENXIO
> ext2fs     ENXIO        ENXIO
> xfs                     ENXIO
> tarfs      ENXIO
> nfs        ENXIO        file size
>
> So what should we change?
> Clearly, it's bad for tmpfs to be
> inconsistent. My preference would be for everything to behave like
> tmpfs, but it's currently losing the popularity contest. Anybody else
> have thoughts?
>
> -Alan
>
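
In case it helps the discussion, here is a rough sketch in C of the
Austin Group wording quoted above, written as a pure function so the
EOF case is easy to see. It is only a model under stated assumptions:
struct hole and model_seek_hole() are made-up names for illustration
(not anything in the tree), the hole list is assumed to be sorted and
non-overlapping, and the optional "may be set to the file size" case
for the last hole is not taken.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical description of one explicit hole: bytes [start, end). */
struct hole {
	off_t start;
	off_t end;
};

/*
 * Model of the quoted SEEK_HOLE rule.  Returns the new file offset, or
 * a negated errno value on failure.  "holes" must be sorted by start,
 * non-overlapping, and lie entirely inside [0, size).
 */
off_t
model_seek_hole(off_t size, const struct hole *holes, size_t nholes,
    off_t offset)
{
	size_t i;

	if (offset < 0)
		return (-EINVAL);
	/* "It shall be an error if offset is greater or equal to the size." */
	if (offset >= size)
		return (-ENXIO);

	for (i = 0; i < nholes; i++) {
		if (offset < holes[i].end) {
			/* Inside this hole: stay put.  Before it: jump to its start. */
			return (offset > holes[i].start ? offset : holes[i].start);
		}
	}

	/* No explicit hole at or after offset: the virtual hole at EOF. */
	return (size);
}
```

With a 16384-byte file and a single hole at [4096, 8192), offset 0
gives 4096, offset 5000 gives 5000, offset 8192 gives 16384 (the
virtual hole at EOF), and offset 16384 gives ENXIO. The last case is
the one where the table shows FreeBSD's tmpfs and Linux's NFS client
returning the file size instead.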