Re: git: a4e4132fa3bf - main - swapoff(2): replace special device name argument with a structure
Date: Sun, 05 Dec 2021 20:01:13 UTC
On 5 Dec 2021, at 19:53, Konstantin Belousov <kostikbel@gmail.com> wrote: > On Sun, Dec 05, 2021 at 01:38:52PM -0600, Mike Karels wrote: >> On 5 Dec 2021, at 12:56, Jessica Clarke wrote: >> >>> On 5 Dec 2021, at 18:51, Konstantin Belousov <kostikbel@gmail.com> wrote: >>>> On Sun, Dec 05, 2021 at 05:14:54PM +0000, Jessica Clarke wrote: >>>>> On 5 Dec 2021, at 13:22, Konstantin Belousov <kostikbel@gmail.com> wrote: >>>>>> >>>>>> On Sun, Dec 05, 2021 at 03:03:26AM +0000, Jessica Clarke wrote: >>>>>>> On 4 Dec 2021, at 22:21, Konstantin Belousov <kib@FreeBSD.org> wrote: >>>>>>>> >>>>>>>> The branch main has been updated by kib: >>>>>>>> >>>>>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=a4e4132fa3bfadb6047fc0fa5f399f4640460300 >>>>>>>> >>>>>>>> commit a4e4132fa3bfadb6047fc0fa5f399f4640460300 >>>>>>>> Author: Konstantin Belousov <kib@FreeBSD.org> >>>>>>>> AuthorDate: 2021-11-29 16:26:31 +0000 >>>>>>>> Commit: Konstantin Belousov <kib@FreeBSD.org> >>>>>>>> CommitDate: 2021-12-04 22:20:58 +0000 >>>>>>>> >>>>>>>> swapoff(2): replace special device name argument with a structure >>>>>>>> >>>>>>>> For compatibility, add a placeholder pointer to the start of the >>>>>>>> added struct swapoff_new_args, and use it to distinguish old vs. new >>>>>>>> style of syscall invocation. >>>>>>>> >>>>>>>> Reviewed by: markj >>>>>>>> Discussed with: alc >>>>>>>> Sponsored by: The FreeBSD Foundation >>>>>>>> MFC after: 1 week >>>>>>>> Differential revision: https://reviews.freebsd.org/D33165 >>>>>>>> --- >>>>>>>> sys/vm/swap_pager.c | 27 +++++++++++++++++++++++++-- >>>>>>>> sys/vm/swap_pager.h | 8 ++++++++ >>>>>>>> 2 files changed, 33 insertions(+), 2 deletions(-) >>>>>>>> >>>>>>>> diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c >>>>>>>> index 165373d1b527..dc1df79f4fcd 100644 >>>>>>>> --- a/sys/vm/swap_pager.c >>>>>>>> +++ b/sys/vm/swap_pager.c >>>>>>>> @@ -2491,15 +2491,38 @@ sys_swapoff(struct thread *td, struct swapoff_args *uap) >>>>>>>> struct vnode *vp; >>>>>>>> struct nameidata nd; >>>>>>>> struct swdevt *sp; >>>>>>>> - int error; >>>>>>>> + struct swapoff_new_args sa; >>>>>>>> + int error, probe_byte; >>>>>>>> >>>>>>>> error = priv_check(td, PRIV_SWAPOFF); >>>>>>>> if (error) >>>>>>>> return (error); >>>>>>>> >>>>>>>> + /* >>>>>>>> + * Detect old vs. new-style swapoff(2) syscall. The first >>>>>>>> + * pointer in the memory pointed to by uap->name is NULL for >>>>>>>> + * the new variant. >>>>>>>> + */ >>>>>>>> + probe_byte = fubyte(uap->name); >>>>>>>> + switch (probe_byte) { >>>>>>>> + case -1: >>>>>>>> + return (EFAULT); >>>>>>>> + case 0: >>>>>>>> + error = copyin(uap->name, &sa, sizeof(sa)); >>>>>>>> + if (error != 0) >>>>>>>> + return (error); >>>>>>>> + if (sa.flags != 0) >>>>>>>> + return (EINVAL); >>>>>>>> + break; >>>>>>>> + default: >>>>>>>> + bzero(&sa, sizeof(sa)); >>>>>>>> + sa.name = uap->name; >>>>>>>> + break; >>>>>>>> + } >>>>>>> >>>>>>> Doesn’t this change the semantics of swapoff("")? >>>>>>> >>>>>>> Previously it would fail deterministically, presumably with ENOENT or >>>>>>> something, but now it reinterprets whatever follows that string in >>>>>>> memory as the new argument structure. It probably doesn’t matter, but >>>>>>> this approach is ugly. Can we not just define a new syscall rather than >>>>>>> this kind of bodge? >>>>>> >>>>>> Having two swapoff() syscalls is worse, and having them only differ in >>>>>> semantic by single flag is kind of crime. >>>>>> >>>>>> I do not see swapoff("") as problematic, we are changing a minor semantic of >>>>>> the management syscall. I only wanted to avoid flag day for swapoff binaries. >>>>>> >>>>>> BTW, I considered requiring proper alignment for uap->name, and then checking >>>>>> the whole uap->name_old_syscall for NULL, but then decided that this is >>>>>> overkill. If you think that swapoff("") that important, I can add that >>>>>> additional verification. >>>>> >>>>> Why’s it worse? It’s just a syscall number, you deprecate the old one >>>>> and move on, we do that for things relatively regularly. This is really >>>>> not a good solution; harder to use as a caller since the prototype is >>>>> wrong, impossible to ensure you preserve the semantics for the existing >>>>> interface in all cases, and ugly to implement. You don’t need a flag >>>>> day for a new syscall, either, you can continue to only use the new >>>>> method for -f for a release and then switch over to the new syscall >>>>> entirely. Or switch over to the new syscall entirely now and fall back >>>>> on the old syscall if -f isn’t passed. Defining a new syscall also lets >>>>> you not need the name_old_syscall member in the struct, and gives you a >>>>> clean, fully-extensible syscall to which future features can be added >>>>> in a backwards-compatible way, rather than forever keeping around this >>>>> legacy mess. >>>> >>>> I disagree, it is not just a syscall number, it is whole user/kernel >>>> interface that bloats, which means cognitive efforts from anybody using >>>> this interfaces, and for which we must maintain ABI compatibility. >>> >>> Which is just as true of this approach; you have the same two >>> interfaces here, just smashed together into a single harder-to-use >>> syscall rather than two separate syscalls. Having a separate syscall at >>> least allows the old one to return ENOSYS in the future, whereas if you >>> ever want to deprecate the old interface with this method then you’ll >>> need some other weird error response that’s harder to interpret as >>> meaning “that variant of this syscall doesn’t exist any more”. >>> >>>> New syscall allocation should be done only as a last resort, when existing >>>> interfaces cannot be adopted for new functionality. >>> >>> Which this can’t without breaking the existing well-defined semantics, >>> as I’ve stated. >>> >>>> Good (or rather, bad) example of the uglyness that is backed by the attitude >>>> that syscalls are free, is whole *at() mess, or specific stat*() mess (old, >>>> other bsds, pre-ino64, ino64, at, then stat vs fstat, then Linux statx which >>>> probably fixes the interface ultimately). >>> >>> It’s better than this approach. >> >> I have less resistance to adding new syscalls, and I agree with Jess that it >> is the right thing to do in this case. Adding a syscall means the kernel >> supports either old or new interface, so there is no flag day. And it is >> easier to clean up; maintaining two syscalls should only be needed for a >> while on -current, and not in any release. > No, we do not do this. We maintain backward compatibility, old swapoff(2) > would have to live under COMPAT_FREEBSD12 forever. And that’s fine, it’ll remain its own self-contained bit of code under COMPAT_FREEBSD12 that wraps the new syscall, not mixed in with the still-current syscall. Jess