Marking select(2) as restrict

Wed Feb 28 09:40:50 UTC 2018

On Tue, 27 Feb 2018 14:30:24 -0800 (PST)
Chris Torek <torek at elf.torek.net> wrote:

> If we can back up a bit, there are (at least) two mostly-separate
> issues:
> 
>  * What types and qualfiiers to use for various parameters:
>    These are dictated by standards, and we should just conform
>    to whatever is in the standard unless there's some major
>    reason to do otherwise.
> 
>    POSIX says that we *should* use "restrict" here:
>    http://pubs.opengroup.org/onlinepubs/009696699/functions/pselect.html
> 
>    so if no one has a major reason to do otherwise (such as
>    "no conforming program can tell that we didn't and all it will
>    achieve is nothing" :-) ) I'd say add the restrict.
> 
>    (As kib and others have noted, it achieves a lot of nothing, so
>    the priority for this particular conformance seems low.)
> 
>  * Select itself: it makes no sense *in the caller* to pass
>    the same &fd_set argument to more than one parameter unless
>    they are all input-only.  The kernel is going to write on them,
>    and there is no guarantee that the kernel will write them in any
>    particular order.  Hence if you write:
> 
>         ret = select(nfds, &a, &a, &a, timeout);
> 
>    you are nearly guaranteed to have a bug: this call can only
>    be correct if the only value you intend to inspect is "ret".
> 
>    (That, in fact, is what devd was doing: if ret==0, daemonize,
>    and in any case throw away the returned results in the by-pointer
>    arguments.  Hence not a bug -- unless ... well, read on.)
> 
>    The remaining question is: can even that be correct?  In
>    particular, should the kernel be required to use copy-in/copy-out
>    semantics, or can it do the equivalent (minus error checking
>    and a lot of other important details) of:
> 
>         int select(int nfds, fd_set *in, fd_set *out, fd_set *ex,
>             struct timeval *tvp) {
>                 some_complicated_type v_in, v_out, v_ex;
> 
>                 v1 = op(*in);
>                 *in = function(v1);
>                 v2 = op(*out);
>                 ...
>         }
> 
>    ?  If the latter is *permitted*, then the call passing "&a"
>    three times is *always* wrong, since *in might have been
>    clobbered before *out is ever loaded.  If a program breaks
>    because it was doing this, well, it was already broken, we were
>    just lucky that it worked anyway.  (Was this good luck, or bad
>    luck? :-) )
> 
>    Currently, the kernel does in fact copy them all in,
>    then futz with them, then copy them all out (for good reasons).
>    So the devd usage always works.  But will the kernel *always*
>    do this, or is that an accident of the implementation?
> 

Linux also does a copyin (copy_from_user) of each FD_SET.  Linux
also does not use restrict in select.  It would be a very bad
design decision to not copy each FD_SET separately into the
kernel.

>    If the kernel *is* always going to do this, the documentation
>    should say so, at which point the POSIX requirement for the
>    restrict on the parameter is itself a bug (as far as BSD is
>    concerned).  That would call for feature test macros around
>    the declaration.
> 
> So, if we declare that select(nfds, &a, &a, &a, timeout) *is*
> legal and meaningful in BSD, *don't* just put in the restrict:
> use feature test macros as appropriate to handle any conformance
> issues.  But please also update select(2) to call out when this
> is OK (e.g., in devd's particular use case).
> 

It would be good to explicitly state that it is only safe to
use pointers to a single FD_SET when only the error return is of
interest and the results in the FD_SET will be ignored.

-- 
Gary Jennejohn