Re: Complicated interactions between O_EXEC, fdescfs, fexecve, and shebangs
Date: Wed, 03 Nov 2021 13:35:40 UTC
On Wed, Nov 03, 2021 at 12:30:26PM +0100, Drew DeVault wrote:
> Note: I am not subscribed to this list, please use reply-all to keep me
> on the thread. Thanks!
>
> $ uname -a
> FreeBSD megumin 13.0-RELEASE FreeBSD 13.0-RELEASE #0 releng/13.0-n244733-ea31abc261f: Fri Apr 9 04:24:09 UTC 2021 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
>
> This problem starts with the following program:
>
> #include <fcntl.h>
> #include <unistd.h>
>
> extern char **environ;
>
> int main(void) {
> 	int fd = open("./test.sh", O_EXEC);
> 	char *argv[] = {
> 		"./test.sh",
> 		NULL
> 	};
> 	fexecve(fd, argv, environ);
> }
>
> Given this test.sh, which is executable:
>
> #!/bin/sh
> echo hello world
>
> This program produces the following error:
>
> /bin/sh: cannot open /dev/fd/3: Permission denied
>
> The program works fine with O_RDONLY instead, which makes some sense.
> The way this works is that the kernel rewrites argv to {"/bin/sh",
> "/dev/fd/%d"}, where %d is the file descriptor passed to fexecat. The
> interpreter then has to open this file for reading, so it needs the read
> bit set. fdescfs preserves the permissions of the file descriptor which
> was originally opened, so the read bit is missing with O_EXEC. Q.E.D.
>
> The fix is to set O_RDONLY and mount fdescfs. If nothing else comes of
> this, I would like to request that FreeBSD consider mounting fdescfs by
> default, so that fexecve can be reliably expected to work correctly with
> interpreters. Otherwise, the value proposition of fexecve is severely
> limited.
>
> However, a few other problems came up while looking into this.
>
> The investigation was made more difficult by the fact that open(2) is
> documented in the man page as producing EINVAL when O_EXEC is combined
> with O_RDONLY, but this is not so: no error occurs. This is because
> O_RDONLY is, in fact, not a bit: it is zero. You cannot NOT provide
> O_RDONLY to an open call.
> RhodiumToad on #freebsd IRC gave a possible
> improvement for the man page:
>
> > Only one of O_EXEC, O_RDWR and O_WRONLY may be specified.
>
> The other issue is that this essentially makes O_EXEC useless outside of
> some specific cases, where the user knows for certain that the file
> being executed is not a script. The combination of O_EXEC and fexecve
> cannot generalize to support all use-cases of execve, which is
> frustrating because my code either (A) cannot be TOCTOU or (B) needs
> some awful special cases. Even in case (B), it would not generalize to
> the case where I have execute, but not read, permission for a script,
> but the interpreter has both.
>
> I'm not sure what the answer for any of this is.
>
> By way of contrast, Linux solves this problem a bit differently. It does
> not have O_EXEC, but it does have O_PATH, which opens a file descriptor
> without read, write, OR execute, but simply to keep track of an inode
> reference. fexecve on Linux then uses a similar /dev/fd trick, but the
> file in /dev/fd has no mode bits set and I'm not sure why it works.

FreeBSD also has O_PATH. There are two differences from Linux:
- FreeBSD requires O_EXEC to be specified together with O_PATH if the
  intent is to use the resulting file descriptor with fexecve(2). In fact
  this requirement can be removed, see https://reviews.freebsd.org/D32821
- The semantics of open(2) on FreeBSD fdescfs are different. To get
  behavior similar to Linux, you need to specify the "nodup" mount option,
  see fdescfs(5).
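
For illustration, here is a minimal and untested sketch of the first point, assuming a FreeBSD whose open(2) accepts O_PATH together with O_EXEC. The helper names open_for_exec() and run_by_fd() are made up for this example. The fallback that defines a missing flag as 0 is also an assumption, added only so the same file builds on systems without O_EXEC, such as Linux. Whether a script then runs through /dev/fd still depends on how fdescfs is mounted, as in the second point.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* assumption: exposes O_PATH on Linux/glibc */
#endif
#include <fcntl.h>
#include <unistd.h>

extern char **environ;

/* Assumption: a missing flag degrades to 0 so the sketch still builds. */
#ifndef O_EXEC
#define O_EXEC 0
#endif
#ifndef O_PATH
#define O_PATH 0
#endif

/*
 * open_for_exec() is a hypothetical helper, not a libc function.
 * Where the flags exist, it asks for a descriptor without read access.
 * Returns the descriptor, or -1 with errno set by open(2).
 */
static int
open_for_exec(const char *path)
{
	return (open(path, O_PATH | O_EXEC));
}

/*
 * run_by_fd() is likewise hypothetical.  On success it does not return,
 * because the process image is replaced.  It returns -1 on any failure.
 */
static int
run_by_fd(const char *path, char *const argv[])
{
	int fd = open_for_exec(path);

	if (fd == -1)
		return (-1);
	fexecve(fd, argv, environ);
	close(fd);	/* reached only if fexecve() failed */
	return (-1);
}
```

A caller would build argv as in the quoted program, for example { "./test.sh", NULL }, and call run_by_fd("./test.sh", argv), reporting an error if it returns.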