Complicated interactions between O_EXEC, fdescfs, fexecve, and shebangs
Date: Wed, 03 Nov 2021 11:30:26 UTC
Note: I am not subscribed to this list, please use reply-all to keep me
on the thread. Thanks!

$ uname -a
FreeBSD megumin 13.0-RELEASE FreeBSD 13.0-RELEASE #0 releng/13.0-n244733-ea31abc261f: Fri Apr 9 04:24:09 UTC 2021 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64

This problem starts with the following program:

#include <fcntl.h>
#include <unistd.h>

extern char **environ;

int main(void) {
	int fd = open("./test.sh", O_EXEC);
	char *argv[] = { "./test.sh", NULL };
	fexecve(fd, argv, environ);
}

Given this test.sh, which is executable:

#!/bin/sh
echo hello world

This program produces the following error:

/bin/sh: cannot open /dev/fd/3: Permission denied

The program works fine with O_RDONLY instead, which makes some sense.
The way this works is that the kernel rewrites argv to
{"/bin/sh", "/dev/fd/%d"}, where %d is the file descriptor passed to
fexecve. The interpreter then has to open this file for reading, so it
needs the read bit set. fdescfs preserves the permissions of the file
descriptor which was originally opened, so the read bit is missing with
O_EXEC. Q.E.D.

The fix is to open the file with O_RDONLY and mount fdescfs. If nothing
else comes of this, I would like to request that FreeBSD consider
mounting fdescfs by default, so that fexecve can be reliably expected to
work correctly with interpreters. Otherwise, the value proposition of
fexecve is severely limited.

However, a few other problems came up while looking into this. The
investigation was made more difficult by the fact that open(2) is
documented in the man page as producing EINVAL when O_EXEC is combined
with O_RDONLY, but this is not so: no error occurs. This is because
O_RDONLY is, in fact, not a bit: it is zero. You cannot NOT provide
O_RDONLY to an open call. RhodiumToad on #freebsd IRC gave a possible
improvement for the man page:

> Only one of O_EXEC, O_RDWR and O_WRONLY may be specified.

The other issue is that this essentially makes O_EXEC useless outside of
some specific cases, where the user knows for certain that the file
being executed is not a script. The combination of O_EXEC and fexecve
cannot generalize to support all use-cases of execve, which is
frustrating because my code either (A) cannot be free of TOCTOU races
or (B) needs some awful special cases. Even in case (B), it would not
generalize to the case where I have execute, but not read, permission
for a script, but the interpreter has both. I'm not sure what the answer
for any of this is.

By way of contrast, Linux solves this problem a bit differently. It does
not have O_EXEC, but it does have O_PATH, which opens a file descriptor
without read, write, OR execute access, simply to keep track of an inode
reference. fexecve on Linux then uses a similar /dev/fd trick, but the
file in /dev/fd has no mode bits set and I'm not sure why it works.