Proposal: a revoke() system call
Robert Watson
rwatson at FreeBSD.org
Tue Jul 8 15:26:23 UTC 2008
On Tue, 8 Jul 2008, Robert Watson wrote:
> Which can be accomplished by calling dup2(2) to replace the file descriptor
> with another file descriptor, perhaps one to /dev/null. It would be worth
> carefully reviewing the implementation of dup2(2) to make sure that the
> close->replace there is atomic with respect to other threads simultaneously
> allocating file descriptors, such as with pipe(2).
BTW, on a similar note to the above: I've noticed there are several spots of
relative non-atomicity in the Linux emulation code, where rather than just
wrapping existing system calls with binary conversion of arguments and return
values, we do a semantic wrapping that is necessarily non-atomic with respect
to the native code. For example, consider the Linuxulator open code in
linux_common_open():
134 error = kern_openat(td, dirfd, path, UIO_SYSSPACE, bsd_flags, mode);
135
136 if (!error) {
137 fd = td->td_retval[0];
138 /*
139 * XXX In between kern_open() and fget(), another process
140 * having the same filedesc could use that fd without
141 * checking below.
142 */
143 error = fget(td, fd, &fp);
144 if (!error) {
145 sx_slock(&proctree_lock);
146 PROC_LOCK(p);
147 if (!(bsd_flags & O_NOCTTY) &&
148 SESS_LEADER(p) && !(p->p_flag & P_CONTROLT)) {
149 PROC_UNLOCK(p);
150 sx_unlock(&proctree_lock);
151 if (fp->f_type == DTYPE_VNODE)
152 (void) fo_ioctl(fp, TIOCSCTTY,
(caddr_t) 0,
153 td->td_ucred, td);
154 } else {
155 PROC_UNLOCK(p);
156 sx_sunlock(&proctree_lock);
157 }
158 if (l_flags & LINUX_O_DIRECTORY) {
159 if (fp->f_type != DTYPE_VNODE ||
160 fp->f_vnode->v_type != VDIR) {
161 error = ENOTDIR;
162 }
163 }
164 fdrop(fp, td);
165 /*
166 * XXX as above, fdrop()/kern_close() pair is racy.
167 */
168 if (error)
169 kern_close(td, fd);
170 }
171 }
I think that comment is mine, or at least, got there because of a comment I
made to Roman or the like. The fd has not yet been explicitly returned to
userspace, since the open system call hasn't actually returned, but other
threads could use the file descriptor in a system call that could lead to
unexpected races. For example, if you dup2() on top of the file descriptor
between the return of kern_openat() and the invocation of fget(), fo_ioctl()
might be called on the wrong file, or the kern_close() in the error case might
get invoked on the "wrong" file descriptor.
In these cases, the races are mostly harmless since they involve incorrectly
using a file descriptor from a second thread -- since it hasn't been returned
yet, it isn't valid yet and the results will be undefined. However, there may
well be cases where similar races exist that do affect the semantics of
multi-threaded Linux applications, such as having a main event (open()) and an
associated event (fo_ioctl()) be non-atomic and allowing a race between them
that does have a semantically problematic result.
These sorts of edge cases, btw, are one reason why I would *strongly*
discourage application writers from doing things like calling close(2) on a
file descriptor while still using it from another thread. :-)
Robert N M Watson
Computer Laboratory
University of Cambridge
More information about the freebsd-arch
mailing list