[capsicum] unlinkfd
Mariusz Zaborski
oshogbo at FreeBSD.org
Sat Mar 3 18:43:53 UTC 2018
I feel that there are two different things we can think about:
- What we would implement in the capability system if we were building it from
scratch. Here shm_open(2) and SHM_ANON can be a solution to our problems.
- On the other hand, we have a working operating system, and we can't expect
that all our already-implemented programs will fit those assumptions, nor can
we ask developers to rewrite many existing programs.
On Sat, Mar 03, 2018 at 05:16:38PM +0000, Robert N. M. Watson wrote:
> New _check() variants of the unlinkat(2) and rmdirat(2) system calls might do the trick -- e.g.,
>
> int unlinkat_check(dirfd, name, checkfd);
> int rmdirat_check(dirfd, name, checkfd);
>
A similar API was proposed in the review. It solves the race condition (RC)
issue. Unfortunately, it doesn't solve the problem of guessing which
directories we will work in.
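
As a rough illustration of the check such a call would perform, here is a
minimal userspace sketch. The name unlinkat_check_sketch() and the EINVAL error
value are assumptions made for this example, not part of any proposed API. It
compares device and inode numbers, and unlike a kernel implementation it is not
atomic: the name can still be replaced between the comparison and unlinkat(2).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical userspace approximation of unlinkat_check(dirfd, name, checkfd).
 * NOT atomic: the entry can be swapped between fstatat() and unlinkat(),
 * which is the window an in-kernel check would close.
 */
int
unlinkat_check_sketch(int dirfd, const char *name, int checkfd)
{
	struct stat by_name, by_fd;

	assert(name != NULL);
	/* Do not follow symlinks: unlinkat() removes the link itself. */
	if (fstatat(dirfd, name, &by_name, AT_SYMLINK_NOFOLLOW) == -1)
		return (-1);
	if (fstat(checkfd, &by_fd) == -1)
		return (-1);
	/* Same device and inode: the name refers to the object behind checkfd. */
	if (by_name.st_dev != by_fd.st_dev || by_name.st_ino != by_fd.st_ino) {
		errno = EINVAL;	/* error value chosen arbitrarily for this sketch */
		return (-1);
	}
	return (unlinkat(dirfd, name, 0));
}
```

Note that the caller still has to hold a descriptor for the containing
directory, which is the part this kind of API does not address.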
When I think about sandboxing, for example, rm(1), we would need to preopen the
root directory or preopen every directory we will work in. Neither solution
feels right.
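
To make the preopening approach concrete, here is a minimal sketch. The helper
names preopen_dir() and remove_in_preopened() are hypothetical and exist only
for this example. The point it shows is that the descriptor that lets us remove
one file lets us remove any other entry in that directory as well.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a directory before entering the sandbox. */
int
preopen_dir(const char *path)
{
	assert(path != NULL);
	return (open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC));
}

/*
 * Hypothetical helper: remove `name` relative to a preopened directory.
 * Nothing here ties the removal to a particular file; any entry name
 * in the directory is accepted.
 */
int
remove_in_preopened(int dirfd, const char *name)
{
	assert(name != NULL);
	/* Flag 0 removes a file; AT_REMOVEDIR would remove a directory. */
	return (unlinkat(dirfd, name, 0));
}
```

A tool like rm(1) would need one such descriptor per directory it touches, or
a single descriptor for a common ancestor such as the root directory.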
I'm not saying that unlinkfd is the right and only solution; I'm just trying
to solve a problem we identified while sandboxing apps. I'm glad we started this
discussion, and I hope we will work out some compromise between all the
presented challenges.
Thanks,
--
Mariusz Zaborski
oshogbo//vx | http://oshogbo.vexillium.org
FreeBSD committer | https://freebsd.org
Software developer | http://wheelsystems.com
If it's not broken, let's fix it till it is!!1
> The calls would succeed only if 'name' refers to the filesystem object passed via checkfd. This would retain UNIX-style directory behaviour but allows an atomic check that the object is as expected.
>
> Of course, what you do about it if it turns out the check fails is another question... Better not to have a name at all, hence shm_open(SHM_ANON, ...) -- although just for file objects, and not directory hierarchies.
>
> Robert
>
> > On 3 Mar 2018, at 15:29, Alan Somers <asomers at freebsd.org> wrote:
> >
> > In fact, FreeBSD has that same unlinkat(2) system call. But it doesn't solve Mariusz's problem. He's concerned about race conditions. With either unlink(2) or unlinkat(2), there's no way to ensure that the directory entry you remove is for the file you think it is, because after reading/writing a file and before unlinking it, some other process could've unlinked it and created a new one with the same name. It's this race condition that Mariusz seeks to solve with unlinkfd.
> > -Alan
> >
> > On Sat, Mar 3, 2018 at 5:46 AM, Alexander Richardson <Alexander.Richardson at cl.cam.ac.uk> wrote:
> > Linux has an unlinkat() system call (https://linux.die.net/man/2/unlinkat)
> > but it doesn't seem to have a flag that lets you unlink the fd itself.
> > Possibly pathname == NULL and AT_EMPTY_PATH could mean unlink the fd but I
> > haven't tried whether that works.
> > It also has an AT_REMOVEDIR flag to make it function as rmdirat().
> >
> > On 3 March 2018 at 10:41, Robert N. M. Watson <robert.watson at cl.cam.ac.uk> wrote:
> >
> > > FWIW, this is part of why we introduced anonymous POSIX shared memory
> > > objects with Capsicum in FreeBSD -- we allow shm_open(2) to be passed a
> > > SHM_ANON special name, which causes the creation of a swap-backed, mappable
> > > file-like object that can have I/O, memory mapping, etc, performed on it ..
> > > but never has any persistent state across reboots even in the event of a
> > > crash.
> > >
> > > With Capsicum you can then refine a file descriptor to the otherwise
> > > writable object to be read-only for the purposes of delegation. There is
> > > not, however, a mechanism to "freeze" the state of the object causing other
> > > outstanding writable descriptors to become read-only -- certainly something
> > > could be added, but some care regarding VM semantics would be required --
> > > in particular, so that faults could not be experienced as a result of a
> > > memory store performed before the "freeze" but issued to VFS only later.
> > >
> > > I certainly have no objection to an unlinkat(2) system call -- it's
> > > unfortunate that a full suite of the at(2) APIs wasn't introduced in the
> > > first place. It would be worth checking whether anyone else (e.g., Solaris,
> > > Mac OS X, Linux) has already added an unlinkat(2) that we can match API
> > > semantics for. I think I take the view that for truly anonymous objects,
> > > shm_open(2) without a name (or the Linux equiv) is the right thing -- and
> > > hence unlinkat(2) is for more conventional use cases where the final
> > > pathname element is known.
> > >
> > > On directories: There, I find myself falling back on a Casper-like
> > > service, since GC'ing a single anonymous memory object is straightforward,
> > > but GC'ing a directory hierarchy is a more messy business.
> > >
> > > Robert
> > >
> > > > On 3 Mar 2018, at 09:53, Justin Cormack <justin at specialbusservice.com> wrote:
> > > >
> > > > > I think it would make sense to have an unlinkfd() that unlinks the file
> > > > > from everywhere, so it does not need a name to be specified. This might
> > > > > be hard to implement.
> > > > >
> > > > > For temporary files, I really like Linux memfd_create(2), which opens an
> > > > > anonymous file without a name. These semantics are really useful. (Linux
> > > > > memfd also has additional options for sealing the file to make it
> > > > > immutable, which are very useful for safely passing files between
> > > > > processes.) Having a way to make unnamed temporary files solves a lot of
> > > > > deletion issues, as the file never needs to be unlinked.
> > > >
> > > >
> > > > > On 2 March 2018 at 18:35, Mariusz Zaborski <oshogbo at freebsd.org> wrote:
> > > >> Hello,
> > > >>
> > > > >> Today I would like to propose a new syscall called unlinkfd(2), which
> > > > >> came up during a discussion with Ed Maste.
> > > >>
> > > > >> Currently in UNIX we can’t remove files safely. If we try to do so, we
> > > > >> always end up in a race condition. For example, when we open a file and
> > > > >> check it with fstat, etc., then we want to unlink(2) it… but the file we
> > > > >> are trying to unlink could be a different one than the one we were
> > > > >> fstating just a moment ago.
> > > >>
> > > > >> Another reason for implementing unlinkfd(2) came to us when we were
> > > > >> trying to sandbox some applications like uudecode/b64decode or bspatch.
> > > > >> It occurred to us that we don’t have a good way of removing single
> > > > >> files. Of course we can try to determine which directory we are in, and
> > > > >> then open this directory and remove a single file.
> > > >>
> > > > >> It looks even more bizarre if we think about a program which operates
> > > > >> on multiple files. If we analyze a situation with two totally different
> > > > >> directories like `/tmp` and `/home/oshogbo`, we would end up preopening
> > > > >> the root directory or keeping open as many directories as we are
> > > > >> working in. All of that effort only to remove two files. This makes it
> > > > >> totally impractical!
> > > >>
> > > > >> I think that opening directories also presents a wider attack vector,
> > > > >> because we are keeping a descriptor to a whole directory only to remove
> > > > >> one file. Unfortunately this means that an attacker can remove all
> > > > >> files in that directory.
> > > >>
> > > > >> I proposed this as well on the last Capsicum call. There was a
> > > > >> suggestion that instead of adding a single syscall maybe we should have
> > > > >> a Casper service that will allow us to remove files. Another idea was
> > > > >> that we should perhaps redesign programs to create some subdirs, work
> > > > >> in the subdirs, and then remove all files in this subdir. I don’t feel
> > > > >> that creating a Casper service is a good idea, because we still have
> > > > >> exactly the same race condition issue. In my opinion creating subdirs
> > > > >> is also a problem for us.
> > > >>
> > > > >> First, we would need to redesign some of our tools, and I think we
> > > > >> should simplify the capsicumization of a process instead of making it
> > > > >> harder.
> > > >>
> > > > >> Secondly, we can create a temporary subdirectory, but what will remove
> > > > >> it? We are going back to having an fd to the directory in which we just
> > > > >> created a subdir. Another way would be to have a Casper service which
> > > > >> would remove a directory, but with the risk of an RC.
> > > >>
> > > > >> In conclusion, I think we need a syscall like unlinkfd(2), which turns
> > > > >> out to be easy to implement. The only downside of this implementation
> > > > >> is that we need to provide not only an fd but also a file path. This is
> > > > >> because neither inodes nor vnodes contain filenames. We compare the
> > > > >> vnodes of the fd and the given path; if they are exactly the same, we
> > > > >> remove the file. In the syscall we are using an fd, so there is no
> > > > >> Ambient Authority, because we are proving that we already have access
> > > > >> to that file. Thanks to that, the syscall can be safely used with
> > > > >> Capsicum. I have already discussed this with some people and they said
> > > > >> `Hey I already had that idea a while ago…` so let’s do something with
> > > > >> that idea!
> > > > >> If you are interested in the patch you can find it here:
> > > > >> https://reviews.freebsd.org/D14567
> > > >>
> > > >> Thanks,
> > > >> --
> > > >> Mariusz Zaborski
> > > > >> oshogbo//vx | http://oshogbo.vexillium.org
> > > > >> FreeBSD committer | https://freebsd.org
> > > > >> Software developer | http://wheelsystems.com
> > > >> If it's not broken, let's fix it till it is!!1
> > > >
> > >
> > >
> > >