Capabilities/privileges and bounding sets

Mon Aug 28 13:04:03 GMT 2000

On Sun, 27 Aug 2000, Andrew Morgan wrote:

> Robert Watson wrote:
> > Addition of new flag: CAP_BOUND
> > 
> >   Inheritance rules:
> >         pI' = pI & ~(pB')
> >         pP' = (fP | (fI & pI)) & ~(pB')
> >         pE' = (fE & pE) & ~(pB')
> >         pB' = pB | fB
> 
> Your rules do not appear to be based on those of DS17. DS17 has:
> 
>        pI' = pI
>        pP' = (fP & X) | (fI & pI)
>        pE' = (fE & pP)

Yup, I typo'd pE', which should read:

        pE' = (fE & pP) & ~(pB')

Rather than using pE.
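
For concreteness, the corrected rules can be sketched in C as below.  This
is only an illustration: the 64-bit capset_t, the struct layouts, and the
name cap_exec_transition() are hypothetical, and the sketch assumes that
the pP in the pE' rule means the newly computed permitted set (if the old
set was intended, only the marked line changes).

```c
#include <stdint.h>

/* One bit per capability; the 64-bit width is an assumption. */
typedef uint64_t capset_t;

/* In both structs, a bit set in B means that capability is denied. */
struct proc_caps { capset_t I, P, E, B; };
struct file_caps { capset_t I, P, E, B; };

/* Compute the post-exec process sets from the rules in this message. */
static struct proc_caps
cap_exec_transition(struct proc_caps p, struct file_caps f)
{
    struct proc_caps n;

    n.B = p.B | f.B;                    /* pB' = pB | fB                  */
    n.I = p.I & ~n.B;                   /* pI' = pI & ~pB'                */
    n.P = (f.P | (f.I & p.I)) & ~n.B;   /* pP' = (fP | (fI & pI)) & ~pB'  */
    n.E = (f.E & n.P) & ~n.B;           /* pE' = (fE & pP) & ~pB';
                                           assumes pP is the new set      */
    return n;
}
```

Because pB' only ever gains bits and every other set is masked with ~pB',
a capability that has been bounded cannot reappear in any set after exec.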

> For bounding sets, I'm using the 'X' in DS17 implemented as:
> 
>        X = fB & (~pB)
> 
> fB is a per-filesystem-mount bound, ~pB is a per process bound: pB' =
> pB. [I'm trying to use your notation for pB - namely the '~' of the
> mask. Although elsewhere I'm likely to use another notation.]
> 
> The decision about whether to leak (pI & pB) bits through process
> execution is entirely up to the program that restricts ~pB and then does
> an exec(). Namely, one would expect such a program to force (pI & pB) =
> 0, but if there is a reason this isn't appropriate in individual cases,
> then said program wouldn't be required to do it.

I considered this as an option, but as rationale for not doing that, I
should describe how jail() affects process privileges in FreeBSD, and why
that behavior is desirable.

When a sufficiently privileged process calls jail() in FreeBSD, a number
of properties of that process are modified:

	1) chroot() is invoked as appropriate to set the process root
	2) Access to network resources is bounded to a single IP
	   address (should be implemented using MAC)
	3) Access to other processes is bounded to those within the
	   same jail (should be implemented using MAC)
	4) Most invocations of suser() now fail with EPERM, except
	   specifically selected ones where the PRISON_ROOT flag is
	   set in the suser() invocation, indicating that it is safe
	   within jail(), subject to the MAC bounds
	5) jail() cannot be called recursively

Within a jail(), setuid binaries work fine (even creation of new ones),
and most suser() calls succeed: binding of privileged ports (subject to IP
restrictions), et al.  Everything but calls that would enable leaving the
jail().  My hypothesis has been that this can be reduced to capability
bounding sets: rather than scattering access control decisions about jail
interaction throughout the source tree, a call to the existing jail()
syscall would simply set an inherited process bound.

However, in order to maintain compatibility, it is desirable that
applications making use of capabilities, although not the ones prohibited
by the bound, function normally.  They should be able to introduce new
applications with capabilities set on the binaries, twiddle their
capabilities happily, subject to the normal restrictions.  I.e., they
should function exactly as outside the jail() as long as they don't bump
into its limits.

Hence the desire to place a strict bound on pP', rather than have it only
affect those capabilities gained through fP -- the bit set on the file
system should not result in the bit being set in the process.  However,
I'm interested in your idea of simply blocking process execution in the
event that a bit is set on the binary that is not permitted in the jail().
Presumably a combination of preventing the bit from being set in the
jail() (or maybe even allowing it to be set on the binary, just not
permitting the binary to run?) would provide, in effect, the same result.
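
The difference between the two formulations can be made concrete with a
small C sketch.  The function names and the 64-bit capset_t are
hypothetical; fB_mount stands for the per-mount mask in the X = fB & (~pB)
rule, and pB_new for pB' in the strict rule.

```c
#include <stdint.h>

/* One bit per capability; the 64-bit width is an assumption. */
typedef uint64_t capset_t;

/* Strict bound: pP' = (fP | (fI & pI)) & ~pB' */
static capset_t
permitted_strict(capset_t fP, capset_t fI, capset_t pI, capset_t pB_new)
{
    return (fP | (fI & pI)) & ~pB_new;
}

/* Bound applied to fP only: X = fB & ~pB; pP' = (fP & X) | (fI & pI) */
static capset_t
permitted_fp_masked(capset_t fP, capset_t fI, capset_t pI,
    capset_t fB_mount, capset_t pB)
{
    capset_t X = fB_mount & ~pB;

    return (fP & X) | (fI & pI);
}
```

With a bounded bit that is still present in both pI and fI, the second
form lets the bit through unless the program that set the bound also
cleared (pI & pB) before the exec(); the first form masks it regardless.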

> The fI value seems to need a mask associated with it too. I'd probably
> prefer to map fI on the disc to (fI & fB) in system. [This is arguably
> the only extension beyond DS17.]

I think this is to some extent what I describe above, but if not, I'm not
sure I follow.  Could you go into a bit more detail or provide more
context?

> >   For the purposes of enforcement:
> >         pE & ~(pB)
> 
> For Linux, I'm pursuing this restriction on pP':
> 
>     (pP' & fP) == fP
> 
> Namely, I interpret fP to be the minimum capabilities that the program
> requires to function correctly, so if any of those capabilities are not
> available, then the file cannot be executed.

Sounds good, although this allows a binary to carry capabilities that the
bound forbids, with the problem discovered only at execution time.
However, it has other nice properties, including handling the jail()'s
bound changing across "reboot", et al.
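
As a sketch of that check in C (cap_exec_check() and capset_t are
hypothetical names, and returning EPERM is an assumption about which
error the exec would report):

```c
#include <errno.h>
#include <stdint.h>

/* One bit per capability; the 64-bit width is an assumption. */
typedef uint64_t capset_t;

/*
 * Refuse the exec unless every capability the file lists in fP survived
 * into the new permitted set, i.e. (pP' & fP) == fP.
 * Returns 0 if the exec may proceed, EPERM otherwise.
 */
static int
cap_exec_check(capset_t fP, capset_t pP_new)
{
    return ((pP_new & fP) == fP) ? 0 : EPERM;
}
```

A binary with no fP bits always passes, so unprivileged programs are
unaffected by the bound.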

> >   When cap_set_proc() is called, an attempt to enable any of the bits true
> >   in the bound results in EPERM.
> 
> Why is this explicit restriction needed? Can't you rely on the exec()
> rules?

Sounds reasonable.

> >   For files with no capabilities set, fB may be assumed to be all 0's,
> >   meaning that it introduces no new bounding.  If fB is not supported in
> >   the file system, the same is assumed.
> > 
> >   With appropriate privilege (CAP_SETPCAP), the contents of pB may be
> >   modified, otherwise EPERM.  A design choice might also be whether or
> >   not CAP_SETPCAP would allow the removal of bounding on capabilities:
> >   probably not, only the increased bounding.
> 
> Linux will probably be closer to SGI here, namely CAP_SETPCAP will
> remove the pP bound on raising new capabilities in pI. [We've had some
> debate about whether to make CAP_SETPCAP wholly equivalent to eip=~0
> (which I understand would make us completely compatible with SGI), but
> especially for chroot jail type things this looks insufficiently
> restrictive.]

Hmm.  My only concern here would be: my reading of POSIX.1e didn't
distinguish CAP_SETPCAP from any other capability in terms of properties,
so I assumed that revoking CAP_SETPCAP would not impact the presence of
other capabilities.  If CAP_SETPCAP = ~0, wouldn't revoking CAP_SETPCAP
result in revoking all capabilities?  Perhaps I'm reading this wrong.

> >   Open question: when an application calls cap_set_proc() with a
> >   capability set with B set all zeros, and E,I, and P don't violate the pB
> >   of the process, should it EPERM, or succeed, but not set the bound?
> 
> I'm not sure what this means.

Suppose the application calls cap_new(), and copies over the EIP fields
from its old capability set, subject to certain limitations (or more
likely, it creates the capability set precisely as it thinks it needs),
but doesn't understand the CAP_BOUND (B) set.  It will not set these flags
in the B set, but attempt to call cap_set_proc() anyway.  The question
then becomes, when cap_set_proc() is called, to what extent do you pay
attention to the contents of the B set in the following situations (r ==
request):

rB (superset) pB
rB == pB
rB (subset) pB
rB == 0

If you assume all applications must be aware of CAP_BOUND, i.e., we assume
that everyone wanting portability will implement CAP_BOUND in the above
copying/creation operation, you'd expect to accept only (rB == pB), except
when appropriate privilege is available to {increase, decrease} the bound.

My question was: if (rB == 0) and appropriate privilege isn't present, can
we just ignore the request to change pB to 0, allowing compatibility with
applications that don't understand how to set rB correctly.
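
One way the cases above could be handled is sketched below in C.
Everything here is hypothetical: the function name, the flags standing in
for "appropriate privilege" to increase or decrease the bound, and the
ignore_zero switch that selects the compatibility behavior in question.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per capability; a bit set in a bound means "denied". */
typedef uint64_t capset_t;

/*
 * Decide what cap_set_proc() does with a requested bound rB given the
 * current bound pB.  Stores the resulting bound in *out and returns 0,
 * or returns EPERM and leaves *out equal to pB.
 */
static int
cap_bound_request(capset_t pB, capset_t rB, bool may_increase,
    bool may_decrease, bool ignore_zero, capset_t *out)
{
    bool adds = (rB & ~pB) != 0;    /* rB bounds bits that pB does not */
    bool drops = (pB & ~rB) != 0;   /* rB frees bits that pB bounds    */

    *out = pB;
    if (!adds && !drops)            /* rB == pB: nothing to change     */
        return 0;
    if (rB == 0 && !may_decrease && ignore_zero)
        return 0;                   /* unaware caller: keep pB as is   */
    if ((adds && !may_increase) || (drops && !may_decrease))
        return EPERM;
    *out = rB;
    return 0;
}
```

With ignore_zero false this is the strict policy (only rB == pB succeeds
without privilege); with it true, an application that leaves B empty
still succeeds, but the process bound is left untouched.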

Assuming CAP_BOUND awareness is the best choice from a consistency
perspective in the API, but if FreeBSD is the only platform that
implements this set of bounding semantics, I'd rather not.  :-)  That
said, I hope to implement "what everyone else implements"; I'm describing
how I see CAP_BOUND being used to create jail()s in a manner consistent
with our current (successful) jail() implementation, in the hopes of
influencing a bound consensus in that direction.

The aspect that is particularly relevant to jail() is simply the strict
monotonic inheritance property: exec'ing a binary that would result in
more privilege than the bound permits should, as you describe, fail.
However, other capability manipulation can and should be fine, as well as
creation of binaries with capabilities (their usage bound by the
inherited process bound, in practice).

> > The open question has to do with whether or not applications will
> > typically be aware of CAP_BOUND or not.  It's not mentioned in the draft,
> > so perhaps it's better to assume that an application might not be.
> 
> We'll probably extend the 'posix' API to read and write the two flavors
> of bound. That is, make them visible. How has not been decided.

Sorry -- two flavors?  I saw fB from the file, and potentially an fB
from the file system.  What was the other flavor of bound you had in mind?

The most consistent way would seem to be to introduce a CAP_BOUND as an
addition to CAP_EFFECTIVE, et al.  I hadn't originally planned to allow a
CAP_BOUND entry on a binary, but that probably would make a good way to
handle the jail entry point.

In today's jail(), recursive jailing is not permitted; in a
capability model, presumably it could be if CAP_CHROOT, et al, were
present.  Unlike in the setuid model, capability-enabled binaries would
presumably be more aware of whether or not they had the privileges
required for an operation (cf. the sendmail bugs, et al).

On the other hand, I'd definitely like it to be possible to prevent
further modification of CAP_BOUND by children, so that applications don't
have to deal with it...

  Robert N M Watson 

robert at fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services
