script(2) [was: [CFT/review] new sendfile(2)]
Jordan Hubbard
jkh at mail.turbofuzz.com
Mon Sep 1 20:57:17 UTC 2014
On Aug 31, 2014, at 3:05 PM, Poul-Henning Kamp <phk at phk.freebsd.dk> wrote:
> Can I inject an old idea whose time may finally have arrived ?
> [ … ]
> Imagine we instead define a byte-code-engine which interprets a
> string of commands, sort of like the pcap filtering engine already
> does. The corresponding syscall would be "follow_the_script(2)"
Having seen this pattern used for several kernel-related things in a few of my former lives, I think this idea has a lot of merit, though I’d be careful not to conceptualize it purely (or only) as an “engine for off-loading work to in order to avoid the kernel/userland boundary cost” since I think the concept has a much broader application than that. It can also obviously be used for match filters (for the packet capture example already given) or security policies (firewalling, sandboxing) that are in the kernel simply because that’s the most logical place to put them, and that means that the “script” may be a full-on complex task or a really short little script fragment (scriptlet?) which potentially needs access to a lot more of the kernel than the file primitives. If it’s a firewall related task, obviously it wants to be able to interpose itself into the networking path. If it’s a sandbox rule script, it’s going to need to be able to gate access to a wide variety of kernel services (not unlike all the checks that phk added for jails). Perhaps that’s what phk meant and I’m just reading his original message too narrowly.
That’s also why I think the rubber will most meet the road in figuring out just how many “bytecode primitives” to surface, a far more bike-sheddy topic than the actual higher-level description format, though we also have plenty of empirical evidence to suggest that the MAC hook mechanism in TrustedBSD already pretty much describes all of the logical places to place the hooks and therefore also suggests what the full enumeration of bytecode primitives might look like. If TrustedBSD added a hook point, consider creating a corresponding primitive which can act on the corresponding subject/target at that point and boom, there’s your trail of breadcrumbs to follow.
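To make the "one primitive per hook point" idea concrete, here is a minimal sketch of an interpreter whose only matching primitive tests which hook is being invoked. Everything in it is an assumption for illustration: the opcode names and the hook names (loosely modeled on the style of MAC hook names) are hypothetical, not an existing kernel interface.

```python
from enum import IntEnum


class Op(IntEnum):
    # Hypothetical opcodes for illustration; not an existing kernel ABI.
    MATCH_HOOK = 1  # operand is a hook name; on mismatch, skip the next instruction
    ALLOW = 2       # terminate with "permit"
    DENY = 3        # terminate with "refuse"


def run(program, hook, default=True):
    """Interpret a list of (opcode, operand) pairs for a single hook invocation."""
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        if op == Op.MATCH_HOOK:
            # Fall through to the next instruction on a match, skip it otherwise.
            pc += 1 if arg == hook else 2
        elif op == Op.ALLOW:
            return True
        elif op == Op.DENY:
            return False
        else:
            raise ValueError(f"unknown opcode {op!r}")
    # No rule fired for this hook.
    return default


# Hypothetical policy: refuse file opens, permit outbound connects.
program = [
    (Op.MATCH_HOOK, "vnode_check_open"), (Op.DENY, None),
    (Op.MATCH_HOOK, "socket_check_connect"), (Op.ALLOW, None),
]
```

With this layout, adding a new hook point only means adding a new operand value (or a new primitive that inspects the subject/target at that hook); the interpreter loop itself stays the same.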
I would also add a corresponding DFA engine for acting on paths, since I think that’s a necessary sub-component of the bytecode engine. Unix is path oriented. Allow all of the relevant primitives which act on files to have a DFA for matching the ones it applies to and you’ve really got something pretty powerful.
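As a rough illustration of path matching with a DFA, the sketch below builds a transition table for the simplest useful case, "every path that starts with this prefix", and walks it one character at a time. The function names are made up for this example, and a real sandbox compiler would build the table from a full regular expression rather than a literal prefix.

```python
def prefix_dfa(prefix):
    """Build a DFA accepting every path that starts with `prefix`.

    States are 0..len(prefix); the last state is accepting and absorbs
    all remaining input. A missing transition means "reject".
    """
    table = [{ch: i + 1} for i, ch in enumerate(prefix)]
    return table, len(prefix)


def dfa_match(dfa, path):
    """Run the DFA over `path`, consuming one character per step."""
    table, accept = dfa
    state = 0
    for ch in path:
        if state == accept:
            return True  # accepting state absorbs the rest of the path
        state = table[state].get(ch)
        if state is None:
            return False  # no transition: dead state
    return state == accept


tmp_only = prefix_dfa("/tmp/")
```

A file primitive could then carry such a table as an operand and apply only when `dfa_match` succeeds on the path being acted on.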
When we implemented application sandboxing in OS X and iOS, we chose to use Scheme as the implementation language (see /usr/share/sandbox on any OS X system for a good selection of examples) and a “sandbox compiler” process to turn that (and the regex DFAs) into bytecode, but we could have honestly chosen almost any scripting language so I really don’t think this discussion needs to get too hung up on the selection of a higher-level language. You want Lua? Sure. Just make it a “rule” that the kernel itself doesn’t have to know beans about Lua and some userland agent or library will turn the Lua code into the appropriate bytecode, and now you’ve got the ability to write your bytecode in Lua. When Lua is no longer in vogue and has been replaced by some other new hotness, that library/agent can be written too without having to change a line of kernel code. Yay for proper abstraction layers and not stuffing interpreters where they don’t belong anyway!
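The layering described here can be sketched as two different userland front ends that emit the same bytecode, plus an evaluator that only ever sees bytes. This is a toy under stated assumptions: the opcodes, the operation ids, the two-byte instruction encoding, and the rule syntaxes are all hypothetical and are not the actual OS X sandbox profile format or its bytecode.

```python
import struct

# Hypothetical opcodes and operation ids, chosen only for this example.
OP_ALLOW, OP_DENY = 1, 2
OPERATIONS = {"file-read": 1, "file-write": 2, "network-outbound": 3}
VERBS = {"allow": OP_ALLOW, "deny": OP_DENY}


def compile_lines(text):
    """Front end 1: lines such as 'deny file-write' become two-byte instructions."""
    out = bytearray()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        verb, operation = line.split()
        out += struct.pack("BB", VERBS[verb], OPERATIONS[operation])
    return bytes(out)


def compile_sexpr(text):
    """Front end 2: forms such as '(deny file-write)' become the same bytecode."""
    # Flatten each parenthesized form onto its own line and reuse front end 1.
    return compile_lines(text.replace("(", "\n").replace(")", "\n"))


def evaluate(bytecode, operation_id, default=False):
    """Stand-in for the kernel side: first matching rule wins; it sees only bytes."""
    for offset in range(0, len(bytecode), 2):
        op, target = struct.unpack_from("BB", bytecode, offset)
        if target == operation_id:
            return op == OP_ALLOW
    return default


from_lines = compile_lines("deny file-write\nallow file-read")
from_sexpr = compile_sexpr("(deny file-write) (allow file-read)")
```

Because both front ends produce identical bytes, swapping the higher-level language means replacing only the compiler; `evaluate` does not change.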
That said, I’ll also point out that we already have a bytecode “engine” in the kernel and a corresponding higher-level language which compiles into it. That language is called D and the bytecode interpreter is the DTrace support code. But all of you already knew that. The fact that Sun only chose to use it for instrumentation and debugging may be coloring everyone’s thinking about what its theoretical limits as a more general-purpose mechanism are; I don’t know. We can only speculate as to how much farther Sun might have taken it if they had survived as a company (if each DTrace “worker” were a kernel thread, for example, they could have added looping primitives and other features which assumed a longer lifetime for given units of work).
It’s an interesting topic of discussion, that’s for sure. I had a lot of fun with the sandboxing stuff at Apple. It would be interesting to see where FreeBSD could go with an even more general purpose mechanism.
- Jordan