Re: zfs support in makefs

From: Mark Johnston <markj_at_freebsd.org>
Date: Fri, 20 May 2022 16:35:25 UTC

On Thu, May 19, 2022 at 06:25:32PM +0000, Brooks Davis wrote:
> On Thu, May 19, 2022 at 01:36:25PM -0400, Allan Jude wrote:
> > On 5/18/2022 7:04 PM, Brooks Davis wrote:
> > > On Wed, May 18, 2022 at 03:03:17PM -0400, Mark Johnston wrote:
> > >> Hi,
> > >>
> > >> For the past little while I've been working on ZFS support in makefs(8).
> > >> At this point I'm able to create a bootable FreeBSD VM image, using the
> > >> standard FreeBSD ZFS layout, and run through the regression test suite
> > >> in bhyve.  I've also been able to create and boot an EC2 AMI.
> > > 
> > > Very cool!
> > > 
> > >> === Interface ===
> > >>
> > >> Creating a pool with a single dataset is easy:
> > >>
> > >> $ makefs -t zfs -s 10g -o poolname=test ./zfs.img /path/to/input
> > >>
> > >> Upon importing such a pool, you'll get a dataset named "test" mounted at
> > >> /test containing everything under /path/to/input.
> > >>
> > >> It's possible to set properties on the root dataset:
> > >>
> > >> $ makefs -t zfs -s 10g -o poolname=test -o fs=test:setuid=off:atime=on ./zfs.img /path/to/input
> > >>
> > >> It's also possible to create additional datasets:
> > >>
> > >> $ makefs -t zfs -s 10g -o poolname=test -o fs=test/ds1:mountpoint=/test/dir1 ./zfs.img /path/to/input
> > >>
> > >> The parameter syntax is
> > >> "-o fs=<dataset name>[:<prop1>=<val1>[:<prop2>=<val2>[:...]]]".  Only a
> > >> few properties are supported, at least for now.
> > >>
> > >> Dataset mountpoints behave the same as they would if created with the
> > >> standard ZFS tools.  So by default the root dataset's mountpoint is
> > >> /test, test/ds1's mountpoint is /test/ds1, etc.  If a dataset overrides
> > >> its default mountpoint, its children inherit that mountpoint.
> > >>
> > >> makefs builds the output filesystem using a single input directory tree.
> > >> Thus, makefs -t zfs requires that at least one of the datasets'
> > >> mountpoints map to /path/to/input; that is, there is a "root" mount
> > >> point.
> > >>
> > >> The -o rootpath parameter defines this root mount point.  By default it's
> > >> "/<poolname>".  All datasets in the pool must have their mountpoints
> > >> under this path, and one dataset's mountpoint must be equal to this
> > >> path.  To build bootable images, one sets -o rootpath=/.
> > >>
> > >> Putting it all together, one can build an image using the standard layout
> > >> with an invocation like this:
> > >>
> > >> makefs -t zfs -o poolname=zroot -s 20g -o rootpath=/ -o bootfs=zroot/ROOT/default \
> > >>      -o fs=zroot:canmount=off:mountpoint=none \
> > >>      -o fs=zroot/ROOT:mountpoint=none \
> > >>      -o fs=zroot/ROOT/default:mountpoint=/ \
> > >>      -o fs=zroot/tmp:mountpoint=/tmp:exec=on:setuid=off \
> > >>      -o fs=zroot/usr:mountpoint=/usr:canmount=off \
> > >>      -o fs=zroot/usr/home \
> > >>      -o fs=zroot/usr/ports:setuid=off \
> > >>      -o fs=zroot/usr/src \
> > >>      -o fs=zroot/usr/obj \
> > >>      -o fs=zroot/var:mountpoint=/var:canmount=off \
> > >>      -o fs=zroot/var/audit:setuid=off:exec=off \
> > >>      -o fs=zroot/var/crash:setuid=off:exec=off \
> > >>      -o fs=zroot/var/log:setuid=off:exec=off \
> > >>      -o fs=zroot/var/mail:atime=on \
> > >>      -o fs=zroot/var/tmp:setuid=off \
> > >>      ${HOME}/tmp/zfs.img ${HOME}/tmp/world
> > >>
> > >> I'll admit this is somewhat clunky, but it doesn't seem worse than what
> > >> we have to do otherwise, see poudriere-image for example:
> > >> https://github.com/freebsd/poudriere/blob/master/src/share/poudriere/image_zfs.sh#L79
> > >>
> > >> What do folks think of this interface?  Is there anything missing, or
> > >> anything that doesn't make sense?
> > > 
> > > I find it slightly confusing that -o options have a default namespace of
> > > pool options unless they have an fs=*: prefix, but making users type
> > > "pool:" for other options doesn't seem to make sense so this is probably
> > > the best solution.
> > > 
> > > The density of data in the filesystem specification does suggest that
> > > someone might want to create a UCL config file format eventually, but
> > > what's here already seems entirely workable.
> > > 
> > > -- Brooks
> > 
> > In normal `zpool create` they use -o for pool properties, and -O for 
> > dataset properties for the root dataset. I wonder if we might also want 
> > -o poolprop=value and -O zroot/var:mountpoint=/var:canmount=off
> > 
> > just to avoid the conceptual collision of those 2 different items.
> 
> Sadly -O is taken in makefs.

Though -O is already unsupported by some filesystem types (cd9660 in
particular).  I'm not sure whether -O is useful at all now that we have
mkimg(1): I presume that -O is useful when you already have a
partitioned disk image and want to fill in one of the partitions with a
filesystem.

There's a suggestion in the thread of having multiple hard links to
makefs; we could add a makefs_zfs which handles -O as Allan suggests.

> > One other possible issue: dataset properties can have a : in them, for 
> > user-defined properties. Do we maybe want to use a , to separate them 
> > instead? Although values can contain ,'s (the sharenfs property often 
> > does), so that probably doesn't work either.
> 
> One solution would be to allow the same fs=foo: to be specified multiple
> times (I've not checked if the current code allows this) to add options
> instead of having a separator.  That does make the command line even more
> clunky though.

The current code doesn't allow this, but it would of course be easy to
add.  Maybe we should support both modes.  Or maybe the real solution
is to introduce a UCL configuration format and keep the command-line
interface simple.
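
To make the separator problem concrete, here is a rough Python sketch of
the "-o fs=<dataset name>[:<prop>=<val>...]" syntax.  It is not the
makefs code (which is C), and the names parse_fs_spec and merge_fs_specs
are made up for illustration.  It assumes a plain split on ':' and shows
one possible way that repeated fs= options for the same dataset could be
merged.

```python
def parse_fs_spec(spec):
    """Split 'name[:prop=val[:prop=val...]]' into (name, {prop: val})."""
    name, *pairs = spec.split(":")
    if not name:
        raise ValueError("empty dataset name")
    props = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        # A piece without '=' is what a user property such as
        # 'com.example:note=hi' turns into after splitting on ':'.
        if not sep or not key:
            raise ValueError(f"malformed property {pair!r} in {spec!r}")
        props[key] = value
    return name, props


def merge_fs_specs(specs):
    """Hypothetical 'repeat the option' mode: later specs add properties."""
    datasets = {}
    for spec in specs:
        name, props = parse_fs_spec(spec)
        datasets.setdefault(name, {}).update(props)
    return datasets
```

With this sketch, "zroot/var:mountpoint=/var" followed by
"zroot/var:canmount=off" ends up as a single zroot/var entry with both
properties, while "test:com.example:note=hi" is rejected, which is the
ambiguity Allan points out.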

I haven't thought about this much yet since makefs currently doesn't
support setting arbitrary properties, just those few that I need to
build FreeBSD images.  I guess I'd want to see some specific use-cases
for specifying additional dataset/pool properties before deciding what
to do.