mtree "language" enhancements

Tim Kientzle tim at kientzle.com
Sun Nov 29 18:59:45 UTC 2015


Sounds interesting.

Have you talked with Michal (CCed) who is working on a libmtree library?

The capabilities you're describing here really need to be bundled into a library, I think.  In particular, the ability to "unlink", "copy", etc., is much more useful if you can directly query the mtree file contents to perform conditional changes.  (For example, it may be important to remove an empty directory, which requires you to be able to query whether a directory has files in it.)

I would also be interested in a description of the processing model.  It sounds like you're assuming the same model used by the current mtree program -- mtree files are processed sequentially line-by-line as they are read.

For instance, libarchive's mtree processor works differently; it reads the entire input, merging redundant lines for the same file, and then processes the list.  This is more explicitly declarative, and simplifies things like modifying the ownership or permissions of already-listed files.
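
The read-everything-then-merge model can be sketched in a few lines of Python. This is only an illustration, not libarchive's actual code: the helper names are hypothetical, the parser ignores /set, relative paths, and escaping, and the "later line wins" precedence is an assumption made for the example.

```python
def parse_line(line):
    """Split 'path key=value ...' into (path, {key: value})."""
    path, *words = line.split()
    return path, dict(word.split("=", 1) for word in words)


def merge_entries(lines):
    """Read all lines first, merging keywords of repeated paths.

    Assumption for this sketch: a later line overrides earlier keywords.
    """
    entries = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        path, keywords = parse_line(line)
        entries.setdefault(path, {}).update(keywords)
    return entries


spec = [
    "foo/bar type=file owner=root mode=0644",
    "foo/baz type=file owner=root mode=0644",
    "# tighten up foo/bar after the fact",
    "foo/bar mode=0755",
]
entries = merge_entries(spec)
for path, keywords in entries.items():
    print(path, keywords)
```

Because the second foo/bar line only carries mode, the merged entry keeps its type and owner and picks up the new mode, which is the effect described above for already-listed files.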

> Each action entry would have an 'action' keyword.

In terms of the language per se, this seems unnecessary.  I've proposed alternative syntax below that drops "type=action" and simply adds new keywords.

> The keywords I've defined
> so far are as follows:
> 1. "unlink" which throws away the previous entry. That entry has been
> removed. It may apply to files or directories, but it is an error not to
> remove all entries in a directory when removing the directory.

# When set on an entry, a matching file on disk will be removed.
# This would also be useful for things like ObsoleteFiles
unlink=true
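
A rough Python sketch of how a tool might resolve unlink=true against an already-merged manifest, including the "directory must be empty" check from the quoted proposal. It assumes the keyword is applied to the manifest rather than to a live filesystem, and that entries are held as a dict of path to keywords; apply_unlinks is a hypothetical name.

```python
def apply_unlinks(entries):
    """Drop entries marked unlink=true; refuse to drop a non-empty directory."""
    doomed = {p for p, kw in entries.items() if kw.get("unlink") == "true"}
    for path in doomed:
        if entries[path].get("type") != "dir":
            continue
        # A directory may only go away if everything below it goes too.
        survivors = sorted(
            p for p in entries
            if p.startswith(path + "/") and p not in doomed
        )
        if survivors:
            raise ValueError(f"{path} is not empty: {survivors}")
    return {p: kw for p, kw in entries.items() if p not in doomed}


manifest = {
    "old": {"type": "dir", "unlink": "true"},
    "old/stale.conf": {"type": "file", "unlink": "true"},
    "keep": {"type": "file"},
}
remaining = apply_unlinks(manifest)
print(sorted(remaining))
```

The emptiness check is exactly the kind of query that is awkward in a line-by-line processor and easy once the whole manifest is in memory.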

> 2. "move" which relocates a previous entry. An additional targetpath
> keyword specifies the ultimate destination for this entry.

# When set on an entry, moves the existing file to the new name
rename=<targetpath>

# Example
foo/bar type=file owner=root mode=0755 rename=foo/baz
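
One possible way to resolve rename on a merged manifest, sketched in Python. The function name and the collision rule (a rename onto a path that is already listed is an error) are assumptions for illustration only.

```python
def apply_renames(entries):
    """Re-key each entry carrying rename=<targetpath> under its new path."""
    result = {}
    for path, keywords in entries.items():
        keywords = dict(keywords)               # do not mutate the input
        target = keywords.pop("rename", path)   # unchanged if no rename
        if target in result:
            raise ValueError(f"{target} is listed more than once")
        result[target] = keywords
    return result


manifest = {
    "foo/bar": {"type": "file", "owner": "root", "mode": "0755",
                "rename": "foo/baz"},
}
renamed = apply_renames(manifest)
print(renamed)
```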

> 3. "copy" which duplicates a previous entry. It too takes target path.

# As with rename, except it copies the contents.
copy_from=<original>

# Properties that are not specified will be copied as well
# Create foo/bar by copying foo/baz, preserving all attributes
foo/bar type=file copy_from=foo/baz
# Create foo/bar as above, but modify the owner
foo/bar owner=dialer type=file copy_from=foo/baz
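
The inheritance rule for copy_from (unspecified properties come from the original, specified ones override) could be resolved roughly as follows. This is a sketch with a hypothetical function name; it does not follow chains of copies, and it keeps copy_from on the result so a later stage still knows where the contents come from.

```python
def resolve_copies(entries):
    """Fill in unspecified keywords of copy_from entries from their source."""
    resolved = {}
    for path, keywords in entries.items():
        source = keywords.get("copy_from")
        if source is None:
            resolved[path] = dict(keywords)
            continue
        if source not in entries:
            raise KeyError(f"{path}: copy_from source {source} is not listed")
        inherited = dict(entries[source])
        inherited.pop("copy_from", None)  # chained copies are not followed
        inherited.update(keywords)        # explicit keywords win
        resolved[path] = inherited
    return resolved


manifest = {
    "foo/baz": {"type": "file", "owner": "root", "mode": "0644"},
    "foo/bar": {"type": "file", "owner": "dialer", "copy_from": "foo/baz"},
}
resolved = resolve_copies(manifest)
print(resolved["foo/bar"])
```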

> 4. "meta" which changes the meta data of the previous entry. All keywords
> on this are merged with the previous entry.

As above, libarchive's mtree processor already does this by default; no language change is needed.

> The one other thing that my merging tool does is to remove all size
> keywords. ... [comments about modifying existing files]

One common case here is appending new contents to an existing file.  That could similarly be handled with the same pattern:

# Append from source
foo/bar append_from=<source path>

In particular, that removes the need to find the source file in order to modify it in-place.  I've run into various headaches with Crochet when the /usr/obj layout changes between releases and Crochet cannot find the new location of a file.  An append keyword would remove many of the cases where a file has to be modified in-place, though not all of them.
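
As a sketch of what a consumer of append_from might do, the Python below appends the named source to the staged copy of the entry and drops any recorded size, since it is stale after the append. Everything here is an assumption for illustration: apply_append is a hypothetical helper, staging_root stands in for wherever the tool materializes the tree, and the file names are made up.

```python
import os
import shutil
import tempfile


def apply_append(entry_path, keywords, staging_root):
    """Append the append_from source to the staged file; return new keywords."""
    keywords = dict(keywords)
    source = keywords.pop("append_from")
    target = os.path.join(staging_root, entry_path)
    with open(source, "rb") as src, open(target, "ab") as dst:
        shutil.copyfileobj(src, dst)
    keywords.pop("size", None)  # the recorded size no longer matches
    return keywords


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "etc"))
    with open(os.path.join(root, "etc", "ttys"), "w") as f:
        f.write("ttyv0 on\n")
    extra = os.path.join(root, "ttys.extra")
    with open(extra, "w") as f:
        f.write("ttyu0 on\n")

    entry = {"type": "file", "size": "9", "append_from": extra}
    updated = apply_append("etc/ttys", entry, root)
    with open(os.path.join(root, "etc", "ttys")) as f:
        staged = f.read()

print(updated)
```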

Cheers,

Tim



> On Nov 29, 2015, at 10:04 AM, Warner Losh <imp at bsdimp.com> wrote:
> 
> Greetings,
> 
> As part of making NanoBSD buildable by non-root, I've found a need to have
> a richer mtree language than we currently have.
> 
> mtree started out as a language to express hierarchies of files. It does a
> decent job at that, even if some of the tools that we have in the tree
> aren't so great about manipulating them. One could easily wish for better
> tools, but that's not the topic of this thread.
> 
> So, I've started to move the language into one that can also journal
> changes to a tree, and have been moving NanoBSD to using wrappers that do
> the changes to the tree and record the journal events at the end of the
> metalog produced from buildworld. I have a second tool that reads the meta
> log, and applies the actions to the earlier entries and then produces a
> final metalog that's used for makefs. These tools are still evolving, but
> before I got too close to the point of committing, I thought I'd post a
> proposed extension to mtree for comments so I don't have to change too much.
> 
> I'd like a new type called 'action' (so type=action in the records). This
> type is defined loosely to manipulate an earlier entry (or maybe entries,
> still unsure) in the file.
> 
> Each action entry would have an 'action' keyword. The keywords I've defined
> so far are as follows:
> 1. "unlink" which throws away the previous entry. That entry has been
> removed. It may apply to files or directories, but it is an error not to
> remove all entries in a directory when removing the directory.
> 2. "move" which relocates a previous entry. An additional targetpath
> keyword specifies the ultimate destination for this entry.
> 3. "copy" which duplicates a previous entry. It too takes targetpath.
> 4. "meta" which changes the meta data of the previous entry. All keywords
> on this are merged with the previous entry.
> 
> The one other thing that my merging tool does is to remove all size
> keywords. In the NanoBSD environment, size is irrelevant. Files are
> replaced and appended to all the time in the build process, and it doesn't
> make sense to track the size. makefs fails if the size is different, so
> post-processing of the tree, say to add a new default to
> /etc/defaults/rc.conf or to tweak /etc/ttys to turn on/off a tty (or append
> a new entry) will cause it to fail. It would be nice if mtree could do this,
> but it simply can't (but see above for whining about better tools being
> beyond the scope of this).
> 
> If things go well, we could eventually move these extensions into mtree so
> that the post-processing stage is no longer necessary. I'm content to
> maintain the hundred or two lines of awk I've written to implement it. I
> chose awk because it does the job well enough, though python might do it
> better. But I don't want to talk about that choice since right now it is
> purely internal to NanoBSD (though I hope that other build orchestration
> systems like src/release and crochet look to adopt).
> 
> Comments?
> 
> Warner
> _______________________________________________
> freebsd-arch at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to "freebsd-arch-unsubscribe at freebsd.org"
> 


