RFC: enhancing the root mount logic

Tue Aug 24 02:27:26 UTC 2010

On Aug 23, 2010, at 5:14 PM, M. Warner Losh wrote:

> : > And how do you emulate the mount_foo programs for
> : > foo filesystems?  Some of them do weird things that might not
> : > translate well into the kernel...
> : 
> : True. I haven't fleshed that out, but I was hoping that nmount(2)
> : would have normalized most of this so that it's a non-issue, provided
> : we support mount options in this scheme.
> : 
> : If you have a concrete example of something that's not so trivial,
> : but critical to support, let me know and I'll take it into account.
> 
> mount_smbfs presently connects to the remote system to do
> authentication and initializes the smb context before mounting the
> file system in the kernel.  I don't know if I'd call this a
> critical-to-support feature, but it was the first "exception" to the
> rule that jumped into my head, so I was curious if you'd thought
> about it.

smbfs is definitely out of scope :-)
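
For the file systems that are in scope, here is a rough sketch of the
name/value form I have in mind, seen from userland. nmount(2) takes an
array of struct iovec in which entries alternate between an option
name and its value, each length including the terminating NUL. The
helpers add_opt() and root_mount_opts() are hypothetical names made up
for this mail, and the cd9660-on-md0.uzip values are just the example
from this thread. The sketch only builds the option list; it does not
call nmount(2) and says nothing about how the kernel side would
consume the options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: append one name/value option as two iovec
 * entries. Lengths include the terminating NUL, which is what
 * nmount(2) expects for string options. Returns the new entry count.
 */
static size_t
add_opt(struct iovec *iov, size_t n, size_t max,
    const char *name, const char *val)
{
	assert(iov != NULL && n + 2 <= max);
	iov[n].iov_base = (void *)(uintptr_t)name;
	iov[n].iov_len = strlen(name) + 1;
	iov[n + 1].iov_base = (void *)(uintptr_t)val;
	iov[n + 1].iov_len = strlen(val) + 1;
	return (n + 2);
}

/*
 * Hypothetical example: the options that describe mounting a cd9660
 * file system from /dev/md0.uzip on /. The resulting array is what
 * would be handed to nmount(iov, niov, flags).
 */
static size_t
root_mount_opts(struct iovec *iov, size_t max)
{
	size_t n = 0;

	n = add_opt(iov, n, max, "fstype", "cd9660");
	n = add_opt(iov, n, max, "fspath", "/");
	n = add_opt(iov, n, max, "from", "/dev/md0.uzip");
	return (n);
}
```

The point is that anything expressible as such a flat list of strings
carries over easily; mount_smbfs-style setup work does not.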

> : > As you can see, I'm torn about how I feel about the idea.  For simple
> : > cases, I think it is great, but as complexity builds, I become less
> : > sure.  What if that iso image was compressed?
> : 
> : Can you elaborate how this is potentially a problem in this scheme,
> : but not for "manual" mounting?
> 
> You'd need a way to stack up different modules, since you'd need
> geom_uzip over md0 to make it useful to the cd9660 code.

This is a perfect example, actually. I'll think about this in the
context of my idea...

> init(8) is the show stopper to a pivot root approach, unless you could
> tell the init that's on the first level simply to exec /sbin/init to
> pick up the new copy, but I don't know how happy that would make the
> kernel.

I think a handshake is doable. If all else fails, you
simply tell the kernel to always re-exec init when
it exits (rather than panicking, which isn't exactly
a product-friendly response to init exiting).

> and if we had one more layer on nand:
> 
> Filesystem     1024-blocks     Used    Avail Capacity  Mounted on
> /dev/nor0             4096     4096        0     110%  /
> /dev/md0.uzip        16000    16000        0     110%  /
> /dev/nand0          320000   300000    20000      82%  /
> 
> or
> 
> Filesystem     1024-blocks     Used    Avail Capacity  Mounted on
> /dev/nor0             4096     4096        0     110%  /.old_root/.old_root
> /dev/md0.uzip        16000    16000        0     110%  /.old_root
> /dev/nand0          320000   300000    20000      82%  /
> 
> is the question I'm asking...

I think it would be:

/dev/nor0	/.old_root
/dev/md0.uzip	/.old_root
/dev/nand0	/

> Anyway, the fact that we have a decoupled fork/exec really is what
> led me to ask the question.  It is useful to run arbitrary code
> between the two, even if you usually run the same code...  sometimes
> you want to be different.  I was thinking that this might be the same 
> way here.  But, as you rightly point out, maybe there's too much
> complexity in doing that and simpler is better.

I'll chew on the geom_uzip example you gave. There's value
in allowing the full power of GEOM when doing a root mount.

Thanks,

-- 
Marcel Moolenaar
xcllnt at mac.com