Re: TPM2 Support in bootloader / kernel in order to retrieve GELI passphrase

From: Stanislaw Adaszewski <s.adaszewski_at_gmail.com>
Date: Sun, 28 Nov 2021 08:13:23 UTC
Hi Ka Ho Ng,

Thanks for the interesting remarks!


> I believe putting that somewhere as container metadata for consumption by geom_eli
> would be more favorable for such usage. That should make it look more tailored given
> encryption is a volume property rather.

The system no longer thinks in terms of GELI once it considers a
rootfs. A ZFS-on-GELI rootfs is passed from the bootloader as
zfs:zroot/ROOT/default: and that's it, with no mention of GELI. See
below: the drive could be swapped for a hostile one before the kernel
gets to mounting the rootfs, so a check for /.passphrase_marker is
necessary AFTER vfs_mountroot(). The root ZFS pool could just as well
span multiple GELI devices, etc. It simply sits at a higher level
than GELI.


> In a strict environment it might be better to simply drop the prompt and directly
> panic instead.

This could be implemented as well, but it is currently not done because
it would not be enough on its own to ensure a trusted rootfs. Instead,
after ANY successful vfs_mountroot(), the rootfs is checked for the
presence of the correct /.passphrase_marker, and we panic if it is
missing or does not match.
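
To make the check concrete, here is a minimal userland sketch of the
comparison, using the digest convention from my previous message quoted
below (lower-case hex SHA256 of salt and passphrase). It is only a model
of the logic, not the kernel code: the function names are made up for
illustration, and it assumes that "salt | passphrase" means plain
concatenation with the salt first.

```shell
#!/bin/sh
# Hypothetical userland model of the marker check; not the kernel code.

# Print the lower-case hex SHA-256 of stdin with whichever tool exists.
sha256_hex() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum | cut -d' ' -f1
    elif command -v sha256 >/dev/null 2>&1; then
        sha256 -q                      # FreeBSD base system
    else
        openssl dgst -sha256 | awk '{print $NF}'
    fi
}

# Assumption: the digest input is the salt followed directly by the passphrase.
# usage: marker_digest <salt> <passphrase>
marker_digest() {
    printf '%s%s' "$1" "$2" | sha256_hex
}

# Succeed only if <root-dir>/.passphrase_marker holds the expected digest.
# usage: check_marker <root-dir> <salt> <passphrase>
check_marker() {
    marker="$1/.passphrase_marker"
    [ -f "$marker" ] || return 1
    expected=$(marker_digest "$2" "$3")
    actual=$(tr -d '\n' < "$marker")   # ignore a trailing newline in the file
    [ "$actual" = "$expected" ]
}
```

A caller would treat a non-zero return from check_marker the way the
kernel treats a mismatch, i.e. refuse to continue rather than fall back
to a prompt.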

This makes for a real protection. You could imagine unplugging the boot drive in
the early stages of kernel runtime (before mounting the rootfs) and plugging in
another one with a ZFS pool (with the same name) on an unencrypted
partition - AFAIK the kernel would pick it up without any complaint
and boot straight into a hostile userland while keeping all the secrets
available in memory. In fact, if you re-plugged the boot drive then,
it would use the cached GELI keys and decrypt it straight away.


> Despite tpm2_nvwrite is used instead of sealing. TPM non-volatile storage is not
> something we can spare as they are limited resources. Sealing does not require
> non-volatile storage in TPM module, while achieving what you have described.

The TPM PC platform specification says: "1. The TPM SHALL provide a minimum of
6962 bytes of NV Storage.". The passphrase uses only about a dozen bytes, so
that should be fine. Aren't sealed persistent objects on the TPM eating into
the same memory anyway? Perhaps you could elaborate?
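
As a rough sanity check of that arithmetic, the sketch below compares
the size of a passphrase file against the 6962-byte minimum quoted
above. The helper name nv_fits is made up for illustration, and the
check is deliberately simplistic: it ignores any other NV indices that
may already occupy space on a given TPM.

```shell
#!/bin/sh
# Minimum NV storage quoted from the PC platform specification above.
NV_MIN_BYTES=6962

# Succeed if the file is no larger than the minimum NV budget.
# usage: nv_fits <file>
nv_fits() {
    size=$(( $(wc -c < "$1") ))   # arithmetic expansion strips wc's padding
    [ "$size" -le "$NV_MIN_BYTES" ]
}
```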

In either case switching to sealed objects (if there really is a rationale)
would be a minor change.


Kind regards,

--
Stanislaw


On Sat, 27 Nov 2021 at 21:35, Stanislaw Adaszewski
<s.adaszewski@gmail.com> wrote:
>
> Hi Warner,
>
>
> Thanks a lot for the quick reaction - that helps. Accordingly,
> I have taken several actions (below). If you have any more tips
> how to push this forward, please let me know. For instance, I don't
> know whether there is a person formally responsible for this kind of
> contribution. Let's say you are happy with the general
> architecture of the solution and the quality of the code
> (it still requires some polishing) - is that enough to pull
> the changes into the codebase? How does that work?
> Thank you in advance!
>
>
> 1. I have rebased my changeset on top of the tip of the
> FreeBSD's main branch [1]
>
>
> 2. I have changed the /.passphrase_marker convention to hold
> (instead of the passphrase) a human-readable lower-case digest
> of SHA256(salt | passphrase) where salt is a new (optional)
> parameter which can be passed using another EFI variable:
> KernGeomEliPassphraseFromTpm2Salt.
>
> I think it is more for the peace of mind than anything else,
> as anyone having access to /.passphrase_marker would normally
> have to be the root user. Nevertheless it is perhaps nicer not
> to keep raw passphrases lying around in files.
>
> We still pass the passphrase to the kernel via a kernel variable
> kern.geom.eli.passphrase.from_tpm2.passphrase which is unset
> before the system becomes interactive. I could easily pass
> the hash computed by the bootloader instead - what do
> you think?
>
> One thing to keep in mind is that another kernel variable -
> kern.geom.eli.passphrase has been passed around like this
> as well and it is being unset precisely in init_main.c.
>
> But even more importantly, one has to keep in mind that
> geli_export_key_buffer() passes all GELI keys to the kernel
> anyway, so access to the encrypted drives is already possible
> by that alone. Not to mention that the root user can simply
> read the driver's master key with a simple geli show.
>
>
> 3) As an explanation, also to @Ka Ho Ng, the /.passphrase_marker
> serves only as a tag that reliably pairs a boot filesystem with
> the passphrase retrieved from the TPM. If we were to just
> retrieve the passphrase and pass it to any boot environment, then
> one could simply boot another OS with another root password and
> read all our secrets. The same goes for the root filesystem
> that is mounted in turn by the kernel. If one were, for example, to
> remove the boot drive, that would cause the kernel to drop out
> to the interactive mountroot> prompt, and then one could
> boot from another drive. That is unacceptable. The kernel is by default
> very "boot happy" - it tries really hard to boot SOMETHING rather
> than accept a strict specification of what is allowed to boot.
>
>
> 4) The code is fully functional and I have tested it quite a bit
> on a Zotac CI622 mini-PC with the latest BIOS update, which enables
> the TPM functionality on the Intel 10110U processor in there. If you
> have a TPM-capable BIOS and CPU, I would encourage you to try it; that
> way you will better understand how it works and why the design with
> the /.passphrase_marker is necessary. I am not 100% sure
> that there are no ways left to fool the system into booting or running
> arbitrary code. Such attacks would generally consist of causing some
> kind of error on the "trusted" UFS-on-GELI or ZFS-on-GELI systems and
> making the system drop out into some kind of interactive prompt, for
> example if one removes the drive during init. I hope that does not happen.
> I would certainly expect that no process drops out to an interactive prompt
> on a system with a root password set, but this kind of scare is
> a tradeoff of not using full VERIEXEC. It is however very convenient
> to say: just trust everything on the XXX-on-GELI since it was encrypted
> and is therefore tamper-proof. More tests are necessary, but VERIEXEC
> can also be enabled in addition to ensure that such weird scenarios do not
> happen.
>
>
> 5) @Ka Ho Ng, what you described is indeed what happens: the code relies
> on a PCR policy of the user's choice and uses that to read the passphrase.
>
>
> 6) Regarding ZFS encryption I am not sure if that is supported in the EFI
> bootloader - at first glance I would say that it isn't. With that said,
> the code can be used to further modify the loader to read any kind of
> values stored in the TPM and put them in kernel variables for later use
> in the boot process. Just a big WARNING: kenv seems to let even lusers read
> the variables. So whatever one would do with them, it would have to be done
> quickly and the variables would need to be discarded before letting
> the lusers log in.
>
>
> [1] https://github.com/sadaszewski/freebsd-src/compare/main...main-cherry-pick-tpm-support-in-stand
>
>
> Kind regards,
>
> --
> Stanislaw
>
> On Sat, 27 Nov 2021 at 18:00, Warner Losh <imp@bsdimp.com> wrote:
> >
> >
> >
> > On Sat, Nov 27, 2021, 7:36 AM Stanislaw Adaszewski <s.adaszewski@gmail.com> wrote:
> >>
> >> Dear All,
> >>
> >> Could you please guide me so that we can together integrate
> >> the following piece of work into the FreeBSD base system?
> >> Thank you for your time and consideration.
> >
> >
> > See below for some advice.
> >
> >> I have created the following bundle of work [1]. The referenced
> >> patch implements on top of releng/13.0, the support for TPM2
> >> in the EFI bootloader and in the kernel in order to allow for
> >> storage and retrieval of a GELI passphrase in a TPM2 module,
> >> secured with a PCR policy.
> >>
> >> The way the bootloader behavior is modified is the following:
> >>
> >> 1) before calling efipart_inithandles() an attempt to retrieve the
> >> passphrase from a TPM2 module might be performed -
> >> how this is achieved is described later on.
> >> 2) if a passphrase is indeed retrieved, then after determining
> >> currdev, the currdev is checked for the presence of a
> >> /.passphrase_marker file which must contain the same passphrase
> >> as retrieved from the TPM. This is supposed to ensure that we
> >> do not end up booting an environment not on the device we just
> >> unlocked with the passphrase.
> >> 3a) If all is go, the autoboot_delay is set to -1 in order to prevent
> >> any interaction and continue the boot process of the "safe" environment,
> >> a 'kern.geom.eli.passphrase.from_tpm2.passphrase' variable is set
> >> to the passphrase from TPM in order for kernel use later, as well as a
> >> 'kern.geom.eli.passphrase.from_tpm2.was_retrieved'='1' variable.
> >> 3b) If the passphrase marker does not match, the bootloader cleans up
> >> GELI keys, the TPM passphrase and kern.geom.eli.passphrase and exits.
> >
> >
> > I worry about information disclosure having the pass phrase available on the running system with this setup. Can you explain why that design point was selected? Usually there is something signed with a private key that the public key can be used to verify instead of a direct comparison like this to prevent disclosure of key material. I've not looked at the code yet, so it may already do this...
> >
> >> The way the kernel behavior is modified is the following:
> >>
> >> 1) In init_main.c, after vfs_mountroot() a check is added
> >> 2a) If kern.geom.eli.passphrase.from_tpm2.was_retrieved is not
> >> set to 1, then we do nothing and continue the boot process
> >> 2b) If the was_retrieved variable is set to '1' then we check for the
> >> same passphrase marker as the bootloader, its content compared
> >> against the 'kern.geom.eli.passphrase.from_tpm2.passphrase'
> >> variable.
> >> 3a) If all is go, the passphrase variable is unset and the boot process
> >> continues,
> >> 3b) If the passphrase marker does not match, we panic.
> >
> >
> > I'm sure that init_main should not know about geom or geli. This is usually done with a handler of some sort so the mountroot code can just call the generic handler. Can your code be restructured such that this is possible?  The reason I ask is that ZFS supports encryption too and it would be nice to use that instead of geli.
> >
> >> The configuration of the bootloader for this procedure looks as follows:
> >>
> >> 1) We set an efivar KernGeomEliPassphraseFromTpm2NvIndex
> >> to contain the TPM2 NV Index we store our passphrase in, e.g. 0x1000001
> >> 2) We set an efivar KernGeomEliPassphraseFromTpm2PolicyPcr
> >> to contain the PCR policy used to secure the passphrase, e.g.
> >> sha256:0,2,4,7
> >> 3a) If both are set, the bootloader will attempt to retrieve the passphrase
> >> and behave in the modified way described above
> >> 3b) Otherwise, it behaves as the vanilla version and will ask for GELI
> >> passphrases if necessary
> >>
> >> The configuration of the TPM and the passphrase marker looks as follows:
> >>
> >> 1) echo -n "passphrase" >/.passphrase_marker
> >> 2) chmod 600 /.passphrase_marker
> >> 3) tpm2_createpolicy -L policy.digest --policy-pcr -l sha256:0,2,4,7
> >> 4) tpm2_nvdefine -Q 0x1000001 -s `wc -c < /.passphrase_marker` -L
> >> policy.digest -A "policyread|policywrite"
> >> 5) tpm2_nvwrite -Q 0x1000001 -i /.passphrase_marker -P pcr:sha256:0,2,4,7
> >>
> >> [1] https://github.com/sadaszewski/freebsd-src/compare/releng/13.0...tpm-support-in-stand
> >
> >
> > This sounds cool. Any chance you can rebase this to the tip of the main branch? All code goes into FreeBSD that way and 13.0 is about a year old already.
> >
> > Thanks for sharing this. Despite some reservations expressed above, I think this has potential to be quite cool.
> >
> > Warner
> >
> >>
> >> Kind regards,