[Bug 192066] New: sysutils/grub2 and ZFS: wrong lz4 endianness

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Wed Jul 23 17:34:17 UTC 2014


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192066

            Bug ID: 192066
           Summary: sysutils/grub2 and ZFS: wrong lz4 endianness
           Product: Ports Tree
           Version: Latest
          Hardware: Any
                OS: Any
            Status: Needs Triage
          Severity: Affects Only Me
          Priority: ---
         Component: Individual Port(s)
          Assignee: freebsd-ports-bugs at FreeBSD.org
          Reporter: aaz at q-fu.com

Created attachment 144914
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=144914&action=edit
fix

I am using GRUB to boot the kernel directly from ZFS.

Not long after an upgrade to a recent 10-stable (r268881), GRUB stopped being
able to see the pool and boot. After completing an appropriate recovery effort
and finally booting the system again, I used gdb on grub-probe to determine
that the problem was in lz4 decompression of the uberblock.

Here is the problematic code in GRUB 2.00 (with FreeBSD port patches):

grub-core/fs/zfs/zfs_lz4.c:

    #if BYTE_ORDER == BIG_ENDIAN

Apparently <sys/endian.h> isn't included, so BYTE_ORDER and BIG_ENDIAN are both
undefined. The preprocessor treats undefined identifiers in #if as 0, so the
test becomes 0 == 0 and the code incorrectly assumes a big-endian system. Based
on this assumption it byte-swaps a 2-byte offset field in the compressed data,
which makes the data appear corrupt, and decompression fails.
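
To illustrate the pitfall, here is a minimal sketch (not the GRUB code). The
names DEMO_BYTE_ORDER and DEMO_BIG_ENDIAN are hypothetical and deliberately
left undefined, standing in for BYTE_ORDER and BIG_ENDIAN when the defining
header is missing. The helper names are also made up for this example.

```c
#include <stdint.h>

/*
 * DEMO_BYTE_ORDER and DEMO_BIG_ENDIAN are hypothetical names that are
 * deliberately never defined, standing in for BYTE_ORDER and BIG_ENDIAN
 * when the header that defines them is not included.
 */

/* Returns 1 if the preprocessor selected the "big-endian" branch. */
static int took_big_endian_branch(void)
{
#if DEMO_BYTE_ORDER == DEMO_BIG_ENDIAN  /* undefined names: 0 == 0, true */
    return 1;
#else
    return 0;
#endif
}

/* Unconditional 16-bit byte swap, as a big-endian code path would apply. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}
```

On any host, took_big_endian_branch() returns 1, and an offset that should
read as 0x1234 on a little-endian machine becomes 0x3412 after swap16().
Compilers such as gcc and clang can warn about this with -Wundef.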

I am not sure why this problem happened to manifest just now, since GRUB hasn't
been updated in a while, but I think the recent kernel happens to lz4-compress
the uberblock and earlier kernels happened to lzjb-compress or not compress it,
leaving the problem unnoticed.

This causes disturbing messages like "error: no such device: <pool id>." and
"lz4 decompression failed" at the GRUB prompt, and this:

# grub-probe -d /dev/gpt/mypool
grub-probe: error: unknown filesystem.

The fix is simply to add #include <sys/endian.h> at the top of zfs_lz4.c. With
that change applied:

# grub-probe -d /dev/gpt/mypool
zfs

Note I am also using the patch from bug 188524 for the "hole_birth" feature and
I haven't enabled the "embedded_data" feature on my pool yet. A newly created
pool doesn't work in GRUB because of those feature flags, regardless of lz4.

The latest GRUB source uses grub_le_to_cpu16() instead of BYTE_ORDER, so the
problem should resolve itself in future versions.
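
For comparison, a 2-byte little-endian field can be read without any
compile-time byte-order test at all. The sketch below is only an illustration
of that idea under my own naming (read_le16 is a hypothetical helper); it is
not how grub_le_to_cpu16() is implemented.

```c
#include <stdint.h>

/*
 * Hypothetical helper: assemble a 16-bit value from two bytes stored in
 * little-endian order. The result is the same on any host byte order,
 * because only byte indexing and shifts are used.
 */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}
```

For example, the byte sequence 0x34, 0x12 reads as 0x1234 regardless of the
host's endianness, so a missing header cannot silently flip the result.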

-- 
You are receiving this mail because:
You are the assignee for the bug.

