[Bug 192066] New: sysutils/grub2 and ZFS: wrong lz4 endianness
bugzilla-noreply at freebsd.org
Wed Jul 23 17:34:17 UTC 2014
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192066
Bug ID: 192066
Summary: sysutils/grub2 and ZFS: wrong lz4 endianness
Product: Ports Tree
Version: Latest
Hardware: Any
OS: Any
Status: Needs Triage
Severity: Affects Only Me
Priority: ---
Component: Individual Port(s)
Assignee: freebsd-ports-bugs at FreeBSD.org
Reporter: aaz at q-fu.com
Created attachment 144914
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=144914&action=edit
fix
I am using GRUB to boot the kernel directly from ZFS.
Not long after an upgrade to a recent 10-stable r268881, GRUB stopped being
able to see the pool and boot. Having completed an appropriate recovery effort
and finally booting the system again, I used gdb on grub-probe to determine
that the problem was in lz4 decompression of the uberblock.
Here is the problematic code in GRUB 2.00 (with FreeBSD port patches):
grub-core/fs/zfs/zfs_lz4.c:
#if BYTE_ORDER == BIG_ENDIAN
Apparently <sys/endian.h> isn't included, so neither macro is defined. In an
#if expression an undefined identifier evaluates to 0, so the test becomes
0 == 0 and the code incorrectly assumes a big-endian system. Based on this
assumption it byte-swaps a 2-byte offset field in the compressed data, which
makes the data appear corrupt, and decompression fails.
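A minimal sketch of the preprocessor behaviour described above. The names
DEMO_BYTE_ORDER, DEMO_BIG_ENDIAN, demo_swap16 and demo_offset are hypothetical
stand-ins and are not taken from zfs_lz4.c; the DEMO_ macros are deliberately
left undefined to mimic BYTE_ORDER and BIG_ENDIAN when no header defines them.

```c
#include <stdint.h>

/* Neither DEMO_ macro is defined anywhere (an assumption of this sketch).
 * Inside #if, each undefined identifier is replaced by 0, so the test
 * below is 0 == 0 and the "big-endian" branch is selected on any host. */
#if DEMO_BYTE_ORDER == DEMO_BIG_ENDIAN
#define DEMO_TAKES_BIG_ENDIAN_BRANCH 1
#else
#define DEMO_TAKES_BIG_ENDIAN_BRANCH 0
#endif

/* Swap the two bytes of a 16-bit value. */
static uint16_t demo_swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Return the offset as the miscompiled code would see it: byte-swapped
 * when the big-endian branch was (wrongly) taken, unchanged otherwise. */
uint16_t demo_offset(uint16_t correct_offset)
{
#if DEMO_TAKES_BIG_ENDIAN_BRANCH
    return demo_swap16(correct_offset);
#else
    return correct_offset;
#endif
}
```

With the macros undefined, demo_offset(0x0010) yields 0x1000, which is the kind
of bogus offset that would make valid compressed data look corrupt. Compilers
such as GCC and Clang can warn about this pattern with -Wundef.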
I am not sure why this problem happened to manifest just now, since GRUB hasn't
been updated in a while, but I think the recent kernel happens to lz4-compress
the uberblock and earlier kernels happened to lzjb-compress or not compress it,
leaving the problem unnoticed.
This causes disturbing messages like "error: no such device: <pool id>." and
"lz4 decompression failed" at the GRUB prompt, and this:
# grub-probe -d /dev/gpt/mypool
grub-probe: error: unknown filesystem.
The fix is simply adding #include <sys/endian.h> at the top of zfs_lz4.c, after
which grub-probe detects the filesystem:
# grub-probe -d /dev/gpt/mypool
zfs
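As a complement to the include, a compile-time guard could turn a missing
definition into a build error instead of a silent misdetection. The sketch
below is not part of the attached fix. It assumes a GCC- or Clang-compatible
compiler and uses their predefined __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__
macros so that it builds without <sys/endian.h>; DEMO_HOST_IS_BIG_ENDIAN and
demo_runtime_is_big_endian are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Refuse to compile if the byte-order macros are missing, rather than
 * letting the comparison below silently become 0 == 0. */
#if !defined(__BYTE_ORDER__) || !defined(__ORDER_BIG_ENDIAN__)
#error "byte-order macros are undefined; include the header that defines them"
#endif

#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define DEMO_HOST_IS_BIG_ENDIAN 1
#else
#define DEMO_HOST_IS_BIG_ENDIAN 0
#endif

/* Runtime cross-check: look at the first byte in memory of the value 1.
 * It is 0 on a big-endian host and 1 on a little-endian host. */
int demo_runtime_is_big_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 0;
}
```

The same guard would work with BYTE_ORDER and BIG_ENDIAN once the header that
defines them is included.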
Note I am also using the patch from bug 188524 for the "hole_birth" feature and
I haven't enabled the "embedded_data" feature on my pool yet. A newly created
pool doesn't work in GRUB because of those feature flags, regardless of lz4.
The latest GRUB source uses grub_le_to_cpu16() instead of BYTE_ORDER, so the
problem should resolve itself in future versions.
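For illustration, a 2-byte little-endian field can also be read without any
byte-order macro at all. This is only a sketch of the general idea and not
GRUB's implementation of grub_le_to_cpu16(); demo_read_le16 is a hypothetical
helper.

```c
#include <stdint.h>

/* Assemble a 16-bit value from two bytes stored least-significant first.
 * The result is the same on little- and big-endian hosts because the
 * bytes are combined arithmetically instead of being loaded as a word. */
uint16_t demo_read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

Given the bytes 0x34 0x12, demo_read_le16 returns 0x1234 regardless of host
byte order, so no conditional compilation is needed.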