ZFS on root booting broken somewhere after r270020
Kimmo Paasiala
kpaasial at icloud.com
Thu Sep 11 01:18:51 UTC 2014
> On 11.9.2014, at 3.04, Kimmo Paasiala <kpaasial at icloud.com> wrote:
>
>
>> On 11.9.2014, at 2.41, Steven Hartland <killing at multiplay.co.uk> wrote:
>>
>>
>> ----- Original Message ----- From: "Steven Hartland" <killing at multiplay.co.uk>
>> To: "Kimmo Paasiala" <kpaasial at icloud.com>
>> Cc: <freebsd-stable at freebsd.org>
>> Sent: Wednesday, September 10, 2014 11:36 PM
>> Subject: Re: ZFS on root booting broken somewhere after r270020
>>
>>
>>>
>>> ----- Original Message ----- From: "Kimmo Paasiala" <kpaasial at icloud.com>
>>> To: "Steven Hartland" <killing at multiplay.co.uk>
>>> Cc: <freebsd-stable at freebsd.org>
>>> Sent: Wednesday, September 10, 2014 8:26 PM
>>> Subject: Re: ZFS on root booting broken somewhere after r270020
>>>
>>>
>>>>
>>>>> On 9.9.2014, at 19.03, Kimmo Paasiala <kpaasial at icloud.com> wrote:
>>>>>
>>>>>
>>>>>> On 9.9.2014, at 18.53, Steven Hartland <killing at multiplay.co.uk> wrote:
>>>>>>
>>>>>> ----- Original Message ----- From: "Kimmo Paasiala" <kpaasial at icloud.com>
>>>>>>> Hi it’s me again. Something that was committed in stable/10 after r271213 up to
>>>>>>> and including r271288 broke ZFS on Root booting in exactly the same way again.
>>>>>>> I know the problem is no longer related to extra kernel modules loaded in
>>>>>>> /boot/loader.conf because I’m loading only the required zfs.ko and opensolaris.ko
>>>>>>> modules. Also, the new vt(4) console that I’m using is not the culprit because the
>>>>>>> same thing happens with kern.vty set to “sc”.
>>>>>>
>>>>>> I've just updated my stable/10 box to r271316 and no problems booting from a ZFS root.
>>>>>>
>>>>>> So first things first what error are you seeing?
>>>>>>
>>>>>> Next, what is your:
>>>>>> * Hardware
>>>>>> * Pool layout
>>>>>>
>>>>>> Regards
>>>>>> Steve
>>>>>
>>>>> The error is the same as before:
>>>>>
>>>>> • Mounting from zfs:rdnzltank/ROOT/default failed with error 5.
>>>>>
>>>>> Followed by the mountroot prompt and I get only these devices to choose from, no sign of the ZFS pool:
>>>>>
>>>>> • mountroot>
>>>>> • List of GEOM managed disk devices:
>>>>> • gpt/fb10disk1 gpt/fb10swap1 diskid/DISK-S13UJDWS301624p3 diskid/DISK-S13UJDWS301624p2 diskid/DISK-S13UJDWS301624p1 ada0p3 ada0p2 ada0p1 diskid/DISK-S13UJDWS301624 ada0
>>>>>
>>>>> Hardware is a Gigabyte GA-D510UD Mini-ITX motherboard:
>>>>>
>>>>> http://www.gigabyte.com/products/product-page.aspx?pid=3343#ov
>>>>>
>>>>> 4GBs of RAM. One 750GB Samsung HD753LJ 3.5” SATA HD on the Intel SATA controller.
>>>>>
>>>>> Pool layout:
>>>>>
>>>>> pool: rdnzltank
>>>>> state: ONLINE
>>>>> scan: scrub repaired 0 in 1h7m with 0 errors on Wed Aug 20 09:27:48 2014
>>>>> config:
>>>>>
>>>>> NAME             STATE     READ WRITE CKSUM
>>>>> rdnzltank        ONLINE       0     0     0
>>>>>   gpt/fb10disk1  ONLINE       0     0     0
>>>>>
>>>>> errors: No known data errors
>>>>>
>>>>> Output of ‘gpart show’:
>>>>>
>>>>> freebsd10 ~ % gpart show
>>>>> =>        34  1465146988  ada0  GPT  (699G)
>>>>>           34        2014        - free -  (1.0M)
>>>>>         2048        1024     1  freebsd-boot  (512K)
>>>>>         3072        1024        - free -  (512K)
>>>>>         4096    16777216     2  freebsd-swap  (8.0G)
>>>>>     16781312  1448365710     3  freebsd-zfs  (691G)
>>>>>
>>>>>
>>>>> HTH,
>>>>>
>>>>> -Kimmo
>>>>
>>>>
>>>> More information. This version still works:
>>>>
>>>> FreeBSD freebsd10.rdnzl.info 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #0 r271237: Wed Sep 10 11:00:15 EEST 2014 root@buildstable10amd64.rdnzl.info:/usr/obj/usr/src/sys/GENERIC amd64
>>>>
>>>> The next higher version r271238 breaks booting for me. The commit in question is this one:
>>>>
>>>> http://svnweb.freebsd.org/base?view=revision&sortby=rev&sortdir=down&revision=271238
>>>
>>> Investigating, had no reports of issues while this has been in head.
>>
>> I've just installed a stable/10 kernel, specifically:
>> 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #11 r271316M
>>
>> and booted fine from a mirrored root without issue:
>> config:
>>
>> NAME        STATE     READ WRITE CKSUM
>> tank        ONLINE       0     0     0
>>   mirror-0  ONLINE       0     0     0
>>     ada0p3  ONLINE       0     0     0
>>     ada2p3  ONLINE       0     0     0
>>
>> gpart show ada0 ada2
>> =>       34  250069613  ada0  GPT  (119G)
>>          34        128     1  freebsd-boot  (64K)
>>         162    8388608     2  freebsd-swap  (4.0G)
>>     8388770  241680877     3  freebsd-zfs  (115G)
>>
>> =>       40  586072288  ada2  GPT  (279G)
>>          40        128     1  freebsd-boot  (64K)
>>         168    8388608     2  freebsd-swap  (4.0G)
>>     8388776  577683552     3  freebsd-zfs  (275G)
>>
>> I then detached the second disk so the machine had just:
>> config:
>>
>> NAME      STATE     READ WRITE CKSUM
>> tank      ONLINE       0     0     0
>>   ada0p3  ONLINE       0     0     0
>>
>> Rebooted and again all fine, no issues.
>>
>> I've also got a raidz1 box on the same kernel; it too is fine.
>>
>> =>   34  500118125  ada0  GPT  (238G)
>>      34        128     1  freebsd-boot  (64K)
>>     162  500117997     2  freebsd-zfs  (238G)
>> ...
>>
>> So it seems like there's something odd about your environment, especially
>> given you've had a similar issue before.
>>
>> So the questions:
>> 1. What does zpool get all report?
>> 2. What does /boot/loader.conf have in it?
>> 3. What does zdb -C rdnzltank report?
>> 4. What does /etc/rc.conf have in it?
>>
>> Regards
>> Steve
>
> Here goes:
>
> freebsd10 ~ % zpool get all rdnzltank
> NAME       PROPERTY                       VALUE                   SOURCE
> rdnzltank  size                           688G                    -
> rdnzltank  capacity                       9%                      -
> rdnzltank  altroot                        -                       default
> rdnzltank  health                         ONLINE                  -
> rdnzltank  guid                           5382786142589818227     default
> rdnzltank  version                        -                       default
> rdnzltank  bootfs                         rdnzltank/ROOT/default  local
> rdnzltank  delegation                     on                      default
> rdnzltank  autoreplace                    off                     default
> rdnzltank  cachefile                      -                       default
> rdnzltank  failmode                       wait                    default
> rdnzltank  listsnapshots                  off                     default
> rdnzltank  autoexpand                     off                     default
> rdnzltank  dedupditto                     0                       default
> rdnzltank  dedupratio                     1.00x                   -
> rdnzltank  free                           622G                    -
> rdnzltank  allocated                      66.2G                   -
> rdnzltank  readonly                       off                     -
> rdnzltank  comment                        -                       default
> rdnzltank  expandsize                     0                       -
> rdnzltank  freeing                        0                       default
> rdnzltank  fragmentation                  20%                     -
> rdnzltank  leaked                         0                       default
> rdnzltank  feature@async_destroy          enabled                 local
> rdnzltank  feature@empty_bpobj            active                  local
> rdnzltank  feature@lz4_compress           active                  local
> rdnzltank  feature@multi_vdev_crash_dump  enabled                 local
> rdnzltank  feature@spacemap_histogram     active                  local
> rdnzltank  feature@enabled_txg            active                  local
> rdnzltank  feature@hole_birth             active                  local
> rdnzltank  feature@extensible_dataset     enabled                 local
> rdnzltank  feature@embedded_data          active                  local
> rdnzltank  feature@bookmarks              enabled                 local
> rdnzltank  feature@filesystem_limits      enabled                 local
>
> freebsd10 ~ % cat /boot/loader.conf
>
> kern.geom.label.gptid.enable=0
> hw.usb.no_pf=1
> kern.cam.ada.legacy_aliases=0
> zfs_load="YES"
> vfs.zfs.prefetch_disable=0
> kern.vty=vt
>
> I have already tried without the gptid and legacy_aliases options, no difference. The prefetch_disable was at the default setting 1 when the problem appeared. The hw.usb.no_pf setting shouldn’t have an effect but I can test it once I can reboot the machine again. I’m attaching a second disk at the moment to make a mirror of the pool. The kern.vty setting didn’t make a difference.
>
> The next is now with the second disk being resilvered, gpt/fb10disk2 is the new disk:
>
> MOS Configuration:
>         version: 5000
>         name: 'rdnzltank'
>         state: 0
>         txg: 1634460
>         pool_guid: 5382786142589818227
>         hostid: 852094392
>         hostname: 'freebsd10.rdnzl.info'
>         vdev_children: 1
>         vdev_tree:
>             type: 'root'
>             id: 0
>             guid: 5382786142589818227
>             children[0]:
>                 type: 'mirror'
>                 id: 0
>                 guid: 6268049119730836293
>                 whole_disk: 0
>                 metaslab_array: 34
>                 metaslab_shift: 32
>                 ashift: 9
>                 asize: 741558452224
>                 is_log: 0
>                 create_txg: 4
>                 children[0]:
>                     type: 'disk'
>                     id: 0
>                     guid: 1732695434302750511
>                     path: '/dev/gpt/fb10disk1'
>                     phys_path: '/dev/gpt/fb10disk1'
>                     whole_disk: 1
>                     DTL: 98
>                     create_txg: 4
>                 children[1]:
>                     type: 'disk'
>                     id: 1
>                     guid: 15812067837864729710
>                     path: '/dev/gpt/fb10disk2'
>                     phys_path: '/dev/gpt/fb10disk2'
>                     whole_disk: 1
>                     DTL: 526
>                     create_txg: 4
>                     resilver_txg: 1634424
>         features_for_read:
>             com.delphix:hole_birth
>             com.delphix:embedded_data
>
> I don’t think I have anything in /etc/rc.conf that would have an effect at the time when the kernel tries to mount the root filesystem, but here it is:
>
> hostname="freebsd10.rdnzl.info"
> keymap="fi.kbd"
>
> #cloned_interfaces="lo1"
> #ifconfig_vtnet0="SYNCDHCP"
> ifconfig_re0="inet 10.71.14.12/24"
> #ifconfig_re0_alias0="inet 10.71.14.112/24"
> defaultrouter="10.71.14.1"
> #gateway_enable="YES"
>
> ipv6_activate_all_interfaces="YES"
> #ifconfig_vtnet0_ipv6="accept_rtadv"
> ifconfig_re0_ipv6="inet6 2001:14b8:100:ZZZZ::XXXX/64"
> ipv6_defaultrouter="2001:14b8:100:ZZZZ::1"
> #ipv6_gateway_enable="YES"
>
> #pf_enable="YES"
> #pflog_enable="YES"
> #pflog_flags="-d 10 -s 256"
>
> zfs_enable="YES"
>
> #devfs_load_rulesets=YES
>
> sshd_enable="YES"
> # Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
> dumpdev="AUTO"
>
> clear_tmp_enable="YES"
>
> sendmail_enable="NO"
> sendmail_submit_enable="NO"
> sendmail_outbound_enable="NO"
> sendmail_msp_queue_enable="NO"
>
> rpcbind_enable="YES"
> nfs_server_enable="YES"
> mountd_enable="YES"
>
> #nfsv4_server_enable="YES"
> #nfsuserd_enable="YES"
> #mountd_flags="-r"
>
> ntpd_enable="YES"
> ntpd_sync_on_start="YES"
>
> jail_enable="YES"
> jail_list="buildstable10amd64 buildreleng100i386"
>
> #ntpdate_enable="YES"
> #ntpdate_hosts="10.71.14.1"
>
> nginx_enable="YES"
>
>
> #mdnsresponderposix_enable="YES"
> mdnsresponderposix_flags="-f /usr/local/etc/mDNSResponder.conf"
>
>
> #openntpd_enable="YES"
>
> #avahi_daemon_enable="YES"
> #dbus_enable="YES"
> mdnsd_enable="YES"
>
> smartd_enable="YES"
>
> dma_flushq_enable="YES"
>
> -Kimmo
>
Just a thought. Is my problem related to the use of GPT labeled partitions in my pool configuration? Your testing shows just "raw" devices like ada0p3 etc.
-Kimmo
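
One way to narrow down the GPT label question is to compare the vdev paths recorded in the pool configuration with the device names the kernel offers at the mountroot prompt. The following is only a minimal sketch, not a definitive diagnostic: the function names vdev_paths and missing_vdevs are hypothetical helpers, and it assumes the configuration text has the "path: '/dev/...'" form shown in the zdb -C output above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of FreeBSD): compare the vdev paths in
# `zdb -C` style output against a list of device names, such as the
# "List of GEOM managed disk devices" printed at the mountroot prompt.

# vdev_paths: read zdb -C style text on stdin and print each vdev path
# with the leading /dev/ and the surrounding quotes removed.
# The pattern is anchored so that "phys_path:" lines are ignored.
vdev_paths() {
    sed -n "s|^[[:space:]]*path: '/dev/\(.*\)'\$|\1|p"
}

# missing_vdevs DEVICES: read zdb -C style text on stdin and print every
# vdev path that does not appear in DEVICES (a space-separated list).
missing_vdevs() {
    devices=" $1 "
    vdev_paths | while read -r p; do
        case "$devices" in
            *" $p "*) ;;          # device is visible, nothing to report
            *) echo "$p" ;;       # recorded in the pool, not in the list
        esac
    done
}
```

For example, with the functions loaded into the shell, "zdb -C rdnzltank | missing_vdevs "gpt/fb10disk1 gpt/fb10swap1 ada0p3 ada0p2 ada0p1 ada0"" prints only the paths the pool expects but the list lacks. Since gpt/fb10disk1 does show up in the mountroot device list quoted earlier, a check like this would suggest the label itself is visible to the kernel at that point, though it does not by itself rule out the labels playing a part.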