Re: Workaround for a FreeBSD-13.1-STABLE-arm64-aarch64-RPI-20220715-831c6b8edda-251792.img (and more) USB3 boot failure on 8 GiByte RPi4B Rev 1.4, B0T SOC [inaccurate timeouts]
Date: Sat, 16 Jul 2022 18:19:06 UTC
[Less invasive workaround, plus an explanation of what may be going on
to keep (some?) timeouts from being as long as intended, in this case
one for USB3 activity.]

On 2022-Jul-15, at 23:14, Mark Millard <marklmi@yahoo.com> wrote:

> I've reported the following boot problem to the lists
> before but for older stable/13 and releng/13.1 versions.
> I thought it had been fixed but it turns out something
> else I had done hid the problem.
>
> Both before and now, it turns out to fail or not based
> on using the original config.txt vs. using one with
> at least one specific line added that for some reason
> avoids the problem. (I'm not blaming RPi* firmware.)
>
> The failure looks like:
>
> . . .
> Release APs...done
> Trying to mount root from ufs:/dev/ufs/rootfs [rw]...
> uhub0: 5 ports with 4 removable, self powered
> ugen0.2: <vendor 0x2109 USB2.0 Hub> at usbus0
> uhub1 on uhub0
> uhub1: <vendor 0x2109 USB2.0 Hub, class 9/0, rev 2.10/4.21, addr 1> on usbus0
> Root mount waiting for: usbus0
> uhub1: 4 ports with 4 removable, self powered
> Root mount waiting for: usbus0
> uhub_reattach_port: port 2 reset failed, error=USB_ERR_TIMEOUT
> uhub_reattach_port: device problem (USB_ERR_TIMEOUT), disabling port 2
> mountroot: waiting for device /dev/ufs/rootfs...
> Mounting from ufs:/dev/ufs/rootfs failed with error 19.
>
> Loader variables:
>   vfs.root.mountfrom=ufs:/dev/ufs/rootfs
>   vfs.root.mountfrom.options=rw
>
> Manual root filesystem specification:
>   <fstype>:<device> [options]
>       Mount <device> using filesystem <fstype>
>       and with the specified (optional) option list.
>
>     eg. ufs:/dev/da0s1a
>         zfs:zroot/ROOT/default
>         cd9660:/dev/cd0 ro
>           (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)
>
>   ?               List valid disk boot devices
>   .               Yield 1 second (for background tasks)
>   <empty line>    Abort manual input
>
> mountroot>
>
> Both types of USB3 SSD boot media that I use get the problem.
> One type I've used for many years and another for over
> a year.
> Both USB3 ports lead to failure.
>
> Similarly, more than one U-Boot version makes no difference
> to the observed failure.
>
> The workaround I've found is as shown below:
>
> root@generic:~ # diff -u /boot/msdos/config.txt.orig /boot/msdos/config.txt
> --- /boot/msdos/config.txt.orig 2022-07-15 02:43:02.000000000 +0000
> +++ /boot/msdos/config.txt 2022-07-15 04:39:30.000000000 +0000
> @@ -9,3 +9,8 @@
>  [pi4]
>  hdmi_safe=1
>  armstub=armstub8-gic.bin
> +#
> +# Local addition that avoids USB3 SSD boot failures that look like:
> +# uhub_reattach_port: port ? reset failed, error=USB_ERR_TIMEOUT
> +# uhub_reattach_port: device problem (USB_ERR_TIMEOUT), disabling port ?
> +force_turbo=1

Following the implications in:

https://forums.raspberrypi.com/viewtopic.php?f=29&t=6201&start=425#p180099

QUOTE
. . . is done early in boot, before the cpufreq driver is loaded, and so
measures the stock (700MHz) frequency. Now this is value is used
elsewhere in linux when calibrating delays. E.g. the sdcard driver might
call udelay(1) and expect to get a delay of at least one microsecond.
However if the ARM is now at 1GHz, it will only get a 0.7 microseconds,
which could cause a problem.

To help investigate this, I've add a config.txt parameter,
initial_turbo, which means turbo will be enabled from boot for this many
seconds (up to 60), or until cpufreq driver sets a frequency.
END QUOTE

I've switched to trying:

# diff -u /boot/msdos/config.txt.orig /boot/msdos/config.txt
--- /boot/msdos/config.txt.orig 2022-07-15 02:43:02.000000000 +0000
+++ /boot/msdos/config.txt 2022-07-15 05:10:06.000000000 +0000
@@ -9,3 +9,8 @@
 [pi4]
 hdmi_safe=1
 armstub=armstub8-gic.bin
+#
+# Local addition that avoids USB3 SSD boot failures that look like:
+# uhub_reattach_port: port ? reset failed, error=USB_ERR_TIMEOUT
+# uhub_reattach_port: device problem (USB_ERR_TIMEOUT), disabling port ?
+initial_turbo=60

and this was enough to avoid the problems.
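To make the arithmetic in the quoted forum text concrete, here is a
minimal sketch of how a busy-wait delay shrinks when the loop count was
calibrated at one clock rate but the CPU later runs faster. It is only an
illustration: the function name and the simple linear model are my
assumptions, not code from FreeBSD or Linux, and the 250 ms timeout is a
made-up value rather than one taken from the USB stack.

```python
def effective_delay_us(requested_us, calibrated_hz, actual_hz):
    """Wall-clock time (in microseconds) that a cycle-counting delay lasts.

    Hypothetical model: the delay loop spins for
    requested_us * calibrated_hz / 1e6 cycles, and those cycles are then
    executed at actual_hz instead of calibrated_hz.
    """
    if calibrated_hz <= 0 or actual_hz <= 0:
        raise ValueError("clock rates must be positive")
    return requested_us * calibrated_hz / actual_hz


if __name__ == "__main__":
    # Forum example: calibrated at 700 MHz, later running at 1 GHz.
    print(effective_delay_us(1.0, 700e6, 1000e6))
    # Made-up 250 ms timeout under the same clock change.
    print(effective_delay_us(250_000.0, 700e6, 1000e6) / 1000.0, "ms")
```

If the calibrated and actual rates match (which is what force_turbo=1 or
a long enough initial_turbo would arrange under this guess), the function
returns the requested delay unchanged.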
(I've not tried to find a near minimum for initial_turbo; I just used
the maximum.)

It looks like stable/13 and releng/13.1 are sensitive to scaling time
based on early clock rates that later change to faster rates, making
timeouts happen in the wrong time frame. At least that is my guess at
this point: I have not directly verified a time scale difference in the
involved code. I've not seen evidence of this for main [so: 14].

I've no clue if it is better for FreeBSD to add initial_turbo=60 to its
config.txt vs. doing something to avoid the sensitivity to initial and
later varying clock rates. Covering releng/13.1 without an EN (or
whatever it is called) would happen via initial_turbo=60 use in the
port's config.txt .

I do not know if main [so: 14] has different timeout handling that
might explain why I've not found the problem in that context.

> I do not claim that is the only possibility, just that it
> has been sufficient in my context.

Turned out initial_turbo=60 was another workaround.

> As my normal configuration uses "force_turbo=1", I would only
> have ever noticed the issue if I'd tried to boot before making
> the config.txt changes I normally have in place. Thus, the
> issue might have been around for a notable time without my
> noticing.
>
>
> I've tried my U-Boot based main [so: 14] boot media without
> the "force_turbo=1". It booted just fine.
>

===
Mark Millard
marklmi at yahoo.com