Re: git: aa52c6bdd7b1 - main - newbus: Create a knob to disable devices that fail to attach.

From: Warner Losh <imp_at_bsdimp.com>
Date: Mon, 05 Dec 2022 21:01:14 UTC
I think I'll just revert the default change... but in a few hours when I'm
back home.

Warner

On Mon, Dec 5, 2022, 12:51 PM Cy Schubert <Cy.Schubert@cschubert.com> wrote:

> In message <202212042333.2B4NX1JM089126@gitrepo.freebsd.org>, Warner Losh
> write
> s:
> > The branch main has been updated by imp:
> >
> > URL:
> https://cgit.FreeBSD.org/src/commit/?id=aa52c6bdd7b157b7c5d9612e4feac5c9
> > 4f376df6
> >
> > commit aa52c6bdd7b157b7c5d9612e4feac5c94f376df6
> > Author:     Warner Losh <imp@FreeBSD.org>
> > AuthorDate: 2022-12-04 23:20:24 +0000
> > Commit:     Warner Losh <imp@FreeBSD.org>
> > CommitDate: 2022-12-04 23:20:24 +0000
> >
> >     newbus: Create a knob to disable devices that fail to attach.
> >
> >     Normally, when a device fails to attach, we tear down the newbus
> state
> >     for that device so that future driver loads can try again (maybe
> with a
> >     different driver, or maybe with a re-loaded and fixed kld).
> >
> >     Sometimes, however, it is desirable to have the device fail
> >     permanantly. We do this by calling device_disable() on a failed
> >     attached, as well as keeping the device in DS_ATTACHING forever. This
> >     prevents retries on that device. This is enabled via
> >     hw.bus.disable_failed_devices=1 in either a hint via the loader, or
> at
> >     runtime with a sysctl setting. Setting from 1 -> 0 at runtime will
> not
> >     affect previously disabled devices, however: they remain disabled.
> >     They can be re-enabled manually with devctl enable, however.
> >
> >     Sponsored by:           Netflix
> >
> >     Reviewed by:    gallatin, hselasky, jhb
> >     Differential Revision:  https://reviews.freebsd.org/D37517
> > ---
> >  sys/kern/subr_bus.c | 33 +++++++++++++++++++++++++++------
> >  1 file changed, 27 insertions(+), 6 deletions(-)
> >
> > diff --git a/sys/kern/subr_bus.c b/sys/kern/subr_bus.c
> > index 739484ad849a..b9615b033007 100644
> > --- a/sys/kern/subr_bus.c
> > +++ b/sys/kern/subr_bus.c
> > @@ -69,6 +69,10 @@ SYSCTL_NODE(_hw, OID_AUTO, bus, CTLFLAG_RW |
> CTLFLAG_MPSAF
> > E, NULL,
> >  SYSCTL_ROOT_NODE(OID_AUTO, dev, CTLFLAG_RW | CTLFLAG_MPSAFE, NULL,
> >      NULL);
> >
> > +static bool disable_failed_devs = false;
> > +SYSCTL_BOOL(_hw_bus, OID_AUTO, disable_failed_devices, CTLFLAG_RWTUN,
> &disab
> > le_failed_devs,
> > +    0, "Do not retry attaching devices that return an error from
> DEVICE_ATTA
> > CH the first time");
> > +
> >  /*
> >   * Used to attach drivers to devclasses.
> >   */
> > @@ -2533,12 +2537,29 @@ device_attach(device_t dev)
> >       if ((error = DEVICE_ATTACH(dev)) != 0) {
> >               printf("device_attach: %s%d attach returned %d\n",
> >                   dev->driver->name, dev->unit, error);
> > -             if (!(dev->flags & DF_FIXEDCLASS))
> > -                     devclass_delete_device(dev->devclass, dev);
> > -             (void)device_set_driver(dev, NULL);
> > -             device_sysctl_fini(dev);
> > -             KASSERT(dev->busy == 0, ("attach failed but busy"));
> > -             dev->state = DS_NOTPRESENT;
> > +             if (disable_failed_devs) {
> > +                     /*
> > +                      * When the user has asked to disable failed
> devices, w
> > e
> > +                      * directly disable the device, but leave it in the
> > +                      * attaching state. It will not try to
> probe/attach the
> > +                      * device further. This leaves the device numbering
> > +                      * intact for other similar devices in the system.
> It
> > +                      * can be removed from this state with devctl.
> > +                      */
> > +                     device_disable(dev);
> > +             } else {
> > +                     /*
> > +                      * Otherwise, when attach fails, tear down the
> state
> > +                      * around that so we can retry when, for example,
> new
> > +                      * drivers are loaded.
> > +                      */
> > +                     if (!(dev->flags & DF_FIXEDCLASS))
> > +                             devclass_delete_device(dev->devclass, dev);
> > +                     (void)device_set_driver(dev, NULL);
> > +                     device_sysctl_fini(dev);
> > +                     KASSERT(dev->busy == 0, ("attach failed but
> busy"));
> > +                     dev->state = DS_NOTPRESENT;
> > +             }
> >               return (error);
> >       }
> >       dev->flags |= DF_ATTACHED_ONCE;
> >
>
> When switching a KVM console from a FreeBSD machine to another machine
> attached to the KVM the FreeBSD machine will panic. Photo attached.
>
> This probably needs an UPDATING entry explaining that
> hw.bus.disable_failed_devices sysctl/kenv must be zero (0) to avoid the
> panic.
>
> Sorry for taking so long to report this. Had an appointment this morning.
>
>
> Cheers,
> Cy Schubert <Cy.Schubert@cschubert.com>
> FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
> NTP:           <cy@nwtime.org>    Web:  https://nwtime.org
>
>                         e^(i*pi)+1=0
>