Re: iflib_timer() vs ixl_admin_timer() race
Date: Wed, 12 Jan 2022 17:16:19 UTC
Thanks, Krzystof, Grepping now for iflib_admin_intr_deferred() through the sources I see the same issue in other Intel NIC drivers, plus bnxt(4). So the main controversy I see is this: either admin intr should not stop and restart the callouts (and then question is why it does that now), or if it should be so heavy-weight, it should not be called so regularly (and then question why so many drivers do it). In few drivers I've found it even more interesting: iflib_timer() calls IFDI_TIMER(), which calls ixgbe_if_timer(), which calls iflib_admin_intr_deferred(), which in its turn restart the iflib_timer(). Ouroboros. ;) On 12.01.2022 11:46, Galazka, Krzysztof wrote: > Hi Alexander, > > Thank you for pointing this out. ixl_admin_timer is used by a callout so > I don’t think we can acquire any locks there. IIRC it was added to let > tools for NVM update interact with FW while interface is down, so maybe > stopping it when interface goes UP would be an option. Let me think this > through. > > Thanks, > > Krzysiek > > *From:* Eric Joyner <erj@freebsd.org> > *Sent:* Wednesday, January 12, 2022 8:22 AM > *To:* Alexander Motin <mav@freebsd.org> > *Cc:* Joyner, Eric <eric.joyner@intel.com>; Galazka, Krzysztof > <krzysztof.galazka@intel.com> > *Subject:* Re: iflib_timer() vs ixl_admin_timer() race > > Well, I think the situation is more-or-less as you say -- it's just that > the intent of ixl_admin_timer() is supposed to have the adapter's admin > queue processed constantly, regardless of interrupt status or down/up > status. I think as a shorthand we just called _task_fn_admin because it > handles a bunch of other things as well as getting down to calling > ixl_if_update_admin_status() which does the admin queue processing. But > as you found, I don't think it was originally written to be called > periodically on a regular basis like iflib_timer(), so the callouts are > interacting badly. > > My first thought would be to replace the call to > iflib_admin_intr_deferred() in ixl_admin_timer() with > ixl_if_update_admin_status() while taking the CTX_LOCK(). I'm not sure > everything else in _task_fn_admin() needs to run periodically like that > in the driver. That would avoid the callouts getting stopped every 500 > ticks. > > I CC'd my coworker Krzysztof who currently owns the driver; he might > have better thoughts on this. > > - Eric > > On Tue, Jan 11, 2022 at 10:46 AM Alexander Motin <mav@freebsd.org > <mailto:mav@freebsd.org>> wrote: > > Yes. I've reverted my patch for now to not break anything, but all > this > situation does not look right for me too, so I'd appreciate your look. > > On 11.01.2022 12:21, Eric Joyner wrote: > > Hi, > > > > I came back from vacation yesterday, but I'll look into this for you > > today. It's not a good situation when changing the period of the > iflib > > timer breaks something in the driver... > > > > - Eric > > > > On Sun, Jan 9, 2022 at 8:19 PM Alexander Motin <mav@freebsd.org > <mailto:mav@freebsd.org> > > <mailto:mav@freebsd.org <mailto:mav@freebsd.org>>> wrote: > > > > Hi Eric, > > > > In 90bc1cf65778 I've done what looked like a trivial > optimization. But > > just after committing it I've noticed weird race it created > between > > iflib_timer() and ixl_admin_timer() timers. I see ixl(4) calls > > iflib_admin_intr_deferred() every 0.5s, which calls > _task_fn_admin(), > > which every time stops and restart txq->ift_timer callout (AKA > > iflib_timer()), which actually has exactly the same period of > 0.5s. It > > seems before my change iflib_timer() managed to run once for > all TX > > queues before being stopped, but after the change due to > additional > > jitter many of callouts are getting stopped before firing. > > > > Could you please sched some light for me on what is going on > there? > > That race between two callouts looks like potential bug to > me, working > > by some coinncedence, especially considering iflib_timer() > period is > > tunable. > > > > Thanks in advance. > > > > -- > > Alexander Motin > > > > -- > Alexander Motin > > ------------------------------------------------------------------------ > *Intel Technology Poland sp. z o.o. > * ul. Słowackiego 173 | 80-298 Gdańsk | Sąd Rejonowy Gdańsk Północ | VII > Wydział Gospodarczy Krajowego Rejestru Sądowego - KRS 101882 | NIP > 957-07-52-316 | Kapitał zakładowy 200.000 PLN. > > Ta wiadomość wraz z załącznikami jest przeznaczona dla określonego > adresata i może zawierać informacje poufne. W razie przypadkowego > otrzymania tej wiadomości, prosimy o powiadomienie nadawcy oraz trwałe > jej usunięcie; jakiekolwiek przeglądanie lub rozpowszechnianie jest > zabronione. > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). If you are not the intended > recipient, please contact the sender and delete all copies; any review > or distribution by others is strictly prohibited. > -- Alexander Motin