Re: iflib_timer() vs ixl_admin_timer() race
- In reply to: Alexander Motin : "Re: iflib_timer() vs ixl_admin_timer() race"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Thu, 13 Jan 2022 00:47:44 UTC
On Wed, Jan 12, 2022 at 10:18 AM Alexander Motin <mav@freebsd.org> wrote: > Thanks, Krzystof, > > Grepping now for iflib_admin_intr_deferred() through the sources I see > the same issue in other Intel NIC drivers, plus bnxt(4). So the main > controversy I see is this: either admin intr should not stop and restart > the callouts (and then question is why it does that now), or if it > should be so heavy-weight, it should not be called so regularly (and > then question why so many drivers do it). > > In few drivers I've found it even more interesting: iflib_timer() calls > IFDI_TIMER(), which calls ixgbe_if_timer(), which calls > iflib_admin_intr_deferred(), which in its turn restart the > iflib_timer(). Ouroboros. ;) > From memory, I believe that call may be related to NICs without dedicated admin intrs. Please keep them in mind when you clean this up. > On 12.01.2022 11:46, Galazka, Krzysztof wrote: > > Hi Alexander, > > > > Thank you for pointing this out. ixl_admin_timer is used by a callout so > > I don’t think we can acquire any locks there. IIRC it was added to let > > tools for NVM update interact with FW while interface is down, so maybe > > stopping it when interface goes UP would be an option. Let me think this > > through. > > > > Thanks, > > > > Krzysiek > > > > *From:* Eric Joyner <erj@freebsd.org> > > *Sent:* Wednesday, January 12, 2022 8:22 AM > > *To:* Alexander Motin <mav@freebsd.org> > > *Cc:* Joyner, Eric <eric.joyner@intel.com>; Galazka, Krzysztof > > <krzysztof.galazka@intel.com> > > *Subject:* Re: iflib_timer() vs ixl_admin_timer() race > > > > Well, I think the situation is more-or-less as you say -- it's just that > > the intent of ixl_admin_timer() is supposed to have the adapter's admin > > queue processed constantly, regardless of interrupt status or down/up > > status. I think as a shorthand we just called _task_fn_admin because it > > handles a bunch of other things as well as getting down to calling > > ixl_if_update_admin_status() which does the admin queue processing. But > > as you found, I don't think it was originally written to be called > > periodically on a regular basis like iflib_timer(), so the callouts are > > interacting badly. > > > > My first thought would be to replace the call to > > iflib_admin_intr_deferred() in ixl_admin_timer() with > > ixl_if_update_admin_status() while taking the CTX_LOCK(). I'm not sure > > everything else in _task_fn_admin() needs to run periodically like that > > in the driver. That would avoid the callouts getting stopped every 500 > > ticks. > > > > I CC'd my coworker Krzysztof who currently owns the driver; he might > > have better thoughts on this. > > > > - Eric > > > > On Tue, Jan 11, 2022 at 10:46 AM Alexander Motin <mav@freebsd.org > > <mailto:mav@freebsd.org>> wrote: > > > > Yes. I've reverted my patch for now to not break anything, but all > > this > > situation does not look right for me too, so I'd appreciate your > look. > > > > On 11.01.2022 12:21, Eric Joyner wrote: > > > Hi, > > > > > > I came back from vacation yesterday, but I'll look into this for > you > > > today. It's not a good situation when changing the period of the > > iflib > > > timer breaks something in the driver... > > > > > > - Eric > > > > > > On Sun, Jan 9, 2022 at 8:19 PM Alexander Motin <mav@freebsd.org > > <mailto:mav@freebsd.org> > > > <mailto:mav@freebsd.org <mailto:mav@freebsd.org>>> wrote: > > > > > > Hi Eric, > > > > > > In 90bc1cf65778 I've done what looked like a trivial > > optimization. But > > > just after committing it I've noticed weird race it created > > between > > > iflib_timer() and ixl_admin_timer() timers. I see ixl(4) > calls > > > iflib_admin_intr_deferred() every 0.5s, which calls > > _task_fn_admin(), > > > which every time stops and restart txq->ift_timer callout (AKA > > > iflib_timer()), which actually has exactly the same period of > > 0.5s. It > > > seems before my change iflib_timer() managed to run once for > > all TX > > > queues before being stopped, but after the change due to > > additional > > > jitter many of callouts are getting stopped before firing. > > > > > > Could you please sched some light for me on what is going on > > there? > > > That race between two callouts looks like potential bug to > > me, working > > > by some coinncedence, especially considering iflib_timer() > > period is > > > tunable. > > > > > > Thanks in advance. > > > > > > -- > > > Alexander Motin > > > > > > > -- > > Alexander Motin > > > > ------------------------------------------------------------------------ > > *Intel Technology Poland sp. z o.o. > > * ul. Słowackiego 173 | 80-298 Gdańsk | Sąd Rejonowy Gdańsk Północ | VII > > Wydział Gospodarczy Krajowego Rejestru Sądowego - KRS 101882 | NIP > > 957-07-52-316 | Kapitał zakładowy 200.000 PLN. > > > > Ta wiadomość wraz z załącznikami jest przeznaczona dla określonego > > adresata i może zawierać informacje poufne. W razie przypadkowego > > otrzymania tej wiadomości, prosimy o powiadomienie nadawcy oraz trwałe > > jej usunięcie; jakiekolwiek przeglądanie lub rozpowszechnianie jest > > zabronione. > > This e-mail and any attachments may contain confidential material for > > the sole use of the intended recipient(s). If you are not the intended > > recipient, please contact the sender and delete all copies; any review > > or distribution by others is strictly prohibited. > > > > -- > Alexander Motin > >