ixgbe(4) spin lock held too long
Jason Wolfe
nitroboost at gmail.com
Sat Jan 17 19:21:59 UTC 2015
On Mon, Dec 15, 2014 at 9:23 AM, John Baldwin <jhb at freebsd.org> wrote:
> On Wednesday, December 10, 2014 12:47:02 PM Jason Wolfe wrote:
>> John,
>>
>> So apparently the concurrent timer scheduling was not fixed, though it
>> does seem rarer. We had about 2 weeks of stability, then last night
>> we had a crash on a machine with the 4 KTR options listed in the prior
>> email. I'm not sure if this is relevant, but the only request for the
>> tcp lock that was 'blocked' in the ktrdump (38k entries) happened just
>> before the persist timer was armed:
>>
>> 37898 3 23540056055651738 KTRGRAPH group:"thread", id:"swi4: clock
>> tid 100012", state:"spinning", attributes: lockname:"tcp"
>> 34794 0 23540056027080726 KTRGRAPH group:"thread", id:"irq278:
>> ix0:que 0 tid 100059", state:"blocked", attributes: prio:8,
>> wmesg:"(null)", lockname:"tcp"
>
> I was hoping to see when the rexmit timer was scheduled, but it isn't
> mentioned at all in the ktrdump (granted, it only covers ~125 milliseconds).
> The line at 37897 is the only one that mentions a timer for this connection.
> The dump doesn't include when either timer was scheduled. One thing that you
> can tell from the schedgraph (and from above) is that some other thread was
> accessing this connection's TCB at the same time as the persist timer spun for
> 2 microseconds on the lock before it was able to run, so something did frob
> the state just before this ran. However, all the threads that were running at
> the time were either idle or in the process of switching out.
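For context on the 2 microseconds of spinning mentioned above: the TCP timer
handlers all open with the same guard, so a callout that had to wait for the
pcb lock can still bail out if it was rescheduled or stopped in the meantime.
Roughly, paraphrased from the 10.x persist handler in sys/netinet/tcp_timer.c
(the per-handler locking differs a bit, so treat this as a sketch rather than
the exact code):

	inp = tp->t_inpcb;
	INP_WLOCK(inp);
	if (callout_pending(&tp->t_timers->tt_persist) ||
	    !callout_active(&tp->t_timers->tt_persist)) {
		/* Rescheduled or stopped while we spun on the lock. */
		INP_WUNLOCK(inp);
		CURVNET_RESTORE();
		return;
	}
	callout_deactivate(&tp->t_timers->tt_persist);
	/* ...only now does the handler touch the TCB and rearm timers. */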
>
>> 32909 1 23540056008857406 KTRGRAPH group:"thread", id:"swi4: clock
>> tid 100010", state:"spinning", attributes: lockname:"tcp"
>> 29394 4 23540055974204380 KTRGRAPH group:"thread", id:"irq282:
>> ix0:que 4 tid 100067", state:"spinning", attributes: lockname:"tcp"
>> 29390 1 23540055974198218 KTRGRAPH group:"thread", id:"irq279:
>> ix0:que 1 tid 100061", state:"spinning", attributes: lockname:"tcp"
>> 7130 0 23540055791854144 KTRGRAPH group:"thread", id:"irq278:
>> ix0:que 0 tid 100059", state:"spinning", attributes: lockname:"tcp"
>> 5558 5 23540055779687890 KTRGRAPH group:"thread", id:"squid tid
>> 100207", state:"spinning", attributes: lockname:"tcp"
>> 5306 1 23540055776901006 KTRGRAPH group:"thread", id:"irq279:
>> ix0:que 1 tid 100061", state:"spinning", attributes: lockname:"tcp"
>>
>> panic: tcp_setpersist: retransmit pending
>>
>> (kgdb) bt
>> #0 doadump (textdump=1) at pcpu.h:219
>> #1 0xffffffff806fcb11 in kern_reboot (howto=260) at
>> /usr/src/sys/kern/kern_shutdown.c:452
>> #2 0xffffffff806fce74 in panic (fmt=<value optimized out>) at
>> /usr/src/sys/kern/kern_shutdown.c:759
>> #3 0xffffffff8084c853 in tcp_setpersist (tp=<value optimized out>) at
>> /usr/src/sys/netinet/tcp_output.c:1718
>> #4 0xffffffff80854856 in tcp_timer_persist (xtp=0xfffff8003e58e800)
>> at /usr/src/sys/netinet/tcp_timer.c:518
>> #5 0xffffffff8070faeb in softclock_call_cc (c=0xfffff8003e58eac0,
>> cc=0xffffffff8126ce00, direct=0)
>> at /usr/src/sys/kern/kern_timeout.c:682
>> #6 0xffffffff8070feb4 in softclock (arg=<value optimized out>) at
>> /usr/src/sys/kern/kern_timeout.c:810
>>
>> Jason
>>
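For anyone who lands here via the panic string: it comes from a sanity check
at the top of tcp_setpersist(), which refuses to (re)arm the persist timer
while a retransmit timer is still pending. Paraphrased from 10.x
sys/netinet/tcp_output.c (details may differ slightly from the exact revision
in the backtrace):

void
tcp_setpersist(struct tcpcb *tp)
{
	int t = ((tp->t_srtt >> 2) + tp->t_rttvar) >> 1;
	int tt;

	tp->t_flags &= ~TF_PREVVALID;
	if (tcp_timer_active(tp, TT_REXMT))
		panic("tcp_setpersist: retransmit pending");
	/* Start/restart the persist timer with exponential backoff. */
	TCPT_RANGESET(tt, t * tcp_backoff[tp->t_rxtshift],
	    TCPTV_PERSMIN, TCPTV_PERSMAX);
	tcp_timer_activate(tp, TT_PERSIST, tt);
	if (tp->t_rxtshift < TCP_MAXRXTSHIFT)
		tp->t_rxtshift++;
}

In other words, the panic fires when a retransmit timer gets armed on the
connection between the persist callout being scheduled and it actually
running, which is exactly the concurrent timer scheduling discussed above.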
>> On Tue, Dec 2, 2014 at 1:18 PM, Jason Wolfe <nitroboost at gmail.com> wrote:
>> > So we've resolved some of our other issues, including the concurrent
>> > retrans/persist timer scheduling we saw with
>> > net.inet.tcp.per_cpu_timers on. Seems it may have been fixed by
>> > disabling flowdirector in ixgbe as Adrian mentioned. Running on
>> > 10.1-STABLE from 11/20 (kern.osreldate: 1001502), we are still seeing
>> > the spinlock held too long panic though. I have a kernel compiled
>> > with these options, and we hit it:
>> >
>> > options KTR
>> > options KTR_COMPILE=(KTR_CALLOUT|KTR_SCHED)
>> > options KTR_MASK=(KTR_CALLOUT|KTR_SCHED)
>> > options KTR_ENTRIES=131072
>> >
>> > I had to grab the last 30000 to see actions other than the statclock
>> > entries on the ix queues; if you look at the last 2500 lines or so you
>> > can see the last callouts happening.
>
> I did look at the
> ktrdump via schedgraph and it is definitely true that several threads are all
> blocked on the callout lock:
>
> 62185 5 19987364517280198 KTRGRAPH group:"thread", id:"pagedaemon tid
> 100096", state:"spinning", attributes: lockname:"callout"
> 62184 2 19987364505968920 KTRGRAPH group:"thread", id:"irq280: ix0:que 2
> tid 100063", state:"spinning", attributes: lockname:"callout"
> 62178 4 19987364504400868 KTRGRAPH group:"thread", id:"irq282: ix0:que 4
> tid 100067", state:"spinning", attributes: lockname:"callout"
> 62172 1 19987364504141702 KTRGRAPH group:"thread", id:"irq279: ix0:que 1
> tid 100061", state:"spinning", attributes: lockname:"callout"
> 62166 3 19987364503806636 KTRGRAPH group:"thread", id:"irq281: ix0:que 3
> tid 100065", state:"spinning", attributes: lockname:"callout"
>
> In fact, the only CPU not spinning on this lock is CPU 0. According to
> ktrdump, it was running the idle thread. That said, the ktrdump also claims
> that these threads started running callouts (ones that time out sleeps
> like msleep(..., hz)), which is odd because it didn't log a "running" event
> when it got the lock:
>
> 65919 1 19987382901783634 callout 0xffffffff81253030 finished
> 65917 1 19987382901782106 KTRGRAPH group:"thread", id:"irq279: ix0:que 1
> tid 100061", point:"wokeup", attributes: linkedto:"swapper tid 100000"
> 65915 1 19987382901775874 callout 0xffffffff81253030 func
> 0xffffffff8073f9e0 arg 0xffffffff81252ca0
> 64142 1 19987373824962362 callout 0xfffff801e5394850 finished
> 64140 1 19987373824961286 KTRGRAPH group:"thread", id:"irq279: ix0:que 1
> tid 100061", point:"wokeup", attributes: linkedto:"sendmail tid 100167"
> 64138 1 19987373824960070 callout 0xfffff801e5394850 func
> 0xffffffff8073f9e0 arg 0xfffff801e53944c0
> 62721 4 19987366581196864 callout 0xfffff80022426850 finished
> 62719 4 19987366581189888 KTRGRAPH group:"thread", id:"irq282: ix0:que 4
> tid 100067", point:"wokeup", attributes: linkedto:"named tid 100158"
> 62717 4 19987366581189192 callout 0xfffff80022426850 func
> 0xffffffff8073f9e0 arg 0xfffff800224264c0
> 62680 2 19987366404396696 callout 0xfffff800220add10 finished
> ...
>
> So these aren't TCP timers, but td_slpcallout timers. Those are already
> scheduled on multiple CPUs. Maybe we can try moving those all to zero
> instead:
>
> Index: subr_sleepqueue.c
> ===================================================================
> --- subr_sleepqueue.c (revision 275806)
> +++ subr_sleepqueue.c (working copy)
> @@ -382,8 +382,8 @@ sleepq_set_timeout_sbt(void *wchan, sbintime_t sbt
> MPASS(TD_ON_SLEEPQ(td));
> MPASS(td->td_sleepqueue == NULL);
> MPASS(wchan != NULL);
> - callout_reset_sbt_on(&td->td_slpcallout, sbt, pr,
> - sleepq_timeout, td, PCPU_GET(cpuid), flags | C_DIRECT_EXEC);
> + callout_reset_sbt(&td->td_slpcallout, sbt, pr,
> + sleepq_timeout, td, flags | C_DIRECT_EXEC);
> }
>
> /*
>
>
> --
> John Baldwin
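To make the td_slpcallout connection concrete: any bounded sleep in the
kernel arms that per-thread callout through sleepq_set_timeout_sbt(), so the
callouts John spotted in the ktrdump are ordinary sleep timeouts rather than
TCP timers. A minimal, hypothetical example of the kind of call that
exercises this path (all names below are made up for illustration):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/priority.h>

static struct mtx example_mtx;
static int example_flag;

/*
 * Hypothetical bounded sleep of at most one second.  The hz timeout is
 * what reaches sleepq_set_timeout_sbt() and arms td_slpcallout; in the
 * stock kernel that callout is scheduled on the sleeping thread's own
 * CPU (PCPU_GET(cpuid)), which is what the patch above changes.
 */
static int
example_wait(void)
{
	int error;

	mtx_lock(&example_mtx);
	error = msleep(&example_flag, &example_mtx, PRIBIO, "exwait", hz);
	mtx_unlock(&example_mtx);
	return (error);
}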
Just to round this thread out for anyone that comes around hunting for
similar problems: it was found to be an issue in the td_slpcallout
timers. HPS later came to the same conclusion based on some issues
he was having, and produced a patch that went into -CURRENT a few days
ago:
https://svnweb.freebsd.org/base?view=revision&revision=277213
Thanks to both for the time/work.
Jason