ixgbe(4) spin lock held too long
John Baldwin
jhb at freebsd.org
Fri Oct 24 18:47:01 UTC 2014
On Thursday, October 23, 2014 02:12:44 PM Jason Wolfe wrote:
> On Sat, Oct 18, 2014 at 4:42 AM, John Baldwin <jhb at freebsd.org> wrote:
> > On Friday, October 17, 2014 11:32:13 PM Jason Wolfe wrote:
> >> Producing 10G of random traffic against a server with this assertion
> >> added took about 2 hours to panic, so if it turns out we need anything
> >> further it should be pretty quick.
> >>
> >> #4 list
> >> 2816 * timer and remember to restart (more output or persist).
> >> 2817 * If there is more data to be acked, restart retransmit
> >> 2818 * timer, using current (possibly backed-off) value.
> >> 2819 */
> >> 2820 if (th->th_ack == tp->snd_max) {
> >> 2821 tcp_timer_activate(tp, TT_REXMT, 0);
> >> 2822 needoutput = 1;
> >> 2823 } else if (!tcp_timer_active(tp, TT_PERSIST))
> >> 2824 tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur);
> >
> > Bah, this is just a bug in my assertion. Rather than having a separate
> > tcp_timer_deactivate() routine, a delta of 0 passed to
> > tcp_timer_activate()
> > means "stop the timer". My assertions were incorrect and need to exclude
> > the stop case. Here is an updated patch (or you can just fix yours
> > locally):
> >
> > Index: tcp_timer.c
> > ===================================================================
> > --- tcp_timer.c (revision 273219)
> > +++ tcp_timer.c (working copy)
> > @@ -869,10 +869,16 @@ tcp_timer_activate(struct tcpcb *tp, int timer_typ
> >
> > case TT_REXMT:
> > t_callout = &tp->t_timers->tt_rexmt;
> > f_callout = tcp_timer_rexmt;
> >
> > + if (callout_active(&tp->t_timers->tt_persist) &&
> > + delta != 0)
> > + panic("scheduling retransmit with persist active");
> > break;
> >
> > case TT_PERSIST:
> > t_callout = &tp->t_timers->tt_persist;
> > f_callout = tcp_timer_persist;
> >
> > + if (callout_active(&tp->t_timers->tt_rexmt) &&
> > + delta != 0)
> > + panic("scheduling persist with retransmit active");
> > break;
> >
> > case TT_KEEP:
> > t_callout = &tp->t_timers->tt_keep;
> >
> > --
> > John Baldwin
>
> John,
>
> panic: tcp_setpersist: retransmit pending
>
> (kgdb) bt
> #0 doadump (textdump=1) at pcpu.h:219
> #1 0xffffffff806facb1 in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:452
> #2 0xffffffff806fb014 in panic (fmt=<value optimized out>) at
> /usr/src/sys/kern/kern_shutdown.c:759
> #3 0xffffffff808467d3 in tcp_setpersist (tp=<value optimized out>) at
> /usr/src/sys/netinet/tcp_output.c:1619
> #4 0xffffffff8084e7b6 in tcp_timer_persist (xtp=0xfffff804ec124c00)
> at /usr/src/sys/netinet/tcp_timer.c:467
> #5 0xffffffff8070d95e in softclock_call_cc (c=0xfffff804ec124ec0,
> cc=0xffffffff81263380, direct=0)
> at /usr/src/sys/kern/kern_timeout.c:687
> #6 0xffffffff8070dce4 in softclock (arg=<value optimized out>) at
> /usr/src/sys/kern/kern_timeout.c:816
> #7 0xffffffff806d16f3 in intr_event_execute_handlers (p=<value
> optimized out>, ie=0xfffff80015214400)
> at /usr/src/sys/kern/kern_intr.c:1263
> #8 0xffffffff806d2056 in ithread_loop (arg=0xfffff800151f7ee0) at
> /usr/src/sys/kern/kern_intr.c:1276
> #9 0xffffffff806cf481 in fork_exit (callout=0xffffffff806d1fc0
> <ithread_loop>, arg=0xfffff800151f7ee0,
> frame=0xfffffe1f9e9b0ac0) at /usr/src/sys/kern/kern_fork.c:996
> #10 0xffffffff80a67c0e in fork_trampoline () at
> /usr/src/sys/amd64/amd64/exception.S:606
>
> (kgdb) frame 3
> #3 0xffffffff808467d3 in tcp_setpersist (tp=<value optimized out>) at
> /usr/src/sys/netinet/tcp_output.c:1619
> 1619 panic("tcp_setpersist: retransmit pending");
> (kgdb) list
> 1614 int t = ((tp->t_srtt >> 2) + tp->t_rttvar) >> 1;
> 1615 int tt;
> 1616
> 1617 tp->t_flags &= ~TF_PREVVALID;
> 1618 if (tcp_timer_active(tp, TT_REXMT))
> 1619 panic("tcp_setpersist: retransmit pending");
> 1620 /*
> 1621 * Start/restart persistance timer.
> 1622 */
> 1623 TCPT_RANGESET(tt, t * tcp_backoff[tp->t_rxtshift],
>
> (kgdb) up
> #4 0xffffffff8084e7b6 in tcp_timer_persist (xtp=0xfffff804ec124c00)
> at /usr/src/sys/netinet/tcp_timer.c:467
> 467 tcp_setpersist(tp);
> (kgdb) list
> 462 (ticks - tp->t_rcvtime) >= TCPTV_PERSMAX) {
> 463 TCPSTAT_INC(tcps_persistdrop);
> 464 tp = tcp_drop(tp, ETIMEDOUT);
> 465 goto out;
> 466 }
> 467 tcp_setpersist(tp);
> 468 tp->t_flags |= TF_FORCEDATA;
> 469 (void) tcp_output(tp);
> 470 tp->t_flags &= ~TF_FORCEDATA;
>
> Jason
Weird, this is the same panic as before. The new assertions should have
fired when either one of the timers was scheduled, before this point was
reached. Can you get a stack trace from the other threads? Perhaps the
timers are being scheduled concurrently?
Can you also 'set print pretty' and 'p *tp'?
--
John Baldwin
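
For readers following the thread, here is a rough user-space sketch of the
invariant the updated patch checks. It is only an illustration, not the
kernel code: struct model_timers and model_timer_activate() are hypothetical
names, the two booleans stand in for callout_active() on tt_rexmt and
tt_persist, and a false return marks the spot where the patched
tcp_timer_activate() would panic. The one behaviour taken from the thread is
that a delta of 0 means "stop the timer" and must be exempt from the check.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's timer types (illustration only). */
enum timer_type { TT_REXMT, TT_PERSIST };

/* Models callout_active() for the two timers as plain flags. */
struct model_timers {
	bool rexmt_active;
	bool persist_active;
};

/*
 * Model of the patched assertion logic.
 * Returns true if the request is allowed, false where the patched
 * kernel code would call panic().
 */
static bool
model_timer_activate(struct model_timers *t, enum timer_type type, int delta)
{
	bool *self = (type == TT_REXMT) ? &t->rexmt_active : &t->persist_active;
	bool other = (type == TT_REXMT) ? t->persist_active : t->rexmt_active;

	if (delta == 0) {
		/* A delta of 0 stops the timer; this is always legal. */
		*self = false;
		return (true);
	}
	if (other) {
		/* Scheduling one timer while the other is active. */
		return (false);
	}
	*self = true;
	return (true);
}
```

In this model, stopping the retransmit timer while persist is active (the
tcp_timer_activate(tp, TT_REXMT, 0) case from the earlier listing) is
accepted, whereas the first version of the assertion would have rejected it.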