FreeBSD 7.3, reboot after panic: double fault
pluknet
pluknet at gmail.com
Tue Apr 20 14:17:37 UTC 2010
On 20 April 2010 15:48, John Baldwin <jhb at freebsd.org> wrote:
> On Tuesday 20 April 2010 2:53:16 am c0re wrote:
>> Hello All!
>> I've upgraded FreeBSD from 7.0 to 7.3 and all was good until I tried to
>> configure a gre interface and use ipfw fwd.
>> I actually do not know what the point of failure in my
>> configuration was.
>>
>> [ some details snipped ]
>>
>> It worked about one week and then I made some configuration changes:
>> added gre interface and 2 aliases:
>>
>> # cat /etc/rc.conf |grep
>> ifconfig_xl0="inet 192.168.0.10 netmask 255.255.255.0"
>> ifconfig_xl0_alias0="192.168.0.11 netmask 255.255.255.255"
>> ifconfig_xl0_alias1="192.168.0.12 netmask 255.255.255.255"
>> cloned_interfaces="gre0"
>> ifconfig_gre0="inet 192.168.250.6 192.168.250.5 tunnel 192.168.0.12
>> 192.168.200.15 netmask 255.255.255.252 link1 up"
>>
>> and
>>
>> # cat /etc/rc.local
>> #!/bin/sh
>> ipfw add fwd 192.168.250.5 icmp from 192.168.0.11 to any out via xl0
>> ipfw add fwd 192.168.250.5 tcp from 192.168.0.11 443 to any out via xl0
>> ipfw add allow ip from any to any
>>
>> # ifconfig gre0
>> gre0: flags=b050<POINTOPOINT,RUNNING,LINK0,LINK1,MULTICAST> metric 0 mtu
>> 1476
>> tunnel inet 192.168.0.12 --> 192.168.200.15
>> inet 192.168.250.6 --> 192.168.250.5 netmask 0xfffffffc
>>
>> I shut down the gre interface to prevent requests via gre to the buggy IP.
>>
>> The main idea of this configuration was: fwd all connections to https to
>> 192.168.0.1 via the gre interface.
>> I also configured apache to listen on 192.168.0.11 too.
>>
>> Then I made some tests: ping 192.168.0.11 was fine, goes via gre. Telnet to
>> 192.168.0.11 443 was fine too. Then I tried to make a browser https
>> connection to 192.168.0.11. Apache showed me a certificate warning and I
>> accepted, then in the browser nothing happened, it was trying to open the
>> page. But the server got a kernel panic at that moment.
>>
>> The first time I thought that it was some power failure; I tried 2 more
>> times and got the same behaviour.
>>
>> So https works without a kernel panic via the 192.168.0.10 address, but the
>> kernel panics when I try to do https via the 192.168.0.11 address that is
>> source-forwarded via gre.
>
> Looks like the TCP output path got stuck in an infinite recursion loop until
> it exhausted the kernel stack:
>
>> # cd /usr/obj/usr/src/sys/MYKERNEL
>> # kgdb kernel.debug /var/crash/vmcore.2
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>> This GDB was configured as "i386-marcel-freebsd"...
>>
>> Unread portion of the kernel message buffer:
>>
>> Fatal double fault:
>> eip = 0xc08e3ba3
>> esp = 0xccf6dfc4
>> ebp = 0xccf6e274
>> cpuid = 0; apic id = 00
>> panic: double fault
>> cpuid = 0
>> Uptime: 7m14s
>> Physical memory: 235 MB
>> Dumping 35 MB: 20 4
>>
>> Reading symbols from /boot/kernel/acpi.ko...Reading symbols from
>> /boot/kernel/acpi.ko.symbols...done.
>> done.
>> Loaded symbols for /boot/kernel/acpi.ko
>> Reading symbols from /boot/kernel/if_gre.ko...Reading symbols from
>> /boot/kernel/if_gre.ko.symbols...done.
>> done.
>> Loaded symbols for /boot/kernel/if_gre.ko
>> Reading symbols from /boot/kernel/linux.ko...Reading symbols from
>> /boot/kernel/linux.ko.symbols...done.
>> done.
>> Loaded symbols for /boot/kernel/linux.ko
>> #0 doadump () at pcpu.h:196
>> 196 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
>> (kgdb) bt
>> #0 doadump () at pcpu.h:196
>> #1 0xc07f2857 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
>> #2 0xc07f2b29 in panic (fmt=Variable "fmt" is not available.
>> ) at /usr/src/sys/kern/kern_shutdown.c:574
>> #3 0xc0a7ea2b in dblfault_handler () at /usr/src/sys/i386/i386/trap.c:983
>> #4 0xc08e3ba3 in ipfw_chk (args=0xccf6e28c) at
>> /usr/src/sys/netinet/ip_fw2.c:2465
>> #5 0xc08e6ce1 in ipfw_check_out (arg=0x0, m0=0xccf6e390, ifp=0xc25c5c00,
>> dir=2, inp=0xc28ba708) at /usr/src/sys/netinet/ip_fw_pfil.c:248
>> #6 0xc08a1968 in pfil_run_hooks (ph=0xc0c55240, mp=0xccf6e420,
>> ifp=0xc25c5c00, dir=2, inp=0xc28ba708) at /usr/src/sys/net/pfil.c:78
>> #7 0xc08eb6f2 in ip_output (m=0xc2710b00, opt=0x0, ro=0xccf6e3f4, flags=0,
>> imo=0x0, inp=0xc28ba708) at /usr/src/sys/netinet/ip_output.c:443
>> #8 0xc08f4016 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1134
>> #9 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #10 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #11 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #12 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #13 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #14 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #15 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #16 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #17 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #18 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #19 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #20 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #21 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #22 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #23 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #24 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #25 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #26 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #27 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #28 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #29 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #30 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #31 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #32 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #33 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #34 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #35 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #36 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #37 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #38 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #39 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #40 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #41 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #42 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #43 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #44 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #45 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #46 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #47 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #48 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #49 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> ---Type <return> to continue, or q <return> to quit---
>> #50 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #51 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #52 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #53 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
>> #54 0xc08f4105 in tcp_output (tp=0xc25b2570) at
>> /usr/src/sys/netinet/tcp_output.c:1195
>> #55 0xc08fdcf8 in tcp_usr_send (so=0xc2ac1820, flags=0, m=0xc270ed00,
>> nam=0x0, control=0x0, td=0xc28e2d80) at tcp_offload.h:269
>> #56 0xc0850405 in sosend_generic (so=0xc2ac1820, addr=0x0, uio=0xc28766c0,
>> top=0xc270ed00, control=0x0, flags=0, td=0xc28e2d80) at
>> /usr/src/sys/kern/uipc_socket.c:1243
>> #57 0xc084bf7f in sosend (so=0xc2ac1820, addr=0x0, uio=0xc28766c0, top=0x0,
>> control=0x0, flags=0, td=0xc28e2d80) at /usr/src/sys/kern/uipc_socket.c:1285
>> #58 0xc0833c5b in soo_write (fp=0xc28e84c0, uio=0xc28766c0,
>> active_cred=0xc28e5900, flags=0, td=0xc28e2d80) at
>> /usr/src/sys/kern/sys_socket.c:103
>> #59 0xc082d2e7 in dofilewrite (td=0xc28e2d80, fd=24, fp=0xc28e84c0,
>> auio=0xc28766c0, offset=-1, flags=0) at file.h:257
>> #60 0xc082d5c8 in kern_writev (td=0xc28e2d80, fd=24, auio=0xc28766c0) at
>> /usr/src/sys/kern/sys_generic.c:402
>> #61 0xc082d816 in writev (td=0xc28e2d80, uap=0xccf6fcfc) at
>> /usr/src/sys/kern/sys_generic.c:388
>> #62 0xc0a7f2d5 in syscall (frame=0xccf6fd38) at
>> /usr/src/sys/i386/i386/trap.c:1101
>> #63 0xc0a636a0 in Xint0x80_syscall () at
>> /usr/src/sys/i386/i386/exception.s:262
>> #64 0x00000033 in ?? ()
>> Previous frame inner to this frame (corrupt stack?)
>> (kgdb)
>> (kgdb) quit
>
> tcp_output() calls tcp_mtudisc() if ip_output() returns EMSGSIZE:
>
> 	case EMSGSIZE:
> 		/*
> 		 * For some reason the interface we used initially
> 		 * to send segments changed to another or lowered
> 		 * its MTU.
> 		 *
> 		 * tcp_mtudisc() will find out the new MTU and as
> 		 * its last action, initiate retransmission, so it
> 		 * is important to not do so here.
> 		 *
> 		 * If TSO was active we either got an interface
> 		 * without TSO capabilits or TSO was turned off.
> 		 * Disable it for this connection as too and
> 		 * immediatly retry with MSS sized segments generated
> 		 * by this function.
> 		 */
> 		if (tso)
> 			tp->t_flags &= ~TF_TSO;
> 		tcp_mtudisc(tp->t_inpcb, 0);
> 		return (0);
>
> But tcp_mtudisc() calls tcp_output() (via tcp_output_send()):
>
> 	tcpstat.tcps_mturesent++;
> 	tp->t_rtttime = 0;
> 	tp->snd_nxt = tp->snd_una;
> 	tcp_free_sackholes(tp);
> 	tp->snd_recover = tp->snd_max;
> 	if (tp->t_flags & TF_SACK_PERMIT)
> 		EXIT_FASTRECOVERY(tp);
> 	tcp_output_send(tp);
> 	return (inp);
>
> I'm not sure why it's not able to figure out the MTU, perhaps folks on net@
> can help. However, it would seem that for the tcp_output() case,
> tcp_mtudisc() should probably not call tcp_output_send(), but instead
> tcp_output() should just loop back up to the top after calling tcp_mtudisc()
> and retry.
>
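To make the "loop back and retry" suggestion above concrete, here is a
minimal user-space sketch. It is only a model, not kernel code and not a
proposed patch: struct conn, model_ip_output(), model_mtudisc(),
model_tcp_output() and MAX_RETRIES are all hypothetical names invented for
illustration. The point it shows is that if the MTU-discovery step only
adjusts state and the output routine retries in a bounded loop, a path whose
MTU cannot be determined ends with an error instead of unbounded recursion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for connection state; not the kernel's struct tcpcb. */
struct conn {
	int mss;	/* segment size we currently try to send */
	int path_mtu;	/* largest size the path accepts */
	int mtu_known;	/* nonzero if discovery can learn path_mtu */
	int sends;	/* number of send attempts made so far */
};

/* Upper bound on retries; an arbitrary value chosen for this sketch. */
#define MAX_RETRIES 4

/* Model of the IP layer: reject anything larger than the path MTU. */
static int
model_ip_output(struct conn *c)
{
	c->sends++;
	return (c->mss > c->path_mtu ? EMSGSIZE : 0);
}

/*
 * Model of MTU discovery that only updates state.  Unlike the code quoted
 * above, it does not start a retransmission itself.
 */
static void
model_mtudisc(struct conn *c)
{
	if (c->mtu_known)
		c->mss = c->path_mtu;
}

/* Model of the output routine: retry in a bounded loop, never recurse. */
static int
model_tcp_output(struct conn *c)
{
	int error, tries;

	assert(c != NULL);
	for (tries = 0; tries < MAX_RETRIES; tries++) {
		error = model_ip_output(c);
		if (error != EMSGSIZE)
			return (error);
		model_mtudisc(c);
	}
	/* Discovery never produced a usable size; give up with an error. */
	return (EMSGSIZE);
}
```

With mss 1500, path_mtu 1476 and mtu_known set, model_tcp_output() succeeds
on the second send. With mtu_known clear (roughly the situation in the
backtrace, where the MTU apparently is never figured out), it stops after
MAX_RETRIES sends and returns EMSGSIZE, using constant stack depth.
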
I may be wrong, but this looks similar to another report for 8.0-STABLE
(could it be a regression spanning major versions, somewhere around
tcp_mtudisc()?):
http://lists.freebsd.org/pipermail/freebsd-stable/2010-April/056063.html
--
wbr,
pluknet