6.2: reproducible hang on amd64, traced to 24h of commits
Deomid Ryabkov
myself at rojer.pp.ru
Fri Mar 30 14:03:29 UTC 2007
ok, now that the machine has been up for 10 days, i am reasonably sure
i've gotten close enough to this one.
back in january i cvsupped to -STABLE and the box (dual head opteron
box) started hanging.
and i mean it dies completely.
i have all debug options and a working serial console, but still it just
dies and both serial and system console are unresponsive.
no panic message on either, nothing. pretty sad.
the kernel config is vanilla SMP GENERIC, with all debug options i could
think of enabled (after it started hanging).
so the first thing i did after rebooting the box a couple of times was
to fall back to kernel.old (6.1-STABLE circa august '06).
no hangs. i then started incrementally updating, gradually getting
closer to jan 22.
long story short, i seem to have isolated the problem to commits made
between
date=2006.12.28.00.00.00 and date=2006.12.29.00.00.00.
last hang i had was when running the 12/29 kernel, now it's 12/28 and
the box has been up for 2 weeks already.
based on previous experience i'm pretty certain that this is it. with
a bad kernel the box would never stay up more than a few days, never more
than 5.
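for anyone who wants to repeat this kind of search, here is a rough sketch of how the next date to test could be picked. it is a plain bisection over days, which is not necessarily the exact sequence of dates i went through, and the function names are made up for illustration. the only thing taken from above is the format of the date= tag.

```python
from datetime import date, timedelta


def supfile_date(d):
    """Format a day as a date= tag in the same form as the ones quoted above."""
    return d.strftime("date=%Y.%m.%d.00.00.00")


def next_date_to_try(last_good, first_bad):
    """Pick the midpoint day between a known-good and a known-bad snapshot.

    Returns None once the two dates are adjacent, i.e. the problem has been
    narrowed down to a single 24h window of commits.
    """
    gap = (first_bad - last_good).days
    if gap <= 1:
        return None
    return last_good + timedelta(days=gap // 2)


if __name__ == "__main__":
    # hypothetical starting range; the final window is the one reported above
    candidate = next_date_to_try(date(2006, 12, 20), date(2006, 12, 30))
    if candidate is not None:
        print(supfile_date(candidate))
```

since a hang can take days to show up, each "good" verdict needs an uptime comfortably longer than the longest the box ever survived on a bad kernel before moving the lower bound.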
between 12/28 and 12/29 i see some changes to /sys/amd64/ and /sys/pci/,
which might've been the cause.
i will probably start looking into individual changes, but if anyone
more experienced than me could take a look, it'd be appreciated.
i am willing to try patches.
i confirmed that recent (as of 3 weeks ago or so) -STABLE still has this
problem.
thanks in advance.
====
files under /sys that were changed between 12/28 and 12/29:
Edit src/sys/amd64/amd64/mptable_pci.c
Edit src/sys/amd64/pci/pci_bus.c
Edit src/sys/contrib/dev/ath/public/wackelf.c
Edit src/sys/dev/acpica/acpi_pci.c
Edit src/sys/dev/acpica/acpi_pcib_acpi.c
Edit src/sys/dev/acpica/acpi_pcib_pci.c
Checkout src/sys/dev/ath/if_ath.c
Edit src/sys/dev/cardbus/cardbus.c
Edit src/sys/dev/drm/drm_agpsupport.c
Edit src/sys/dev/pci/pci.c
Edit src/sys/dev/pci/pci_if.m
Edit src/sys/dev/pci/pci_pci.c
Edit src/sys/dev/pci/pci_private.h
Edit src/sys/dev/pci/pcib_private.h
Edit src/sys/dev/pci/pcivar.h
Edit src/sys/i386/i386/mptable_pci.c
Edit src/sys/i386/pci/pci_bus.c
Edit src/sys/kern/subr_bus.c
Checkout src/sys/netgraph/ng_deflate.h
Edit src/sys/pci/agp.c
Edit src/sys/pci/agpreg.h
Edit src/sys/powerpc/ofw/ofw_pcib_pci.c
Edit src/sys/sparc64/pci/apb.c
Edit src/sys/sparc64/pci/ofw_pcib.c
Edit src/sys/sparc64/pci/ofw_pcibus.c
Edit src/sys/sys/param.h
====
kernel configuration used:
include GENERIC
options SMP
options KDB
options DDB
makeoptions DEBUG=-g
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options DEBUG_LOCKS
options DEBUG_VFS_LOCKS
options DIAGNOSTIC
====
--
Deomid Ryabkov aka Rojer
myself at rojer.pp.ru
rojer at sysadmins.ru
ICQ: 8025844