FreeBSD-6 amr and ahd trouble
Scott Long
scottl at samsco.org
Wed Nov 16 08:10:54 PST 2005
Joerg Pulz wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> Hi guys,
>
> I'm running a Fujitsu-Siemens Primergy RX300 dual-Xeon server with
> hyperthreading enabled, an onboard LSI MegaRAID controller and an Adaptec
> 39320A Ultra320 dual channel SCSI adapter. The LSI MegaRAID controller
> is configured as RAID1 with two disks and one hot spare. FreeBSD is
> installed on this array.
> Up to now, the system has been running fine, first with FreeBSD-5.3 and
> now with FreeBSD-5.4.
> I tried to upgrade this beast to FreeBSD-6.0-RELEASE without success.
> The kernel boots and detects all devices correctly, but when it comes
> to reading from the amr(4) device the last thing I see is "GEOM: new
> disk amrd0". After that the system "hangs" and it's nearly impossible to
> scroll the kernel messages up or down (Scroll Lock pressed). Then, after
> a while, there are a lot of SCSI error messages about SCB timeouts coming
> from ahd(4).
> I decided to boot the old RELENG_5_4 kernel and cvsup'ed the sources to
> RELENG_6, but I got the same results. Booting from a FreeBSD-6.0-RELEASE
> bootonly CD-ROM again gave the same results.
> I searched Google about this and found something about a tunable
> sysctl/loader setting called hw.pci.do_powerstate. I tried it, with the
> same result. Later I saw that in RELENG_6 this tunable has been renamed
> and is set to 0 anyway.
> The next step was removing the Adaptec card to make sure it was not
> interrupting the amr(4), but the only effect was that the SCSI error
> messages went away, so this was not the problem.
> I decided to give CURRENT from today a try, and it worked without any
> problems. I tested CURRENT at several earlier points, back as far as
> 700003, dated "Sun Sep 18 05:12:39 2005 UTC", which is exactly the time
> the RELENG_6 branch was marked for 6.0-BETA5, and CURRENT worked at
> every point I checked out from CVS. Unfortunately 6.0-BETA5 is NOT
> working.
> I checked out the sources for 6.0-BETA4 and it works again. So
> somewhere between 6.0-BETA4 and 6.0-BETA5 the whole thing broke, at
> least for me and my hardware.
> I've seen some differences in sys/cam/cam_xpt.c. Maybe these cause the
> trouble I have, but I don't know the FreeBSD kernel code well enough to
> be sure.
>
> It would be nice if someone could take a look at this so it gets fixed
> in RELENG_6.
> Any patches to test are welcome.
>
> regards
> Joerg
>
This is almost certainly an interrupt routing bug. Can you try booting
with ACPI disabled? Can you try building a 6.0 kernel without SMP and
the 'apic' devices? From 5.4, can you send your system information?
Scott
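
The first two suggestions above could be tried roughly as follows. This is only a sketch and not part of the original message: `hint.acpi.0.disabled` is the loader hint described in acpi(4), and the kernel configuration name MYKERNEL is hypothetical.

```
# /boot/loader.conf
# (for a one-off test, type  set hint.acpi.0.disabled="1"  at the
# loader prompt instead and then boot)
hint.acpi.0.disabled="1"        # boot with ACPI disabled

# Custom kernel configuration, for example a copy of GENERIC saved under
# the hypothetical name MYKERNEL. Make sure the following two lines are
# absent or commented out, then rebuild and install the kernel.
#options        SMP             # Symmetric MultiProcessor kernel
#device         apic            # I/O APIC
```

Which system information is wanted is not stated. For interrupt routing problems, the `dmesg` output from a verbose boot (`boot -v`) of the working 5.4 kernel, together with `pciconf -lv` and `vmstat -i`, is a reasonable guess at what would help.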