Bug Report: IBM x3650M4 (32GB, 2x4-core Xeon E5-2600,
IBM ServeRaid M5110e): fails in install with NMI
John Baldwin
jhb at freebsd.org
Wed Aug 29 12:25:49 UTC 2012
On Tuesday, August 28, 2012 5:06:18 pm Mike A wrote:
> On Tue, Aug 28, 2012 at 12:38:47PM -0400, John Baldwin wrote:
> >
> > When the loader menu pops up, choose the "escape to loader prompt" option,
> > then type 'set hint.mpt.0.msi_enable=0' followed by 'boot'. There's no
> > guarantee this will help, btw, just something to try out first.
> >
> > If that doesn't work, you can also try setting 'machdep.kdb_on_nmi=0' using
> > the same trick.
> >
> > If that still doesn't help, please boot another OS that does and get the
> > output of 'lspci -v' or 'pciconf -lvb' or equivalent so we can see exactly
> > which mpt adapter it is. I think there is one class of mpt(4) cards that
> > we do not yet support properly. Ah, yes, this PR:
> >
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=149220
> >
> > I think this may in fact be your adapter. This was fixed after 9.0, so try
> > a 9.1-RC1 install disk instead and see if it works better.
>
> No joy. In sober fact, neither 9.1 nor 9.0 will even boot reliably to the
> point where the usual dmesg contents are displayed. About 90% of the time,
> 9.0 will hit the DVD reader for a while, then go quiescent, followed by
> the yellow LED signaling an NMI or other serious problem and the bright
> blue flashing LED signaling a halted machine. I have yet to get any display
> out of 9.1 at all. I have changed all the changeables I can: booted from a
> complete power-down, booted from a halted system, etc. I can't see anything
> that always leads to a display or to a failure to display.
>
> It is interesting that a Red Hat Enterprise Linux 5.1 (ancient!) DVD booted
> up first crack off the bat. It couldn't find any discs to install to,
> however, though it did inventory the SATA drives in its dmesg output.
>
> I'm about to try a Knoppix DVD, and will post what PCI data I can get
> from that.
>
> I've entered the first loader hint and got no change in symptoms; since
> then, I have not been able to get another display in about 10 tries, and
> hence been unable to enter the first and second loader hints. At about 7
> minutes per try, this is enormously frustrating.
>
> If there is a way to instrument the CD/DVD boot process itself, so that I
> can see what leads up to the failure to display, I am greatly interested
> in doing this. My employer has about $40K invested in these boxes, and
> is interested in getting some good out of them; I'm at least equally
> interested in not annoying my boss. You can have pretty much 100% of my
> work time until I get them on the air or give up and run some flavor of
> Linux; I'd really rather not run Linux.
>
> At this point I don't know whether the problems stem from the RAID adapter
> hosing the CD/DVD boot process, or from some other impediment. It may be
> that this belongs in the amd64 group, instead of the scsi group. I don't
> see a way to tell until I (or you) can determine the cause of the CD/DVD
> boot problems.
>
> Thanks so much for your help so far.

Humm, that is bizarre. All the early bootstrap code just relies on the BIOS
to perform disk I/O, etc. Can you PXE boot these machines? That might be a
way to get the CD out of the picture. I haven't seen any machines with your
symptoms. Normally, if a machine does have a problem with the boot process
due to a bug or some such, it has the problem consistently every time,
rather than suddenly failing after working.

Also, to be honest, the original NMI in itself is a bit odd. Since you are
having these problems now, I do wonder if there isn't an underlying hardware
issue. Regardless, I think netbooting would be a good thing to look into, to
get the CD/DVD bit out of the way.
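
For reference, the loader-prompt session described in the quoted text above
would look roughly like the sketch below. It assumes both settings are entered
in the same session before booting, and that mpt unit 0 is the adapter in
question; neither assumption has been verified on this hardware.

```
set hint.mpt.0.msi_enable=0
set machdep.kdb_on_nmi=0
boot
```

Values set this way apply only to that one boot, so they have to be typed
again on every attempt until the system is installed.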
--
John Baldwin