Any progress? Any hope? Anyone listening?

Erik Elmgren erik.elmgren at swipnet.se
Sat Sep 5 16:49:29 PDT 1998


Quoting Doug Ledford (dledford at dialnet.net):
> Robert G. Brown wrote:
> > 
> > Dear Doug et al.,
> > 
> > I've verified, unfortunately, that a Dell Poweredge 2300 with onboard
> > 7890 exhibits the following behavior:
> > 
> > a) If directly booted with aic7xxx 5.1.0pre7 and 2.0.35 (SMP), it gets
> > through the timeout reset and enters the Parity Error loop during the
> > device probe.  Sometimes it will return the attached device before
> > spinning out forever.
> > 
> > b) If FIRST booted from pre-installed WinNT (through its aic7xxx
> > initialization) and THEN warm booted into linux with the same kernel
> > used in a), it does the timeout reset, finds both attached devices (or
> > all three) and proceeds to function normally (permitting full
> > installation of linux on the disk and numerous -- warm -- reboots) until:
> > 
> > c) The first time I power it down.  When powered up, it recapitulates
> > pattern a) but this time I HAVE no WinNT on the hard disk to reset the
> > controller and cannot recover the system.  Or rather, one boot in ten or
> > thirty it appears to recover briefly but I cannot tell why.
> > 
> > The pattern appears to be stable although my statistics on c) are still
> > weak because I don't really want to trash my operating cluster to get
> > better ones.  Next major power out around here and I'm toast, though.
> > 
> > STRONGLY appears like there is some bug in the initialization sequence
> > that WinNT gets right and the linux aic7xxx gets wrong.
> > 
> > Is there hope for me here?  Is there anything I can do to help solve the
> > problem?  I have precisely one system left with WinNT preinstalled on
> > the hard drive to play with, so if any fixes come down I can test them.
> > I also have three boxes in the "dead" state -- I can probably run them
> > totally diskless (no aic7xxx driver at all) but obviously this wastes
> > some very expensive and actually fairly nice hardware...
> 
> There is hope.  For now, go to line 319 in the pre8 driver and comment out

Should be line 391 :-)

> the #  define MMAPIO inside of the #ifdef (__i386__) block.  This turns off
> MMAPed I/O and should get you booted up and running on these 2300 machines. 
> If that doesn't do the trick, then just hold on, I get my 2300 this
> Saturday.

I have the very same problem: cold-booting Linux gives me strange parity errors,
while booting WinNT first and then warm-booting Linux works fine. But I don't have
a Dell PowerEdge 2300; this is a self-built box:

# lspci

00:00.0 Host bridge: VIA Technologies, Inc. VT82C597 [Apollo VP3] (rev 01)
00:01.0 PCI bridge: VIA Technologies, Inc. VT82C597 [Apollo VP3 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C586 ISA [Apollo VP] (rev 41)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
00:09.0 Multimedia video controller: Brooktree Corporation Bt848 (rev 11)
00:0a.0 SCSI storage controller: Unknown device 9005:0010
01:05.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2164W AGP [Millennium II AGP]

# cat /proc/scsi/scsi

Attached devices: 
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: MATSHITA Model: CD-ROM CR-506    Rev: 8S04
  Type:   CD-ROM                           ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 04 Lun: 00
  Vendor: IBM      Model: DCAS-34330W      Rev: S65A
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: IBM      Model: DDRS-39130W      Rev: S92A
  Type:   Direct-Access                    ANSI SCSI revision: 02

It's a FIC PA2012 (1M L2) mainboard with a VIA VP3 chipset and an AMD K6 233 CPU.

# cat /proc/scsi/aic7xxx/0

Adaptec AIC7xxx driver version: 5.1.0pre8/3.2.4
Compile Options:
  AIC7XXX_RESET_DELAY    : 5
  AIC7XXX_TAGGED_QUEUEING: Adapter Support Enabled
                             Check below to see which
                             devices use tagged queueing
  AIC7XXX_PAGE_ENABLE    : Enabled (This is no longer an option)
  AIC7XXX_PROC_STATS     : Disabled

Adapter Configuration:
           SCSI Adapter: Adaptec AHA-294X Ultra2 SCSI host adapter
                           Ultra2-SE Wide Controller
    Programmed I/O Base: f600
      Adaptec SCSI BIOS: Enabled
                    IRQ: 11
                   SCBs: Active 0, Max Active 2,
                         Allocated 15, HW 32, Page 255
             Interrupts: 26643
      BIOS Control Word: 0x18a6
   Adapter Control Word: 0x145d
   Extended Translation: Enabled
Disconnect Enable Flags: 0xffff
     Ultra Enable Flags: 0x0000
 Tag Queue Enable Flags: 0x0000
Ordered Queue Tag Flags: 0x0000
Default Tag Queue Depth: 8
    Tagged Queue By Device array for aic7xxx host instance 0:
      {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255}
    Actual queue depth per device for aic7xxx host instance 0:
      {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}

Statistics:
(scsi0:0:4:0)
  Device using Wide/Sync transfers at
  40.0 MByte/sec, offset 15
    Total transfers 3246 (2489 read;757 written)
      blks(512) rd=19310; blks(512) wr=50392


(scsi0:0:6:0)
  Device using Wide/Sync transfers at
  40.0 MByte/sec, offset 15
    Total transfers 23316 (11204 read;12112 written)
      blks(512) rd=144083; blks(512) wr=60348

# head -391 /usr/src/linux/drivers/scsi/aic7xxx.c | tail -1
/*#  define MMAPIO*/
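
Since the line number evidently moves between driver releases (319 vs. 391
above), here is a minimal sketch that looks for the define by pattern instead
of by position. The helper name find_mmapio and the regular expression are my
own, purely for illustration; it only recognises a define that is either
uncommented or starts with /* on the same line, as in the output above.

```python
import re

# Matches "#define MMAPIO" with optional whitespace after '#', optionally
# preceded by an opening C comment, e.g. "/*#  define MMAPIO*/".
# Hypothetical pattern: it does not handle comments opened on an earlier line.
MMAPIO_RE = re.compile(r"^\s*(/\*)?\s*#\s*define\s+MMAPIO\b")


def find_mmapio(lines):
    """Return (line_number, is_commented_out) for every MMAPIO define found."""
    hits = []
    for number, text in enumerate(lines, start=1):
        match = MMAPIO_RE.match(text)
        if match:
            # group(1) is the leading "/*" when the define is commented out
            hits.append((number, match.group(1) is not None))
    return hits
```

To run it against a real tree, pass it the lines of
/usr/src/linux/drivers/scsi/aic7xxx.c, for example the result of
open(path).read().splitlines().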

The questions:

What can I do to help fix this? I'm one of those poor humans forced to have
WinNT installed, so I am, currently and certainly only for a short time, a little
better off than Robert :-) Any suggestions on where to poke in the driver?

Why is the queue depth only 1 for my disks? I expect both of them to support
more. Or do I misunderstand the "Actual queue depth per device..." and
"Tag Queue Enable Flags:" entries?
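
For reference, this is how I am reading those flag words. It is only a sketch,
and it rests on my assumption (not confirmed anywhere in this thread) that each
word is a per-target bitmask with bit n standing for SCSI ID n on the 16-ID
wide bus. The function name targets_with_flag is made up for illustration.

```python
def targets_with_flag(mask, num_targets=16):
    """List the SCSI IDs whose bit is set in a flag word.

    Assumption: bit n of the word corresponds to SCSI ID n.
    """
    return [target for target in range(num_targets) if mask & (1 << target)]


# Values copied from the /proc/scsi/aic7xxx/0 output above.
disconnect_ids = targets_with_flag(0xFFFF)  # Disconnect Enable Flags
tagged_ids = targets_with_flag(0x0000)      # Tag Queue Enable Flags
```

Read that way, disconnect is enabled for all 16 IDs while no ID, including my
disks at IDs 4 and 6, has tagged queueing enabled, which would at least be
consistent with a queue depth of 1 everywhere.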

And least on topic: if I don't connect any disks to the aic adapter (I then also
plug in an 83c875 adapter and connect the disks to it), it takes an eternity for
the adapter firmware to "boot" (scan the SCSI bus during BIOS init).

Regards, Erik

--

CRASH(8)
        no fs
                A device has disappeared from the mounted-device table.
                Definitely hardware or software error.

                                Unix Version 7 Manual
