Swapping deadlock due to aic/scsi errors?

Dave Dolson ddolson at sandvine.com
Wed Aug 6 14:59:51 PDT 2003

> > We have a reproducible bug characterized by the system
> > becoming unresponsive (but db may be entered).
> > System is based on FreeBSD 4.7 (i386)
> > Using the aic79xx scsi driver.
> 
> If you are using the stock aic79xx driver found in 4.7, I would
> start by pulling in the latest 4.X aic79xx driver into your system.

Yes, we are using the latest RELENG_4 driver.

> > I would like to add some debugging to detect the lost command 
> > and possibly retry it.  Can someone suggest where the lost
> > command is supposed to be detected, and where the retry is 
> > supposed to occur.
> 
> The "lost command" is supposed to be detected by the timeout
> handler in the ahd driver.  The timeout handler just forces
> a bus reset which should cause the command to be returned to
> the SCSI layer and then retried.  It's not clear to me why
> this might not be happening, but the ahd driver was relatively
> green in 4.7 and you may just be tripping over a known (and
> later corrected) bug manifesting itself in an unusual way.

Are you referring to the timeout handler ahd_timeout()?
Are the commands retried from ahd_reset_channel()?
(It looks more like they're simply aborted.)
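
To make the "detect the lost command" idea concrete, here is a minimal
sketch of the kind of debugging bookkeeping I have in mind. It is
deliberately independent of the ahd driver: every name in it
(track_start, track_done, track_scan, MAX_TRACKED) is hypothetical and
nothing is taken from the driver or CAM sources. The assumption is only
that there is a hook where a command is issued, a hook where it
completes, and something periodic that can scan for commands whose
deadline has passed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED 8            /* hypothetical table size */

enum cmd_state { CMD_FREE, CMD_ACTIVE, CMD_OVERDUE };

struct tracked_cmd {
    enum cmd_state state;
    unsigned long  deadline;     /* tick at which the command counts as lost */
    int            tag;          /* caller-chosen identifier */
};

static struct tracked_cmd table[MAX_TRACKED];

/*
 * Record a command when it is handed to the controller.
 * Returns 0 on success, -1 if the table is full.
 */
static int
track_start(int tag, unsigned long now, unsigned long timeout)
{
    size_t i;

    assert(timeout > 0);
    for (i = 0; i < MAX_TRACKED; i++) {
        if (table[i].state == CMD_FREE) {
            table[i].state = CMD_ACTIVE;
            table[i].deadline = now + timeout;
            table[i].tag = tag;
            return 0;
        }
    }
    return -1;
}

/*
 * Forget a command when its completion arrives.
 * Returns 0 if the tag was being tracked, -1 otherwise.
 */
static int
track_done(int tag)
{
    size_t i;

    for (i = 0; i < MAX_TRACKED; i++) {
        if (table[i].state != CMD_FREE && table[i].tag == tag) {
            table[i].state = CMD_FREE;
            return 0;
        }
    }
    return -1;
}

/*
 * Called periodically: mark every active command whose deadline has
 * passed, and return how many commands are currently overdue.
 */
static int
track_scan(unsigned long now)
{
    size_t i;
    int overdue = 0;

    for (i = 0; i < MAX_TRACKED; i++) {
        if (table[i].state == CMD_ACTIVE && now >= table[i].deadline)
            table[i].state = CMD_OVERDUE;
        if (table[i].state == CMD_OVERDUE)
            overdue++;
    }
    return overdue;
}
```

A non-zero return from track_scan() is where a debug message would go.
Whether a retry could safely be issued at that point is exactly the
part I am unsure about.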

Aside: Am I correct in believing that ahd_execute_scb() is called 
for every command to the drive?

David Dolson (ddolson at sandvine.com, www.sandvine.com)