No retries after periph invalidation?

Alexander Motin mav at FreeBSD.org
Sat Jul 23 22:37:22 UTC 2011


Hi.

I've simulated one real-world device failure condition, in which a SATA
disk still reports its presence but doesn't respond to any command. I've
found that, due to multiple command retries, each of which causes a 30s
timeout, a bus reset and another retry/requeue, it may take ages to
eventually drop the failed device. The odd thing is that those retries
continue even after XPT has considered the device lost and invalidated it.

I've made a patch (http://people.freebsd.org/~mav/periph_noretry.patch)
for cam_periph_error() to block any retries after the periph has been
marked as invalid. With that patch all activity completes in 1-2 minutes,
just after the several timeouts required to consider the device lost.
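
To illustrate the idea only (this is not the patch itself, and not real
CAM code): below is a small userland model of "no retries once the periph
is invalid". All names in it (model_periph, MODEL_PERIPH_INVALID,
model_should_retry, model_count_timeouts) are made up for the example, and
the retry budget and the point at which the device is declared lost are
assumed parameters, not values taken from the kernel.

```c
#include <stdbool.h>

/* Hypothetical model; these names are illustrative, not the CAM API. */
#define MODEL_PERIPH_INVALID 0x01u /* stands in for the "invalid" mark */

struct model_periph {
	unsigned flags;        /* MODEL_PERIPH_INVALID once the device is lost */
	int      retries_left; /* remaining retry budget for the command */
};

/*
 * Decide whether a failed command should be requeued.
 * The first check is the point of the sketch: an invalidated
 * periph never gets another retry, whatever budget is left.
 */
static bool
model_should_retry(struct model_periph *p)
{
	if (p->flags & MODEL_PERIPH_INVALID)
		return false;
	if (p->retries_left <= 0)
		return false;
	p->retries_left--;
	return true;
}

/*
 * Count the timeouts seen for a device that never answers.
 * The device is assumed to be invalidated once `lost_after`
 * timeouts have happened.
 */
static int
model_count_timeouts(int retries, int lost_after)
{
	struct model_periph p = { 0, retries };
	int timeouts = 0;

	do {
		timeouts++; /* the command timed out */
		if (timeouts >= lost_after)
			p.flags |= MODEL_PERIPH_INVALID;
	} while (model_should_retry(&p));

	return timeouts;
}
```

In this model, a command with a budget of 4 retries sees 5 timeouts if the
device is never invalidated, but only 2 if the device is declared lost
after the second timeout.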

Can this approach be considered correct?

-- 
Alexander Motin

