ata timeouts under load

Andriy Gapon avg at icyb.net.ua
Mon Sep 14 10:44:23 UTC 2009


on 14/09/2009 00:02 Kris Kennaway said the following:
> 
> It's always that sequence (with setfeatures timing out first, then the
> dma later)...and the block number varies widely, also whether it's
> read/write.  The disk itself & the data it contains appears to be OK as
> far as I have been able to determine so far.

I also sometimes see something similar when I put very high load with a specific
pattern on my two-disk mirrored zpool.
The pattern is zpool scrub plus additional load like untarring large archives.

Example:
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
completing request directly
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
completing request directly
kernel: ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing
request directly
kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing
request directly
kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
kernel: ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568158815
kernel: ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568159071
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
completing request directly
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
completing request directly
kernel: ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing
request directly
kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing
request directly
kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
kernel: ad10: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=568158815
kernel: ad10: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=568159071
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
completing request directly
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
completing request directly
kernel: ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing
request directly
kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing
request directly
kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
kernel: ad10: FAILURE - READ_DMA48 timed out LBA=568158815
kernel: ad10: FAILURE - READ_DMA48 timed out LBA=568159071
root: ZFS: vdev I/O failure, zpool=tank path=/dev/ad10s2d offset=284457041920
size=131072 error=5
root: ZFS: vdev I/O failure, zpool=tank path=/dev/ad10s2d offset=284456910848
size=131072 error=5


But I also see cases where dma timeout message appears first:
ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE>
LBA=568157535
ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing
request directly
ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing
request directly
ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request
directly
ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request
directly
ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568158559
ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568158815

No errors happen whatsoever if I run scrub without any additional load, or if I do
any 'typical' disk loads without parallel scrubbing.

-- 
Andriy Gapon


More information about the freebsd-current mailing list