[Bug 279978] After commit 25375b1415, any errors in device connected to ahci etc. results in Unretryable error
Date: Tue, 25 Jun 2024 04:04:06 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279978

Bug ID: 279978
Summary: After commit 25375b1415, any errors in device connected to ahci etc. results in Unretryable error
Product: Base System
Version: 14.1-RELEASE
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: bugs@FreeBSD.org
Reporter: aono@cc.osaka-kyoiku.ac.jp

I have a half-broken HDD (ada2, connected to ahci1) in a FreeBSD 14.1-RELEASE (p0) server in my office.

> kernel: CPU: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz (2672.84-MHz K8-class CPU)
> kernel: Origin="GenuineIntel" Id=0x106a4 Family=0x6 Model=0x1a Stepping=4
> kernel: ahci1: <Intel ICH10 AHCI SATA controller> port 0x7c00-0x7c07,0x7880-0x7883,0x7800-0x7807,0x7480-0x7483,0x7400-0x741f mem 0xf7ffc000-0xf7ffc7ff irq 20 at device 31.2 on pci0
> kernel: ahci1: AHCI v1.20 with 6 3Gbps ports, Port Multiplier supported
> kernel: ahcich4: <AHCI channel> at channel 2 on ahci1
> kernel: ahciem0: <AHCI enclosure management bridge> on ahci1
> kernel: ses0 at ahciem0 bus 0 scbus9 target 0 lun 0
> kernel: ses0: <AHCI SGPIO Enclosure 2.00 0001> SEMB S-E-S 2.00 device
> kernel: ses0: SEMB SES Device
> kernel: ses0: ada2,pass2 in 'Slot 02', SATA Slot: scbus5 target 0
> kernel: ada2 at ahcich4 bus 0 scbus5 target 0 lun 0
> kernel: ada2: <WDC WD60EFRX-68L0BN1 82.00A82> ACS-2 ATA SATA 3.x device
> kernel: ada2: Serial Number WD-WX41DA5LVRR4
> kernel: ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> kernel: ada2: Command Queueing enabled
> kernel: ada2: 5723166MB (11721045168 512 byte sectors)
> kernel: ada2: quirks=0x1<4K>

When I read or write a bad sector with dd (with 'sysctl kern.geom.debugflags=16' set), an "Unretryable error" occurs and ada2 cannot be accessed until I run 'camcontrol reset ada2'.

> kernel: (ada2:ahcich4:0:0:0): READ_FPDMA_QUEUED. ACB: 60 01 68 da 57 40 b3 00 00 00 00 00
> kernel: (ada2:ahcich4:0:0:0): CAM status: Auto-Sense Retrieval Failed
> kernel: (ada2:ahcich4:0:0:0): Error 5, Unretryable error

On FreeBSD 13.x, this error was retryable. (The following entries are from older logs, so the sector and ACB differ.)

> kernel: (ada2:ahcich4:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e8 1b df 40 1f 01 00 08 00 00
> kernel: (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> kernel: (ada2:ahcich4:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> kernel: (ada2:ahcich4:0:0:0): RES: 41 40 b0 1c df 00 1f 01 00 00 00
> kernel: (ada2:ahcich4:0:0:0): Retrying command, 3 more tries remain

Commit 25375b1415 made the following change (shown for /sys/dev/ahci/ahci.c only; siis and mvs are probably affected as well):

diff --git a/sys/dev/ahci/ahci.c b/sys/dev/ahci/ahci.c
index 12e6ee8102da..d62a043eb2ab 100644
--- a/sys/dev/ahci/ahci.c
+++ b/sys/dev/ahci/ahci.c
@@ -2178,7 +2178,8 @@ completeall:
 			ahci_reset(ch);
 			return;
 		}
-		ccb->ccb_h = ch->hold[i]->ccb_h;	/* Reuse old header. */
+		xpt_setup_ccb(&ccb->ccb_h, ch->hold[i]->ccb_h.path,
+		    ch->hold[i]->ccb_h.pinfo.priority);
 		if (ccb->ccb_h.func_code == XPT_ATA_IO) {
 			/* READ LOG */
 			ccb->ccb_h.recovery_type = RECOVERY_READ_LOG;

The commit message says 'only field I see used from all the header is target_id.' However, func_code is also needed, by the 'if' statement on the very next line. Since the header is no longer copied, func_code always has the same value (probably 0), so the condition never matches XPT_ATA_IO and we always issue REQUEST SENSE in the 'else' block, even for ATA devices. This is problematic.

Either copying more of the CCB header (at least func_code) or changing the 'if' condition to test the held CCB (e.g. 'if (ch->hold[i]->ccb_h.func_code == XPT_ATA_IO) { ...') would solve this issue.

I tried adding an xpt_merge_ccb() call after xpt_setup_ccb() (a kernel booted with this change seems to work fine), but I'm not sure this is the right fix.