understanding CAM errors

From: mike tancsa <mike_at_sentex.net>
Date: Wed, 13 Mar 2024 14:43:53 UTC
I am stress testing a new file server running RELENG_14, and a bunch of
its WD SSDs are throwing odd errors under load. Any idea what these
might be? smartctl -t long finishes without error, and the only
counters incrementing are:


SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0004  2            7  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0009  2            0  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            8  Device-to-host register FISes sent due to a COMRESET
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0013  2            0  R_ERR response for host-to-device non-data FIS, non-CRC

Do these imply a problem on the connection to the backplane or
controller? Or an SSD firmware bug? I don't seem to get them on the
Samsung SSDs, just on this new model of WD SSD :(
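
To watch which of these counters move between test runs, the table can be filtered for non-zero values. This is a rough Python sketch, not anything smartctl provides: the nonzero_counters name and the regex are made up for illustration, and the sample text is the table from above with the wrapped last row rejoined.

```python
import re

# Phy event counter rows copied from the smartctl output above.
PHY_LOG = """\
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0004  2            7  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0009  2            0  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            8  Device-to-host register FISes sent due to a COMRESET
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0013  2            0  R_ERR response for host-to-device non-data FIS, non-CRC
"""

# ID (hex), size, value, description; the header row does not match.
ROW_RE = re.compile(r"^(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)\s+(.+)$")


def nonzero_counters(text):
    """Return (id, value, description) for every counter with value > 0."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match and int(match.group(3)) != 0:
            rows.append((match.group(1), int(match.group(3)), match.group(4)))
    return rows


if __name__ == "__main__":
    for counter_id, value, description in nonzero_counters(PHY_LOG):
        print(counter_id, value, description)
```

On the table above this leaves only 0x0004 and 0x000a.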

Device Model:     WD Blue SA510 2.5 1000GB
Serial Number:    240406800922
LU WWN Device Id: 5 001b44 8b334a313
Firmware Version: 52046100
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available, deterministic
Device is:        Not in smartctl database 7.3/5528
ATA Version is:   ACS-4, ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Mar 13 10:42:39 2024 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 5e dc b8 c0 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 34 ed 6f 28 00 00 d0 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 50 1a 59 50 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 50 1a 59 40 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 70 cb 3e 78 00 01 00 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 50 1a 59 48 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 70 cb 3d 78 00 01 00 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 34 ed 6e 28 00 01 00 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 00 f7 09 68 00 00 48 00
(da8:mpr0:0:18:0): CAM status: SCSI Status Error
(da8:mpr0:0:18:0): SCSI status: Check Condition
(da8:mpr0:0:18:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da8:mpr0:0:18:0): Retrying command (per sense data)
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 1168 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 1468 loginfo 31110f00
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 0b 6b fa 80 00 01 00 00
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 716 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 877 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 300 loginfo 31110f00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 3 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 0b 6b fb 80 00 01 00 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 3 more tries remain
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 30 01 34 50 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 3 more tries remain
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 2033 loginfo 31110f00
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 30 01 34 48 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 3 more tries remain
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 30 01 34 30 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 3 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 0b 6b fa 80 00 01 00 00
(da8:mpr0:0:18:0): CAM status: SCSI Status Error
(da8:mpr0:0:18:0): SCSI status: Check Condition
(da8:mpr0:0:18:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da8:mpr0:0:18:0): Retrying command (per sense data)
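
For reading the CDBs in the log above: in a READ(10) or WRITE(10) CDB, byte 0 is the opcode (0x28 or 0x2a), bytes 2-5 hold the starting LBA and bytes 7-8 the transfer length in blocks, both big-endian. A minimal Python sketch of that decoding follows. decode_rw10 is a hypothetical helper name, and the byte count assumes the 512-byte logical sector size smartctl reports for this drive.

```python
def decode_rw10(cdb_hex, block_size=512):
    """Decode a READ(10)/WRITE(10) CDB given as a hex string from the log.

    block_size defaults to 512, an assumption taken from the smartctl
    "Sector Size" line; pass another value for drives that differ.
    """
    cdb = bytes.fromhex(cdb_hex)  # whitespace between bytes is ignored
    if len(cdb) != 10 or cdb[0] not in (0x28, 0x2A):
        raise ValueError("not a READ(10)/WRITE(10) CDB")
    blocks = int.from_bytes(cdb[7:9], "big")  # transfer length in blocks
    return {
        "op": "READ(10)" if cdb[0] == 0x28 else "WRITE(10)",
        "lba": int.from_bytes(cdb[2:6], "big"),  # starting logical block
        "blocks": blocks,
        "bytes": blocks * block_size,
    }


if __name__ == "__main__":
    # First failing command in the log above.
    print(decode_rw10("2a 00 5e dc b8 c0 00 00 08 00"))
```

For the first failing command this gives a WRITE(10) of 8 blocks (4096 bytes) starting at LBA 0x5edcb8c0. The commands with length bytes 01 00 are 256-block (128 KiB) transfers.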


The controller is:

mpr0 Adapter:
        Board Name: INSPUR 3008IT
    Board Assembly: INSPUR
         Chip Name: LSISAS3008
     Chip Revision: ALL
     BIOS Revision: 18.00.00.00
 Firmware Revision: 16.00.12.00
   Integrated RAID: no
          SATA NCQ: ENABLED
  PCIe Width/Speed: x8 (8.0 GB/sec)
         IOC Speed: Full
       Temperature: 41 C
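
The loginfo value 31110f00 that mpr0 prints in the log above can be split into fields. The sketch below assumes the layout the LSI/Broadcom MPT drivers appear to use when unpacking it (top 4 bits bus type, next 4 bits originator, next 8 bits code, low 16 bits subcode). That layout and the originator names are assumptions here, not something the log itself confirms, and split_loginfo is a hypothetical helper name. It does not try to name what the code or subcode mean.

```python
# Originator names as assumed from the MPT driver sources; treat as a guess.
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def split_loginfo(value):
    """Split a 32-bit loginfo word into its assumed bit fields."""
    return {
        "bus_type": (value >> 28) & 0xF,    # bits 31-28
        "originator": (value >> 24) & 0xF,  # bits 27-24
        "code": (value >> 16) & 0xFF,       # bits 23-16
        "subcode": value & 0xFFFF,          # bits 15-0
    }


if __name__ == "__main__":
    fields = split_loginfo(0x31110F00)
    name = ORIGINATORS.get(fields["originator"], "unknown")
    print(fields, "originator:", name)
```

Under that assumed layout, 31110f00 splits into bus type 3, originator 1, code 0x11 and subcode 0x0f00.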