From: mike tancsa
To: FreeBSD Questions
Subject: understanding CAM errors
Date: Wed, 13 Mar 2024 10:43:53 -0400
Message-ID: <89796e74-439d-4267-a13d-be4bdfac937f@sentex.net>
List-Archive: https://lists.freebsd.org/archives/freebsd-questions

On a RELENG_14 box I am stress testing a new file server and have a bunch of WD SSDs which are throwing odd errors under load. Any idea what these might be? smartctl -t long finishes without error, and the only counters incrementing are in the SATA Phy Event Counters log:

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0004  2            7  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0009  2            0  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            8  Device-to-host register FISes sent due to a COMRESET
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0013  2            0  R_ERR response for host-to-device non-data FIS, non-CRC

Do these imply a problem on the connection to the backplane or controller? Or an SSD firmware bug?
I don't seem to have them on the Samsung SSDs, just this new model of WD SSD :(

Device Model:     WD Blue SA510 2.5 1000GB
Serial Number:    240406800922
LU WWN Device Id: 5 001b44 8b334a313
Firmware Version: 52046100
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available, deterministic
Device is:        Not in smartctl database 7.3/5528
ATA Version is:   ACS-4, ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Mar 13 10:42:39 2024 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 5e dc b8 c0 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 34 ed 6f 28 00 00 d0 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 50 1a 59 50 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 50 1a 59 40 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 70 cb 3e 78 00 01 00 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 50 1a 59 48 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 70 cb 3d 78 00 01 00 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 34 ed 6e 28 00 01 00 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 2 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 00 f7 09 68 00 00 48 00
(da8:mpr0:0:18:0): CAM status: SCSI Status Error
(da8:mpr0:0:18:0): SCSI status: Check Condition
(da8:mpr0:0:18:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da8:mpr0:0:18:0): Retrying command (per sense data)
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 1168 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 1468 loginfo 31110f00
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 0b 6b fa 80 00 01 00 00
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 716 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 877 loginfo 31110f00
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 300 loginfo 31110f00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 3 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 0b 6b fb 80 00 01 00 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 3 more tries remain
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 30 01 34 50 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 3 more tries remain
mpr0: Controller reported scsi ioc terminated tgt 18 SMID 2033 loginfo 31110f00
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 30 01 34 48 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 3 more tries remain
(da8:mpr0:0:18:0): READ(10). CDB: 28 00 30 01 34 30 00 00 08 00
(da8:mpr0:0:18:0): CAM status: CCB request completed with an error
(da8:mpr0:0:18:0): Retrying command, 3 more tries remain
(da8:mpr0:0:18:0): WRITE(10). CDB: 2a 00 0b 6b fa 80 00 01 00 00
(da8:mpr0:0:18:0): CAM status: SCSI Status Error
(da8:mpr0:0:18:0): SCSI status: Check Condition
(da8:mpr0:0:18:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da8:mpr0:0:18:0): Retrying command (per sense data)

Controller is

mpr0 Adapter:
       Board Name: INSPUR 3008IT
   Board Assembly: INSPUR
        Chip Name: LSISAS3008
    Chip Revision: ALL
    BIOS Revision: 18.00.00.00
Firmware Revision: 16.00.12.00
  Integrated RAID: no
         SATA NCQ: ENABLED
 PCIe Width/Speed: x8 (8.0 GB/sec)
        IOC Speed: Full
      Temperature: 41 C
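
For reading the CAM log above: each READ(10)/WRITE(10) CDB encodes the starting LBA and the transfer length, so the failing commands can be decoded to check whether they cluster at particular offsets or sizes. Below is a minimal sketch in Python, assuming the standard 10-byte CDB layout (opcode, flags, 4-byte big-endian LBA, group number, 2-byte big-endian transfer length, control) and the 512-byte sector size reported by smartctl. The function name decode_cdb10 is hypothetical and not part of any FreeBSD tool.

```python
import struct

# Only the two opcodes that appear in the log are named here.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_cdb10(cdb_hex, sector_size=512):
    """Decode a 10-byte SCSI CDB given as space-separated hex bytes.

    sector_size defaults to 512, matching the logical sector size in the
    smartctl output; it is an assumption for any other device.
    """
    raw = bytes.fromhex(cdb_hex)
    if len(raw) != 10:
        raise ValueError("expected a 10-byte CDB, got %d bytes" % len(raw))
    # Big-endian fields: opcode, flags, LBA, group number, length, control.
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", raw)
    return {
        "op": OPCODES.get(opcode, "opcode 0x%02x" % opcode),
        "lba": lba,
        "blocks": blocks,
        "bytes": blocks * sector_size,
    }


if __name__ == "__main__":
    # A few CDBs copied from the log above.
    for cdb in (
        "2a 00 5e dc b8 c0 00 00 08 00",
        "28 00 50 1a 59 50 00 00 08 00",
        "2a 00 70 cb 3e 78 00 01 00 00",
    ):
        info = decode_cdb10(cdb)
        print("%-9s lba=%d blocks=%d (%d bytes)"
              % (info["op"], info["lba"], info["blocks"], info["bytes"]))
```

Decoded this way, the first WRITE(10) in the log is an 8-block (4 KiB) write at LBA 0x5edcb8c0, and the WRITE(10) entries with a length field of 01 00 are 256-block (128 KiB) writes. The log shows both small and large transfers, reads and writes, among the failing commands.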