Message-ID: <709521ba-5719-5f80-10bf-1de05d99d5c1@sentex.net>
Date: Thu, 13 Jul 2023 11:33:08 -0400
List-Id: Dedicated and Embedded Systems
List-Archive: https://lists.freebsd.org/archives/freebsd-embedded
To: freebsd-embedded
From: mike tancsa
Subject: SD card corruption

TL;DR: We get batches of cards that suddenly fail, out of the blue, with card-wide file corruption.

A little background. We have APUs (PCEngines) in the field that work REALLY well for reliability. However, the odd time that things go south, it's due to SD cards. I had a couple of devices fail last week after about a year, and when I got them back both had hundreds of fsck errors. These are devices that stay mounted read-only, so there are no writes to them. Even the second partition of the nanobsd image, which was never mounted, had many fsck errors.

Normally we use SanDisk but had to switch to some PNY due to supply chain issues. The PNY seem to be more failure prone than the SanDisk, but we do get the odd SanDisk with the same pathology too.

Once I get a bad SD card back, I can newfs it and all is fine. E.g. I can fill the disk with 16GB of /dev/urandom files and the hashes all match over time.
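A fill-and-verify test of this kind can be sketched roughly as below. This is only a minimal illustration, not necessarily the procedure actually used: the function names `fill_and_hash` and `verify`, the file naming, and the choice of SHA-256 are all assumptions made for the example, and `os.urandom` stands in for reading /dev/urandom directly.

```python
import hashlib
import os


def fill_and_hash(directory, n_files, file_size, chunk=1 << 20):
    """Write n_files of random data into directory; return {path: sha256 hex}.

    Hypothetical helper: names and layout are illustrative only.
    """
    hashes = {}
    for i in range(n_files):
        path = os.path.join(directory, f"rand_{i:05d}.bin")
        digest = hashlib.sha256()
        remaining = file_size
        with open(path, "wb") as f:
            while remaining > 0:
                block = os.urandom(min(chunk, remaining))
                f.write(block)
                digest.update(block)
                remaining -= len(block)
            # Push the data out of the OS buffers and onto the card.
            f.flush()
            os.fsync(f.fileno())
        hashes[path] = digest.hexdigest()
    return hashes


def verify(hashes, chunk=1 << 20):
    """Re-read every file and return the paths whose hash no longer matches."""
    bad = []
    for path, expected in hashes.items():
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk), b""):
                digest.update(block)
        if digest.hexdigest() != expected:
            bad.append(path)
    return bad
```

Note that a `verify` pass run immediately after writing may be served from the OS cache rather than the card, so the check is only meaningful after a remount or reboot, and repeated over time.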
Is it just bad hardware / bad luck that is causing these seemingly catastrophic failures, or are there things that should be done in the field to extend the life of SD cards? Is there any way to predict these failures in advance? If I newfs -E the unused partition (does the -E make a difference?), then re-write it with the live image and boot to the new partition, does that buy me any longevity?

    ---Mike