Date: Sat, 17 Jul 2021 14:36:22 +0100
From: doa379
To: Graham Perrin
Cc: Current FreeBSD
Subject: Re: nvme(4) losing control, and subsequent use of fsck_ffs(8) with UFS
References: <994d22b5-c8b7-1183-8198-47b8251e896e@gmail.com>
In-Reply-To: <994d22b5-c8b7-1183-8198-47b8251e896e@gmail.com>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

> When the file system is stress-tested, it seems that the device (an internal
> drive) is lost.
>
> A recent photograph:
>
> Transcribed manually:
>
> nvme0: Resetting controller due to a timeout.
> nvme0: resetting controller
> nvme0: controller ready did not become 0 within 5500 ms
> nvme0: failing outstanding i/o
> nvme0: WRITE sqid:2 cid:115 nsid:1 lba:296178856 len:64
> nvme0: ABORTED - BY REQUEST (00/07) sqid:2 cid:115 cdw0:0
> g_vfs_done():nvd0p2[WRITE(offset=151370924032, length=32768)]error = 6
> UFS: forcibly unmounting /dev/nvd0p2 from /
> nvme0: failing outstanding i/o
>
> … et cetera.
>
> Is this a sure sign of a hardware problem? Or must I do something special to
> gain reliability under stress?
>
> I don't know how to interpret parts of the manual page for nvme(4). There's
> direction to include this line in loader.conf(5):
>
> nvme_load="YES"
>
> – however when I used kldload(8), it seemed that the module was already
> loaded, or in kernel.
>
> Using StressDisk:
>
> – failures typically occur after around six minutes of testing.
>
> The drive is very new, less than 2 TB written:
>
> I do suspect a hardware problem, because two prior installations of Windows
> 10 became non-bootable.
>
> Also: I find peculiarities with use of fsck_ffs(8), which I can describe
> later. Maybe to be expected, if there's a problem with the drive.

I have a similar issue with a system that runs off a USB drive. The file
system is UFS. The system does minimal disk I/O, yet it fails without warning
at repeated intervals. The disk controller gets disconnected, taking the whole
system offline. I'm sure the drive itself is not perfect, but I'd have
expected the file system to account for that.
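
For what it's worth, the nvme0 WRITE line and the g_vfs_done() line in the
quoted log can be cross-checked against each other. The sketch below is only
that, a sketch: it assumes 512-byte logical blocks (the log does not state the
LBA size) and that both lines describe the same failed write. The function
names are made up for illustration.

```python
# Cross-check two lines of the quoted log.
# ASSUMPTION: 512-byte logical blocks; the log does not state the LBA size.
SECTOR_SIZE = 512

# From "nvme0: WRITE sqid:2 cid:115 nsid:1 lba:296178856 len:64"
nvme_lba = 296178856
nvme_len_blocks = 64

# From "g_vfs_done():nvd0p2[WRITE(offset=151370924032, length=32768)]error = 6"
vfs_offset = 151370924032
vfs_length = 32768


def blocks_to_bytes(blocks, sector_size=SECTOR_SIZE):
    """Convert a block count to bytes for the assumed sector size."""
    return blocks * sector_size


def implied_partition_start(lba, offset, sector_size=SECTOR_SIZE):
    """First sector of the partition, if lba (whole device) and offset
    (within the partition) describe the same write. Returns None when the
    offset is not sector-aligned, in which case the assumption is wrong."""
    if offset % sector_size != 0:
        return None
    return lba - offset // sector_size


same_size = blocks_to_bytes(nvme_len_blocks) == vfs_length
start_sector = implied_partition_start(nvme_lba, vfs_offset)

print("transfer sizes agree:", same_size)
print("implied first sector of nvd0p2:", start_sector)
```

If the implied first sector matches what gpart show reports for nvd0p2, the
two lines refer to the same request, i.e. the write that UFS saw fail with
error 6 is the one the controller aborted during the reset.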