Date: Wed, 13 Mar 2024 12:07:48 -0500
To: FreeBSD virtualization
From: Matthew Grooms
Subject: TRIM visibility bugs [patches]

Hey All,

While toying with different storage options to use with bhyve, I've run
across a few frustrations related to visibility into TRIM operations.
Specifically, it's not always easy to tell whether TRIM operations are
being processed by drives, because the SCSI da driver does not update
its counters for every delete method. The counters exported as sysctl
values are updated by the SCSI UNMAP method, but not by the TRIM or WS
methods. You can see the BIO_DELETE operations being processed in real
time by running the gstat -d command, but the sysctl counters never
change from 0. This gives the false impression that nothing is
happening.
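To make the gap concrete, here is a minimal userland sketch of the behavior described above. It is only a model, not driver code: the counter names are borrowed from the da softc fields that appear in the patch, while `enum delete_method`, `struct trim_stats` and `issue_delete()` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three delete methods named in the text. */
enum delete_method { DELETE_UNMAP, DELETE_TRIM, DELETE_WS };

/* Counters named after the softc fields exported through sysctl. */
struct trim_stats {
    uint64_t trim_count;   /* delete commands issued */
    uint64_t trim_ranges;  /* ranges carried by those commands */
    uint64_t trim_lbas;    /* blocks covered by those commands */
};

/*
 * Model of the unpatched accounting: every method performs the delete,
 * but only the UNMAP path records it.
 */
static void
issue_delete(struct trim_stats *st, enum delete_method method,
    uint64_t ranges, uint64_t lbas)
{
    assert(st != NULL);

    /* TRIM and WS: the request is processed, nothing is counted. */
    if (method != DELETE_UNMAP)
        return;

    st->trim_count++;
    st->trim_ranges += ranges;
    st->trim_lbas += lbas;
}
```

In this model, calling `issue_delete()` with `DELETE_TRIM` or `DELETE_WS` leaves all three counters at 0 no matter how many requests are issued, which matches what the sysctl values show while gstat -d reports activity.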
With the following patch, I'm now able to see the counters reflect trim
operations that are processed by da devices ...

mgrooms@mgrooms:~/trim $ cat scsi_da.diff
--- scsi_da.c.orig    2024-03-13 11:32:32.098922000 -0500
+++ scsi_da.c    2024-03-13 11:31:37.255187000 -0500
@@ -4197,6 +4197,9 @@
                da_default_timeout * 1000);
     ccb->ccb_h.ccb_state = DA_CCB_DELETE;
     ccb->ccb_h.flags |= CAM_UNLOCKED;
+    softc->trim_count++;
+    softc->trim_ranges += ranges;
+    softc->trim_lbas += block_count;
     cam_iosched_submit_trim(softc->cam_iosched);
 }

@@ -4257,6 +4260,8 @@
             da_default_timeout * 1000);
     ccb->ccb_h.ccb_state = DA_CCB_DELETE;
     ccb->ccb_h.flags |= CAM_UNLOCKED;
+    softc->trim_count++;
+    softc->trim_lbas += count;
     cam_iosched_submit_trim(softc->cam_iosched);
 }

Additionally, while attempting to test geom mirror+stripe to provide
software RAID10, I found that the diskinfo utility reports that a mirror
supports UNMAP/TRIM when at least one underlying device supports it, but
never reports this for a stripe. I added a small patch that attempts to
use the same logic as mirror, so that a stripe will report that
UNMAP/TRIM is supported when one of the underlying devices does ...

--- g_stripe.c.orig    2024-03-12 18:23:52.960025000 -0500
+++ g_stripe.c    2024-03-12 18:25:01.009378000 -0500
@@ -26,6 +26,7 @@
  * SUCH DAMAGE.
  */

+#include
 #include
 #include
 #include
@@ -568,7 +569,7 @@
     off_t offset, start, length, nstripe, stripesize;
     struct g_stripe_softc *sc;
     u_int no;
-    int error, fast = 0;
+    int error, fast = 0, val = 0;

     sc = bp->bio_to->geom->softc;
     /*
@@ -591,6 +592,12 @@
         g_stripe_pushdown(sc, bp);
         return;
     case BIO_GETATTR:
+        if (!strcmp(bp->bio_attribute, "GEOM::candelete")) {
+            if (sc->sc_flags & G_STRIPE_FLAG_CANDELETE)
+                val = 1;
+            g_handleattr(bp, "GEOM::candelete", &val, sizeof(val));
+            return;
+        }
         /* To which provider it should be delivered? */
     default:
         g_io_deliver(bp, EOPNOTSUPP);
@@ -745,7 +752,7 @@
 {
     struct g_consumer *cp, *fcp;
     struct g_geom *gp;
-    int error;
+    int error, i;

     g_topology_assert();
     /* Metadata corrupted? */
@@ -792,8 +799,19 @@
             goto fail;
         }
     }
-
     sc->sc_disks[no] = cp;
+
+    /* cascade candelete */
+    error = g_access(cp, 1, 0, 0);
+    if (error == 0)
+    {
+        error = g_getattr("GEOM::candelete", cp, &i);
+        if (error == 0 && i != 0)
+            sc->sc_flags |= G_STRIPE_FLAG_CANDELETE;
+        G_STRIPE_DEBUG(1, "Provider %s candelete %i.", pp->name, i);
+        g_access(cp, -1, 0, 0);
+    }
+
     G_STRIPE_DEBUG(0, "Disk %s attached to %s.", pp->name, sc->sc_name);
     g_stripe_check_and_run(sc);

--- g_stripe.h.orig    2024-03-12 18:24:00.960741000 -0500
+++ g_stripe.h    2024-03-12 12:25:22.842925000 -0500
@@ -47,6 +47,8 @@
 #define    G_STRIPE_TYPE_MANUAL    0
 #define    G_STRIPE_TYPE_AUTOMATIC    1

+#define    G_STRIPE_FLAG_CANDELETE        0x00000001UL
+
 #define    G_STRIPE_DEBUG(lvl, ...) \
     _GEOM_DEBUG("GEOM_STRIPE", g_stripe_debug, (lvl), NULL, __VA_ARGS__)
 #define    G_STRIPE_LOGREQ(bp, ...) \
@@ -61,6 +63,7 @@
     uint16_t     sc_ndisks;
     off_t         sc_stripesize;
     uint32_t     sc_stripebits;
+    uint32_t     sc_flags;
     struct mtx     sc_lock;
 };
 #define    sc_name    sc_geom->name

Fair warning: I'm not a CAM or GEOM developer, so these should be
reviewed by someone before they are applied anywhere that counts. In any
case, I wanted to share this here, as I've seen some internet posts from
other folks setting up virtual storage who ran into similar problems.
I've also opened a bug report so that hopefully these visibility issues
get fixed ...

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277673

Thanks,

-Matthew
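
For readers who want the g_stripe change in isolation, the reporting logic it adds can be sketched outside the kernel roughly as follows. This is a simplified model under stated assumptions, not GEOM code: `struct stripe_model`, `STRIPE_FLAG_CANDELETE`, `stripe_add_member()` and `stripe_candelete()` are hypothetical names that stand in for the softc flag, the per-disk attach step and the GEOM::candelete query handled in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag, playing the role of G_STRIPE_FLAG_CANDELETE. */
#define STRIPE_FLAG_CANDELETE 0x00000001U

/* Hypothetical reduced softc: only the flags word matters here. */
struct stripe_model {
    uint32_t flags;
};

/*
 * Called once per attached member. The flag is sticky: a single member
 * that supports delete is enough to set it, and later members that do
 * not support delete never clear it.
 */
static void
stripe_add_member(struct stripe_model *sc, int member_candelete)
{
    assert(sc != NULL);
    if (member_candelete != 0)
        sc->flags |= STRIPE_FLAG_CANDELETE;
}

/* Answer a candelete-style query: 1 if any member supports delete. */
static int
stripe_candelete(const struct stripe_model *sc)
{
    assert(sc != NULL);
    return ((sc->flags & STRIPE_FLAG_CANDELETE) != 0);
}
```

With this model, a stripe built from one member that supports delete and one that does not reports 1, which is the "at least one underlying device" rule the email attributes to mirror.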