From nobody Fri Dec 27 22:03:45 2024 X-Original-To: stable@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4YKff72Pk2z5jQRc for ; Fri, 27 Dec 2024 22:03:59 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "WR4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4YKff63hXNz4lk3 for ; Fri, 27 Dec 2024 22:03:58 +0000 (UTC) (envelope-from wlosh@bsdimp.com) Authentication-Results: mx1.freebsd.org; dkim=pass header.d=bsdimp-com.20230601.gappssmtp.com header.s=20230601 header.b=QeKFRwKV; spf=none (mx1.freebsd.org: domain of wlosh@bsdimp.com has no SPF policy when checking 2607:f8b0:4864:20::630) smtp.mailfrom=wlosh@bsdimp.com; dmarc=none Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-216426b0865so91952945ad.0 for ; Fri, 27 Dec 2024 14:03:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsdimp-com.20230601.gappssmtp.com; s=20230601; t=1735337037; x=1735941837; darn=freebsd.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=PI7l7Usi1gcnMx750h/lO6QUHVuIe/JphorWm6Ih06I=; b=QeKFRwKVknDk0YC8YjdzCWPOhjQeUts1M2DuIqAWRTOWQqWi+qtlbetAaCsJDInYbx b7ZfEEJIpCJZlNscJITnc9aiH/ElrKFUqQSJowJ4OZaRroas5+LraNTK8XLIhqusq6FE Y+U26Yd8sba668n8ywBlj78UguHkAmUKL05BaaU4cl53Gjedd+UNfod5bg+URpyulOhh vbr5TA8+NSH2kcbgwwtMxljbi3xpQEAybhCkOsgOXdRVjlremLHQACWHM9PYa9cFN+vO vMSybJKTWkkQEILESM18Z3XPPc2zinHleraobPGS1SFjeWdPZHh6NatLF1ahgRxGBjBm Dt+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1735337037; x=1735941837; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=PI7l7Usi1gcnMx750h/lO6QUHVuIe/JphorWm6Ih06I=; b=UWcQav/uo0hFyV0zgvbi55wQjIMxb2KJUs/LakwX/TqkEyZEYs3bKCz6wK79ULZ8g7 0N67FByYXcNJdqvcrRmieM4+tRRN7DXE8ccihim4Yo5YWzIWJ8mSf7wGWy4+2unGxLkz KtneJH/0mXCKHLKA321q+UMpCxfpuwJnxm5EcJMCKvW8UlauRxpVNCuzf1ou+sBddkKz gppfNYnq9Co79ckfxwP7isYOiwsj46sall/NwNhzRmezFg/xTiSYS6S3n4YZ8/fa/w+m o9pjsmUt9PLpGlKZjTHYuiJdXieRNdCHRTgzqPKuvOiWd+1qG9Z6N+HvMmE7y4d2vlU+ VXRw== X-Forwarded-Encrypted: i=1; AJvYcCXtu9IP9RiutR492MCN/UncDqYsHCoEdyVIYc75DgGTPjeZyDvz5hKcwQPhKqjirTDcswB8qWQ=@freebsd.org X-Gm-Message-State: AOJu0YzkFzTG5OG2F8PYtn9nY1MeOxmOJCQiF8Z5TW/UJ616S5XSx2di qfFnvHqvF/+bHx55IgY2nciOcchBJQseKV8JGqbPLsHN6nON9j7rEbVBSPS36T1OD0SBekb5oU8 i9NckuEcxGU+2eM6gQZObebjskN76sfWLhpdw3g== X-Gm-Gg: ASbGncteg4M4ttF/rGSTPfx9YpTDf6jCUD0rxRQ20qqoB2Kx49UfLIWXxVWnjAhxoRX 446xwr1q5f7XmozOTuOuOUEA0PXfRtoZzUcw6ng== X-Google-Smtp-Source: AGHT+IHppcG/vFVfFiOHQD5z+3UnaF1XkRTmzASQMNksKD0RAriUiXkb1NxyFIStC2R3akLcMEKNDubpm9nMOE3XCQ0= X-Received: by 2002:a17:903:25d2:b0:215:ba2b:cd51 with SMTP id d9443c01a7336-219e6e9fdd8mr322711455ad.15.1735337036979; Fri, 27 Dec 2024 14:03:56 -0800 (PST) List-Id: Production branch of FreeBSD source code List-Archive: https://lists.freebsd.org/archives/freebsd-stable List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: freebsd-stable@freebsd.org Sender: owner-freebsd-stable@FreeBSD.org MIME-Version: 1.0 References: <0DDE1B66-B794-472D-A901-54FA2FF1E853@FreeBSD.org> In-Reply-To: <0DDE1B66-B794-472D-A901-54FA2FF1E853@FreeBSD.org> From: Warner Losh Date: Fri, 27 Dec 2024 15:03:45 -0700 Message-ID: Subject: Re: MFC fixes for uninitialized kernel stack variables in sys/cam or do direct fix for pvscsi driver To: Zhenlei Huang Cc: Warner Losh , stable@freebsd.org, Edward Tomasz Napierala Content-Type: multipart/alternative; boundary="000000000000a7bbda062a47a309" X-Spamd-Result: default: False [-3.00 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-1.00)[-0.999]; FORGED_SENDER(0.30)[imp@bsdimp.com,wlosh@bsdimp.com]; R_DKIM_ALLOW(-0.20)[bsdimp-com.20230601.gappssmtp.com:s=20230601]; MIME_GOOD(-0.10)[multipart/alternative,text/plain]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US]; RCVD_COUNT_ONE(0.00)[1]; MIME_TRACE(0.00)[0:+,1:+,2:~]; ARC_NA(0.00)[]; TO_DN_SOME(0.00)[]; MISSING_XM_UA(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[2607:f8b0:4864:20::630:from]; DMARC_NA(0.00)[bsdimp.com]; MLMMJ_DEST(0.00)[stable@freebsd.org]; FROM_NEQ_ENVFROM(0.00)[imp@bsdimp.com,wlosh@bsdimp.com]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; R_SPF_NA(0.00)[no SPF record]; PREVIOUSLY_DELIVERED(0.00)[stable@freebsd.org]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_TLS_LAST(0.00)[]; DKIM_TRACE(0.00)[bsdimp-com.20230601.gappssmtp.com:+] X-Rspamd-Queue-Id: 4YKff63hXNz4lk3 X-Spamd-Bar: -- --000000000000a7bbda062a47a309 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Mon, Dec 2, 2024 at 2:41=E2=80=AFAM Zhenlei Huang wro= te: > Hi Warner, > > Recently I upgraded some ESXi vms from 13.3 to 13.4 and noticed weird > report for sas speed. > The boot console has the following, > > ``` > da0 at pvscsi0 bus 0 scbus2 target 0 lun 0 > da0: Fixed Direct Access SPC-4 SCSI device > da0: 4294967.295MB/s transfers > ``` > But camcontrol report the correct value, > ``` > # camcontrol inquiry da0 -R > pass1: 750.000MB/s transfers, Command Queueing Enabled > ``` > > The `4294967.295MB` is actually 0xffff_ffff or -1 but I do not see any > logic set those values. > > Finally I managed to get the stack trace, > ``` > _scsi_announce_periph > scsi_announce_periph_sbuf > xpt_announce_periph_sbuf > dadone_proberc > xpt_done_process > xpt_done_td > fork_exit > fork_trampoline > ``` > > and noticed that the last param `cts` of `_scsi_announce_periph(struct > cam_periph *periph, u_int *speed, u_int *freq, struct ccb_trans_settings > *cts)` > is from kernel stack and is not properly initialized, latter I found some > commits related to this, > > 076686fe0703 cam: make sure to clear CCBs allocated on the stack > ec5325dbca62 cam: make sure to clear even more CCBs allocated on the stac= k > 0f206cc91279 cam: add missing zeroing of a stack-allocated CCB. > 616a676a0535 cam: clear stack-allocated CCB in the target layer > > I applied them to stable/13, rebuild and reboot, now the speed of da0 is > reported correctly. I also tried to patch the pvscsi driver with few line= s > and > it also works as intended. > > ``` > --- a/sys/dev/vmware/pvscsi/pvscsi.c > +++ b/sys/dev/vmware/pvscsi/pvscsi.c > @@ -1444,6 +1444,10 @@ pvscsi_action(struct cam_sim *sim, union ccb *ccb) > cts->proto_specific.scsi.flags =3D CTS_SCSI_FLAGS_TAG_ENB= ; > cts->proto_specific.scsi.valid =3D CTS_SCSI_VALID_TQ; > > + /* Prefer connection speed over sas port speed */ > + /* cts->xport_specific.sas.bitrate =3D 0; */ > + cts->xport_specific.sas.valid =3D 0; > + > ccb_h->status =3D CAM_REQ_CMP; > xpt_done(ccb); > ``` > > Things come clear and I know why this weird speed happens, now it is time > to decide how to fix it. > > Fixing the consumer of cam, aka pvscsi driver, is quite simple and > promising. I did a quick search it appears other consumers set > `cts->xport_specific.sas.valid` correctly. It does not convince me as I'm > quite new to cam subsystem. > Yes. sas.valid is set when the sas.bitrate and other data has been set correctly. Setting it to 0 like this ensures it's ignored. So if you know the speed, set sas.bitrate to that speed and sas.valid to 1. I'm not sure I answered the question right, but if not, please clarify or point out what I missed and I'll try again. > Which one do you prefer, MFC commits to stable/13, or do direct fix for > pvscsi driver to stable/13 ? > [[ Sorry for the delay, I missed this all month ]] I generally prefer a MFC, unless the code is no longer in -current. Even if there's two different fixes for this logical bug, fixing it in current, then MFCing that (with the current hash) is fine, even if the stable/13 changes are completely different. For stable/13 I guess it matters a bit less than stable/14 since I'll be merging to it less, but if it's a commit from -current that doesn't need to be made to -stable because of the new commit on stable, I tend to include the MFC hash text. Warner --000000000000a7bbda062a47a309 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Mon, Dec 2, = 2024 at 2:41=E2=80=AFAM Zhenlei Huang <zlei@freebsd.org> wrote:
Hi Warner,

Recently I upgraded some ESXi vms from 13.3 to 13.4 and noticed weird repor= t for sas speed.
The boot console has the following,

```
da0 at pvscsi0 bus 0 scbus2 target 0 lun 0
da0: <VMware Virtual disk 2.0> Fixed Direct Access SPC-4 SCSI device<= br> da0: 4294967.295MB/s transfers
```
But camcontrol report the correct value,
```
# camcontrol inquiry da0 -R
pass1: 750.000MB/s transfers, Command Queueing Enabled
```

The `4294967.295MB` is actually 0xffff_ffff or -1 but I do not see any logi= c set those values.

Finally I managed to get the stack trace,
```
_scsi_announce_periph
scsi_announce_periph_sbuf
xpt_announce_periph_sbuf
dadone_proberc
xpt_done_process
xpt_done_td
fork_exit
fork_trampoline
```

and noticed that the last param `cts` of `_scsi_announce_periph(struct cam_= periph *periph, u_int *speed, u_int *freq, struct ccb_trans_settings *cts)`=
is from kernel stack and is not properly initialized, latter I found some c= ommits related to this,

076686fe0703 cam: make sure to clear CCBs allocated on the stack
ec5325dbca62 cam: make sure to clear even more CCBs allocated on the stack<= br> 0f206cc91279 cam: add missing zeroing of a stack-allocated CCB.
616a676a0535 cam: clear stack-allocated CCB in the target layer

I applied them to stable/13, rebuild and reboot, now the speed of da0 is re= ported correctly. I also tried to patch the pvscsi driver with few lines an= d
it also works as intended.

```
--- a/sys/dev/vmware/pvscsi/pvscsi.c
+++ b/sys/dev/vmware/pvscsi/pvscsi.c
@@ -1444,6 +1444,10 @@ pvscsi_action(struct cam_sim *sim, union ccb *ccb) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 cts->proto_speci= fic.scsi.flags =3D CTS_SCSI_FLAGS_TAG_ENB;
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 cts->proto_speci= fic.scsi.valid =3D CTS_SCSI_VALID_TQ;

+=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0/* Prefer connectio= n speed over sas port speed */
+=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0/* cts->xport_sp= ecific.sas.bitrate =3D 0; */
+=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0cts->xport_speci= fic.sas.valid =3D 0;
+
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ccb_h->status = =3D CAM_REQ_CMP;
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 xpt_done(ccb);
```

Things come clear and I know why this weird speed happens, now it is time t= o decide how to fix it.

Fixing the consumer of cam, aka pvscsi driver, is quite simple and promisin= g. I did a quick search it appears other consumers set `cts->xport_speci= fic.sas.valid` correctly. It does not convince me as I'm quite new to c= am subsystem.

Yes. sas.valid is set whe= n the sas.bitrate and other data has been set correctly. Setting it to 0 li= ke this ensures it's ignored.=C2=A0 So if you know the speed, set sas.b= itrate to that speed and sas.valid to 1.

I'm n= ot sure I answered the question right, but if not, please clarify or point = out what I missed and I'll try again.
=C2=A0
Which one do you prefer, MFC commits to stable/13, or do direct fix for pvs= csi driver to stable/13 ?

[[ Sorry for = the delay, I missed this all month ]]

I generally = prefer a MFC, unless the code is no longer in -current. Even if there's= two different fixes for this logical bug, fixing it in current, then MFCin= g that (with the current hash) is fine, even if the stable/13 changes are c= ompletely different. For stable/13 I guess it matters a bit less than stabl= e/14 since I'll be merging to it less, but if it's a commit from -c= urrent that doesn't need to be made to -stable because of the new commi= t on stable, I tend to include the MFC hash text.

= Warner

--000000000000a7bbda062a47a309--