Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)

From: Kyle Evans <kevans_at_freebsd.org>
Date: Thu, 27 Apr 2023 00:23:35 UTC
On Wed, Apr 26, 2023 at 4:37 PM Souradeep Chakrabarti
<schakrabarti@microsoft.com> wrote:
>
>
>
>
> >-----Original Message-----
> >From: Souradeep Chakrabarti
> >Sent: Thursday, April 27, 2023 2:01 AM
> >To: 'Kyle Evans' <kevans@freebsd.org>
> >Cc: 'Wei Hu' <whu@freebsd.org>; 'src-committers@freebsd.org' <src-
> >committers@freebsd.org>; 'dev-commits-src-all@freebsd.org' <dev-commits-src-
> >all@freebsd.org>; 'dev-commits-src-main@freebsd.org' <dev-commits-src-
> >main@freebsd.org>
> >Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
> >enablement for ARM64 in Hyper-V (Part 3, final)
> >
> >
> >
> >
> >>-----Original Message-----
> >>From: Souradeep Chakrabarti
> >>Sent: Wednesday, April 26, 2023 7:26 PM
> >>To: Kyle Evans <kevans@freebsd.org>
> >>Cc: Wei Hu <whu@freebsd.org>; src-committers@freebsd.org;
> >>dev-commits-src- all@freebsd.org; dev-commits-src-main@freebsd.org
> >>Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
> >>enablement for ARM64 in Hyper-V (Part 3, final)
> >>
> >>
> >>
> >>
> >>>-----Original Message-----
> >>>From: Kyle Evans <kevans@freebsd.org>
> >>>Sent: Wednesday, April 26, 2023 3:39 AM
> >>>To: Souradeep Chakrabarti <schakrabarti@microsoft.com>
> >>>Cc: Kyle Evans <kevans@freebsd.org>; Wei Hu <whu@freebsd.org>; src-
> >>>committers@freebsd.org; dev-commits-src-all@freebsd.org;
> >>>dev-commits-src- main@freebsd.org
> >>>Subject: Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
> >>>enablement for ARM64 in Hyper-V (Part 3, final)
> [... snip ...]
> >>>Hi,
> >>>
> >>>That seems odd. What happens if you bump the SYSINIT up to SI_SUB_SMP
> >>>+ 1, SI_ORDER_FIRST? We don't know for a fact that all APs are ready
> >>>for scheduling until after smp_after_idle_runnable(), which is also at
> >>>SI_ORDER_ANY
> >>>-- maybe there's just something going horribly wrong.
> >>>That would perhaps explain why it's fine on a single processor system,
> >>>which won't do anything useful (at least in later parts of SI_SUB_SMP).
> >>[Souradeep]
> >>On ARM64 SMP (a VM with two CPUs), storvsc attach happens twice for a
> >>single SCSI controller.
> >>On a similar Intel VM (two CPUs), it happens only once.
> >>For the dummy/fake storvsc on arm64, we get stuck in device_attach.
> >>
> >>Details:
> >>
> >>vmbus_scan_done() is not getting invoked because vmbus_add_child() has
> >>not completed for channel 15, which leaves one task pending on
> >>vmbus_devtq.
> >>
> >>After sending an NMI to the hung system and examining all threads:
> >>
> >>sched_switch() at sched_switch+0x4dc
> >>mi_switch() at mi_switch+0x194
> >>sleepq_switch() at sleepq_switch+0xfc
> >>_cv_wait() at _cv_wait+0x160
> >>_sema_wait() at _sema_wait+0x50
> >>storvsc_attach() at storvsc_attach+0x610
> >>device_attach() at device_attach+0x3f8
> >>device_probe_and_attach() at device_probe_and_attach+0x7c
> >>vmbus_add_child() at vmbus_add_child+0x64
> >>
> >>It is stuck in sema_wait() on request->synch_sema in
> >>hv_storvsc_channel_init(), because the sema_post() on
> >>request->synch_sema that would unblock it is never invoked.
> >>We are waiting on synch_sema in hv_storvsc_channel_init() for storvsc1,
> >>but there is no storvsc1 device, so no callback is ever called for
> >>storvsc1.
> >>
> >>From the ARM64 debug log:
> >>Note that at line 545 a SCSI device is detected again.
> >>
> >>      Line  370: storvsc0: Enlightened SCSI device detected
> >>      Line  371: storvsc0: <Hyper-V SCSI> on vmbus0
> >>      Line  406: (probe0:storvsc0:0:0:0): storvsc scsi_status = 2, srb_status = 6
> >>      Line  421: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
> >>      Line  436: da0: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
> >>      Line  443: pass1: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI device
> >>      Line  447: cd0: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI device
> >>      Line  545: storvsc1: Enlightened SCSI device detected
> >>      Line  547: storvsc1: Enlightened SCSI device detected
> >>      Line  549: storvsc1: <Hyper-V SCSI>hv_storvsc_on_channel_callback is called
> >>
> >>From Log:
> >>
> >>unknown: device_add_child for chan15
> >>storvsc1: Enlightened SCSI device detected
> >>storvsc1: Enlightened SCSI device detected
> >>storvsc1: <Hyper-V SCSI> on vmbus0
> >>storvsc ringbuffer size: 262144, max_io: 512
> >>storvsc1: chan15 assigned to cpu1 [vcpu1]
> >>hn0: link state changed to UP
> >>vmbus0: vmbus_chanmsg_handle type 0xa
> >>storvsc1: gpadl_conn(chan15) succeeded
> >>vmbus0: vmbus_chanmsg_handle type 0x6
> >>storvsc1: chan15 opened
> >>waiting on sema wait synch_sema hv_storvsc_channel_init
> >[Souradeep] The fix is working. The test bed had an issue; after fixing
> >that, the fix works.
> >I will share the fix this week.
> [Souradeep] Small update: the problem happens only if there is an extra
> SCSI controller on the system. In that case, storvsc fails to attach for
> that SCSI controller.
>

Excellent! Now that we know what configuration causes it, does it
reproduce on x86 as well with an extra SCSI controller? I'd expect so,
but maybe not.

Thanks,

Kyle Evans