RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)

From: Souradeep Chakrabarti <schakrabarti_at_microsoft.com>
Date: Wed, 26 Apr 2023 13:56:01 UTC


>-----Original Message-----
>From: Kyle Evans <kevans@freebsd.org>
>Sent: Wednesday, April 26, 2023 3:39 AM
>To: Souradeep Chakrabarti <schakrabarti@microsoft.com>
>Cc: Kyle Evans <kevans@freebsd.org>; Wei Hu <whu@freebsd.org>; src-
>committers@freebsd.org; dev-commits-src-all@freebsd.org; dev-commits-src-
>main@freebsd.org
>Subject: Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>enablement for ARM64 in Hyper-V (Part 3, final)
>
>On Tue, Apr 25, 2023 at 2:42 AM Souradeep Chakrabarti
><schakrabarti@microsoft.com> wrote:
>>
>>
>>
>>
>> >-----Original Message-----
>> >From: Souradeep Chakrabarti
>> >Sent: Monday, April 24, 2023 6:24 PM
>> >To: Kyle Evans <kevans@freebsd.org>; Wei Hu <whu@freebsd.org>
>> >Cc: src-committers@freebsd.org; dev-commits-src-all@freebsd.org;
>> >dev-commits- src-main@freebsd.org
>> >Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>> >enablement for ARM64 in Hyper-V (Part 3, final)
>> >
>> >
>> >
>> >
>> >>-----Original Message-----
>> >>From: Kyle Evans <kevans@freebsd.org>
>> >>Sent: Wednesday, April 19, 2023 11:00 AM
>> >>To: Wei Hu <whu@freebsd.org>; Souradeep Chakrabarti
>> >><schakrabarti@microsoft.com>
>> >>Cc: src-committers@freebsd.org; dev-commits-src-all@freebsd.org;
>> >>dev-commits- src-main@freebsd.org
>> >>Subject: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>> >>enablement for ARM64 in Hyper-V (Part 3, final)
>> >>
>> >>On Thu, Oct 27, 2022 at 8:54 AM Wei Hu <whu@freebsd.org> wrote:
>> >>>
>> >>> The branch main has been updated by whu:
>> >>>
>> >>> URL:
>> >>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2F
>> >>>
>cgi%2F&data=05%7C01%7Cschakrabarti%40microsoft.com%7Ce1cfc3aff7f94
>> >>>
>d5c2ca408db45d9c4c5%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7
>C6
>> >>>
>38180573867118793%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAi
>LCJQ
>> >>>
>IjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=k
>> >>>
>MntMYzegWgF%2B90vq6FERBLjwfM%2FVb1GUNwUP9pv90Q%3D&reserved=0
>> >>> t
>> >>>
>>
>>>.freebsd.org%2Fsrc%2Fcommit%2F%3Fid%3D9729f076e4d93c5a37e78d427bf
>e
>> >0
>> >>f1a
>> >>>
>>
>>>b99bbcc6&data=05%7C01%7Cschakrabarti%40microsoft.com%7C393e855f13c
>6
>> >>49a
>> >>>
>>
>>>8883708db4097211e%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%
>7
>> >C
>> >>6381747
>> >>>
>>
>>>90106786449%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQI
>j
>> >oi
>> >>V2luMz
>> >>>
>>
>>>IiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=huiaOBSB2
>Z
>> >>QDn5Q
>> >>> cw%2FLWnmI%2FqbbriQd7HkYtpxnGLGE%3D&reserved=0
>> >>>
>> >>> commit 9729f076e4d93c5a37e78d427bfe0f1ab99bbcc6
>> >>> Author:     Souradeep Chakrabarti <schakrabarti@microsoft.com>
>> >>> AuthorDate: 2022-10-27 13:46:08 +0000
>> >>> Commit:     Wei Hu <whu@FreeBSD.org>
>> >>> CommitDate: 2022-10-27 13:53:22 +0000
>> >>>
>> >>>     arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3,
>> >>> final)
>> >>>
>> >>>     This is the last part for ARM64 Hyper-V enablement. This includes
>> >>>     commone files and make file changes to enable the ARM64 FreeBSD
>> >>>     guest on Hyper-V. With this patch, it should be able to build
>> >>>     the ARM64 image and install it on Hyper-V.
>> >>>
>> >>
>> >>Hi,
>> >>
>> >>First off- thanks for doing this work! I can't seem to boot a
>> >>-CURRENT image under Hyper-V on a Volterra machine, seemingly due to
>> >>vmbus. It stalls right
>> >after "vmbus:
>> >>the irq 18" but before emitting a version number, I have some other
>> >>comments here...
>> >>
>> >>> [... snip ...]
>> >>> diff --git a/sys/dev/hyperv/vmbus/vmbus.c
>> >>> b/sys/dev/hyperv/vmbus/vmbus.c index b0cd750b26c8..f370f2a75b99
>> >>> 100644
>> >>> --- a/sys/dev/hyperv/vmbus/vmbus.c
>> >>> +++ b/sys/dev/hyperv/vmbus/vmbus.c
>> >>> [... snip ...]
>> >>> @@ -107,7 +113,7 @@ static uint32_t
>> >>vmbus_get_vcpu_id_method(device_t bus,
>> >>>                                     device_t dev, int cpu);
>> >>>  static struct taskqueue                *vmbus_get_eventtq_method(device_t,
>device_t,
>> >>>                                     int); -#ifdef EARLY_AP_STARTUP
>> >>> +#if defined(EARLY_AP_STARTUP) || defined(__aarch64__)
>> >>>  static void                    vmbus_intrhook(void *);
>> >>>  #endif
>> >>>
>> >>
>> >>My gut reaction to this is that this is a red flag. EARLY_AP_STARTUP
>> >>implies characteristics that aarch64 doesn't exhibit; it's hard to
>> >>see why this conditional is OK, or whether it's testing
>> >>EARLY_AP_STARTUP as a bad way to write __i386__ || __amd64__.
>> >[Souradeep] We had or'd the aarch64 here, as we wanted to defer the
>> >vmbus attachment after gic and arm generic timer is attached. As we
>> >need PAUSE and DELAY for vmbus and hyper-v drivers. For that we
>> >wanted to use vmbus_intrhook().
>> >>
>> >>> [... snip ...]
>> >>> @@ -760,7 +736,7 @@ vmbus_synic_setup(void *xsc)
>> >>>
>> >>>         if (hyperv_features & CPUID_HV_MSR_VP_INDEX) {
>> >>>                 /* Save virtual processor id. */
>> >>> -               VMBUS_PCPU_GET(sc, vcpuid, cpu) = rdmsr(MSR_HV_VP_INDEX);
>> >>> +               VMBUS_PCPU_GET(sc, vcpuid, cpu) =
>> >>> + RDMSR(MSR_HV_VP_INDEX);
>> >>>         } else {
>> >>>                 /* Set virtual processor id to 0 for compatibility. */
>> >>>                 VMBUS_PCPU_GET(sc, vcpuid, cpu) = 0;
>> >>
>> >>This one, vmbus_synic_setup(), is invoked via vmbus_intrhook ->
>> >>vmbus_doattach -
>> >>> smp_rendezvous. On !EARLY_AP_STARTUP (e.g., aarch64), SMP isn't
>> >>> functional in
>> >>intrhooks and smp_rendezvous() will just call vmbus_synic_setup() on
>> >>the boot
>> >AP.
>> >>There's nothing that will initialize the pcpu data on every other AP, AFAICT.
>> >>
>> >>That said, the !EARLY_AP_STARTUP path is also wrong. Quoting lines
>> >>that weren't in this e-mail from vmbus_attach():
>> >[Souradeep] Thanks! This is a really good catch, and yes, we are
>> >missing the smp initialization during vmbus synic setup.
>> >>
>> >>1527 #else   /* !EARLY_AP_STARTUP */
>> >>1528         /*
>> >>1529          * If the system has already booted and thread
>> >>1530          * scheduling is possible indicated by the global
>> >>1531          * cold set to zero, we just call the driver
>> >>1532          * initialization directly.
>> >>1533          */
>> >>1534         if (!cold)
>> >>1535                 vmbus_doattach(vmbus_sc);
>> >>1536 #endif /* EARLY_AP_STARTUP  and aarch64 */
>> >>
>> >>The two immediate issues I see is that in a device_attach, SMP won't
>> >>be started. It's also going to be before SI_SUB_CONFIGURE, so `cold`
>> >>will never be 0 here -- this is effectively dead code, and if it
>> >>weren't dead code it would suffer the same problem as above where
>> >>other APs aren't started yet so we won't set their pcpu data. This
>> >>is OK, though, because then we have this sysinit later on that does
>> >>the same thing for !EARLY_AP_STARTUP && !__aarch64__`, but that
>> >>should really just be `!EARLY_AP_STARTUP` as aarch64 also needs to
>> >>invoke
>> >>vmbus_doattach() at SI_SUB_SMP (much later than SI_SUB_DRIVERS).
>> >[Souradeep] I have tried to use the suggestion but in multi processor
>> >system the boot is getting stuck at the end and "vmbus0: device scan, probe and
>attach done"
>> >is not happening.
>> >I am trying to figure out what is going wrong here, as the same
>> >change is working with single cpu system. Any help or pointer will be really
>helpful.
>> [Souradeep]  I can see the problem the task vmbus_scandone_task is
>> getting enqueued but it is not getting executed.
>>
>> 525 static void
>>  526 vmbus_scan_done(struct vmbus_softc *sc,
>>  527     const struct vmbus_message *msg __unused)
>>  528 {
>>  529     device_printf(sc->vmbus_dev, " vmbus_scan_done is called\n");
>>  530
>>  531     taskqueue_enqueue(sc->vmbus_devtq, &sc->vmbus_scandone_task);
>>  532 }
>> vmbus scan done is getting called.
>>
>> But
>> 514 static void
>>  515 vmbus_scan_done_task(void *xsc, int pending __unused)
>>  516 {
>>  517     struct vmbus_softc *sc = xsc;
>>  518     device_printf(sc->vmbus_dev, " vmbus_scan_done_task is called\n");
>>  519     bus_topo_lock();
>>  520     sc->vmbus_scandone = true;
>>  521     bus_topo_unlock();
>>  522     wakeup(&sc->vmbus_scandone);
>>  523 }
>> vmbus_scan_done_task  is not getting called the task .
>>
>> But the same is happening when single cpu is running, for multi cpu it is getting
>stuck.
>
>Hi,
>
>That seems odd. What happens if you bump the SYSINIT up to SI_SUB_SMP
>+ 1, SI_ORDER_FIRST? We don't know for a fact that all APs are ready
>for scheduling until after smp_after_idle_runnable(), which is also at SI_ORDER_ANY
>-- maybe there's just something going horribly wrong.
>That would perhaps explain why it's fine on a single processor system, which won't
>do anything useful (at least in later parts of SI_SUB_SMP).
[Souradeep] 
In ARM64 SMP(VM with two cpu),  storvsc attach is happening two times for single scsi controller. 
But for intel similar VM (two cpu), it is happening once.
For the dummy/fake storvsc in arm64, we are getting stuck at device_attach.

Details:

vmbus_scan_done(), not getting invoked because vmbus_add_child() is not complete for a channel 15, 
because of which vmbus_devtq is having one task pending.

Now
By passing NMI in the hung system, after examining all threads:

sched_switch() at sched_switch+0x4dc
mi_switch() at mi_switch+0x194
sleepq_switch() at sleepq_switch+0xfc
_cv_wait() at _cv_wait+0x160
_sema_wait() at _sema_wait+0x50
storvsc_attach() at storvsc_attach+0x610
device_attach() at device_attach+0x3f8
device_probe_and_attach() at device_probe_and_attach+0x7c
vmbus_add_child() at vmbus_add_child+0x64

Now ,

It is stuck at waiting on sema_wait() on request->synch_sema in hv_storvsc_channel_init() because
sema_post() on request->synch_sema is not getting invoked. Which unlocks it.
This is because we are waiting on sema_wait on synch_sema hv_storvsc_channel_init(), for storvsc1 , 
but there is no storvsc1 device. So not getting a callback called for storvsc1.

From ARM64 debug log:
If you see at line 545 again SCI device got detected.

	Line  370: storvsc0: Enlightened SCSI device detected
	Line  371: storvsc0: <Hyper-V SCSI> on vmbus0
	Line  406: (probe0:storvsc0:0:0:0): storvsc scsi_status = 2, srb_status = 6
	Line  421: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
	Line  436: da0: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
	Line  443: pass1: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI device
	Line  447: cd0: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI device
	Line  545: storvsc1: Enlightened SCSI device detected
	Line  547: storvsc1: Enlightened SCSI device detected
	Line  549: storvsc1: <Hyper-V SCSI>hv_storvsc_on_channel_callback is called

From Log:

unknown: device_add_child for chan15
storvsc1: Enlightened SCSI device detected
storvsc1: Enlightened SCSI device detected
storvsc1: <Hyper-V SCSI> on vmbus0
storvsc ringbuffer size: 262144, max_io: 512
storvsc1: chan15 assigned to cpu1 [vcpu1]
hn0: link state changed to UP
vmbus0: vmbus_chanmsg_handle type 0xa
storvsc1: gpadl_conn(chan15) succeeded
vmbus0: vmbus_chanmsg_handle type 0x6
storvsc1: chan15 opened
waiting on sema wait synch_sema hv_storvsc_channel_init
>
>Thanks,
>
>Kyle Evans