Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)

From: Kyle Evans <kevans_at_freebsd.org>
Date: Tue, 25 Apr 2023 22:09:28 UTC
On Tue, Apr 25, 2023 at 2:42 AM Souradeep Chakrabarti
<schakrabarti@microsoft.com> wrote:
>
>
>
>
> >-----Original Message-----
> >From: Souradeep Chakrabarti
> >Sent: Monday, April 24, 2023 6:24 PM
> >To: Kyle Evans <kevans@freebsd.org>; Wei Hu <whu@freebsd.org>
> >Cc: src-committers@freebsd.org; dev-commits-src-all@freebsd.org; dev-commits-
> >src-main@freebsd.org
> >Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
> >enablement for ARM64 in Hyper-V (Part 3, final)
> >
> >
> >
> >
> >>-----Original Message-----
> >>From: Kyle Evans <kevans@freebsd.org>
> >>Sent: Wednesday, April 19, 2023 11:00 AM
> >>To: Wei Hu <whu@freebsd.org>; Souradeep Chakrabarti
> >><schakrabarti@microsoft.com>
> >>Cc: src-committers@freebsd.org; dev-commits-src-all@freebsd.org;
> >>dev-commits- src-main@freebsd.org
> >>Subject: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
> >>enablement for ARM64 in Hyper-V (Part 3, final)
> >>
> >>On Thu, Oct 27, 2022 at 8:54 AM Wei Hu <whu@freebsd.org> wrote:
> >>>
> >>> The branch main has been updated by whu:
> >>>
> >>> URL:
> >>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgi
> >>> t
> >>>
> >>.freebsd.org%2Fsrc%2Fcommit%2F%3Fid%3D9729f076e4d93c5a37e78d427bfe
> >0
> >>f1a
> >>>
> >>b99bbcc6&data=05%7C01%7Cschakrabarti%40microsoft.com%7C393e855f13c6
> >>49a
> >>>
> >>8883708db4097211e%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7
> >C
> >>6381747
> >>>
> >>90106786449%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIj
> >oi
> >>V2luMz
> >>>
> >>IiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=huiaOBSB2Z
> >>QDn5Q
> >>> cw%2FLWnmI%2FqbbriQd7HkYtpxnGLGE%3D&reserved=0
> >>>
> >>> commit 9729f076e4d93c5a37e78d427bfe0f1ab99bbcc6
> >>> Author:     Souradeep Chakrabarti <schakrabarti@microsoft.com>
> >>> AuthorDate: 2022-10-27 13:46:08 +0000
> >>> Commit:     Wei Hu <whu@FreeBSD.org>
> >>> CommitDate: 2022-10-27 13:53:22 +0000
> >>>
> >>>     arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)
> >>>
> >>>     This is the last part for ARM64 Hyper-V enablement. This includes
> >>>     commone files and make file changes to enable the ARM64 FreeBSD
> >>>     guest on Hyper-V. With this patch, it should be able to build
> >>>     the ARM64 image and install it on Hyper-V.
> >>>
> >>
> >>Hi,
> >>
> >>First off- thanks for doing this work! I can't seem to boot a -CURRENT
> >>image under Hyper-V on a Volterra machine, seemingly due to vmbus. It stalls right
> >after "vmbus:
> >>the irq 18" but before emitting a version number, I have some other
> >>comments here...
> >>
> >>> [... snip ...]
> >>> diff --git a/sys/dev/hyperv/vmbus/vmbus.c
> >>> b/sys/dev/hyperv/vmbus/vmbus.c index b0cd750b26c8..f370f2a75b99
> >>> 100644
> >>> --- a/sys/dev/hyperv/vmbus/vmbus.c
> >>> +++ b/sys/dev/hyperv/vmbus/vmbus.c
> >>> [... snip ...]
> >>> @@ -107,7 +113,7 @@ static uint32_t
> >>vmbus_get_vcpu_id_method(device_t bus,
> >>>                                     device_t dev, int cpu);
> >>>  static struct taskqueue                *vmbus_get_eventtq_method(device_t, device_t,
> >>>                                     int); -#ifdef EARLY_AP_STARTUP
> >>> +#if defined(EARLY_AP_STARTUP) || defined(__aarch64__)
> >>>  static void                    vmbus_intrhook(void *);
> >>>  #endif
> >>>
> >>
> >>My gut reaction to this is that this is a red flag. EARLY_AP_STARTUP
> >>implies characteristics that aarch64 doesn't exhibit; it's hard to see
> >>why this conditional is OK, or whether it's testing EARLY_AP_STARTUP as
> >>a bad way to write __i386__ || __amd64__.
> >[Souradeep] We had or'd the aarch64 here, as we wanted to defer the vmbus
> >attachment after gic and arm generic timer is attached. As we need PAUSE and
> >DELAY for vmbus and hyper-v drivers. For that we wanted to use
> >vmbus_intrhook().
> >>
> >>> [... snip ...]
> >>> @@ -760,7 +736,7 @@ vmbus_synic_setup(void *xsc)
> >>>
> >>>         if (hyperv_features & CPUID_HV_MSR_VP_INDEX) {
> >>>                 /* Save virtual processor id. */
> >>> -               VMBUS_PCPU_GET(sc, vcpuid, cpu) = rdmsr(MSR_HV_VP_INDEX);
> >>> +               VMBUS_PCPU_GET(sc, vcpuid, cpu) =
> >>> + RDMSR(MSR_HV_VP_INDEX);
> >>>         } else {
> >>>                 /* Set virtual processor id to 0 for compatibility. */
> >>>                 VMBUS_PCPU_GET(sc, vcpuid, cpu) = 0;
> >>
> >>This one, vmbus_synic_setup(), is invoked via vmbus_intrhook ->
> >>vmbus_doattach -
> >>> smp_rendezvous. On !EARLY_AP_STARTUP (e.g., aarch64), SMP isn't
> >>> functional in
> >>intrhooks and smp_rendezvous() will just call vmbus_synic_setup() on the boot
> >AP.
> >>There's nothing that will initialize the pcpu data on every other AP, AFAICT.
> >>
> >>That said, the !EARLY_AP_STARTUP path is also wrong. Quoting lines that
> >>weren't in this e-mail from vmbus_attach():
> >[Souradeep] Thanks! This is a really good catch, and yes, we are missing the smp
> >initialization during vmbus synic setup.
> >>
> >>1527 #else   /* !EARLY_AP_STARTUP */
> >>1528         /*
> >>1529          * If the system has already booted and thread
> >>1530          * scheduling is possible indicated by the global
> >>1531          * cold set to zero, we just call the driver
> >>1532          * initialization directly.
> >>1533          */
> >>1534         if (!cold)
> >>1535                 vmbus_doattach(vmbus_sc);
> >>1536 #endif /* EARLY_AP_STARTUP  and aarch64 */
> >>
> >>The two immediate issues I see is that in a device_attach, SMP won't be
> >>started. It's also going to be before SI_SUB_CONFIGURE, so `cold` will
> >>never be 0 here -- this is effectively dead code, and if it weren't
> >>dead code it would suffer the same problem as above where other APs
> >>aren't started yet so we won't set their pcpu data. This is OK, though,
> >>because then we have this sysinit later on that does the same thing for
> >>!EARLY_AP_STARTUP && !__aarch64__`, but that should really just be
> >>`!EARLY_AP_STARTUP` as aarch64 also needs to invoke
> >>vmbus_doattach() at SI_SUB_SMP (much later than SI_SUB_DRIVERS).
> >[Souradeep] I have tried to use the suggestion but in multi processor system the
> >boot is getting stuck at the end and "vmbus0: device scan, probe and attach done"
> >is not happening.
> >I am trying to figure out what is going wrong here, as the same change is working
> >with single cpu system. Any help or pointer will be really helpful.
> [Souradeep]  I can see the problem the task vmbus_scandone_task is getting enqueued but
> it is not getting executed.
>
> 525 static void
>  526 vmbus_scan_done(struct vmbus_softc *sc,
>  527     const struct vmbus_message *msg __unused)
>  528 {
>  529     device_printf(sc->vmbus_dev, " vmbus_scan_done is called\n");
>  530
>  531     taskqueue_enqueue(sc->vmbus_devtq, &sc->vmbus_scandone_task);
>  532 }
> vmbus scan done is getting called.
>
> But
> 514 static void
>  515 vmbus_scan_done_task(void *xsc, int pending __unused)
>  516 {
>  517     struct vmbus_softc *sc = xsc;
>  518     device_printf(sc->vmbus_dev, " vmbus_scan_done_task is called\n");
>  519     bus_topo_lock();
>  520     sc->vmbus_scandone = true;
>  521     bus_topo_unlock();
>  522     wakeup(&sc->vmbus_scandone);
>  523 }
> vmbus_scan_done_task  is not getting called the task .
>
> But the same is happening when single cpu is running, for multi cpu it is getting stuck.

Hi,

That seems odd. What happens if you bump the SYSINIT up to SI_SUB_SMP
+ 1, SI_ORDER_FIRST? We don't know for a fact that all APs are ready
for scheduling until after smp_after_idle_runnable(), which is also at
SI_ORDER_ANY -- maybe there's just something going horribly wrong.
That would perhaps explain why it's fine on a single processor system,
which won't do anything useful (at least in later parts of
SI_SUB_SMP).

Thanks,

Kyle Evans