RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)
- Reply: Kyle Evans : "Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)"
- In reply to: Souradeep Chakrabarti : "RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Wed, 26 Apr 2023 21:37:44 UTC
>-----Original Message----- >From: Souradeep Chakrabarti >Sent: Thursday, April 27, 2023 2:01 AM >To: 'Kyle Evans' <kevans@freebsd.org> >Cc: 'Wei Hu' <whu@freebsd.org>; 'src-committers@freebsd.org' <src- >committers@freebsd.org>; 'dev-commits-src-all@freebsd.org' <dev-commits-src- >all@freebsd.org>; 'dev-commits-src-main@freebsd.org' <dev-commits-src- >main@freebsd.org> >Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: >enablement for ARM64 in Hyper-V (Part 3, final) > > > > >>-----Original Message----- >>From: Souradeep Chakrabarti >>Sent: Wednesday, April 26, 2023 7:26 PM >>To: Kyle Evans <kevans@freebsd.org> >>Cc: Wei Hu <whu@freebsd.org>; src-committers@freebsd.org; >>dev-commits-src- all@freebsd.org; dev-commits-src-main@freebsd.org >>Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: >>enablement for ARM64 in Hyper-V (Part 3, final) >> >> >> >> >>>-----Original Message----- >>>From: Kyle Evans <kevans@freebsd.org> >>>Sent: Wednesday, April 26, 2023 3:39 AM >>>To: Souradeep Chakrabarti <schakrabarti@microsoft.com> >>>Cc: Kyle Evans <kevans@freebsd.org>; Wei Hu <whu@freebsd.org>; src- >>>committers@freebsd.org; dev-commits-src-all@freebsd.org; >>>dev-commits-src- main@freebsd.org >>>Subject: Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: >>>enablement for ARM64 in Hyper-V (Part 3, final) >>> >>>On Tue, Apr 25, 2023 at 2:42 AM Souradeep Chakrabarti >>><schakrabarti@microsoft.com> wrote: >>>> >>>> >>>> >>>> >>>> >-----Original Message----- >>>> >From: Souradeep Chakrabarti >>>> >Sent: Monday, April 24, 2023 6:24 PM >>>> >To: Kyle Evans <kevans@freebsd.org>; Wei Hu <whu@freebsd.org> >>>> >Cc: src-committers@freebsd.org; dev-commits-src-all@freebsd.org; >>>> >dev-commits- src-main@freebsd.org >>>> >Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: >>>> >enablement for ARM64 in Hyper-V (Part 3, final) >>>> > >>>> > >>>> > >>>> > >>>> >>-----Original Message----- >>>> >>From: Kyle Evans <kevans@freebsd.org> >>>> >>Sent: Wednesday, April 19, 2023 11:00 AM >>>> >>To: Wei Hu <whu@freebsd.org>; Souradeep Chakrabarti >>>> >><schakrabarti@microsoft.com> >>>> >>Cc: src-committers@freebsd.org; dev-commits-src-all@freebsd.org; >>>> >>dev-commits- src-main@freebsd.org >>>> >>Subject: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: >>>> >>enablement for ARM64 in Hyper-V (Part 3, final) >>>> >> >>>> >>On Thu, Oct 27, 2022 at 8:54 AM Wei Hu <whu@freebsd.org> wrote: >>>> >>> >>>> >>> The branch main has been updated by whu: >>>> >>> >>>> >>> URL: >>>> >>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F% >>>> >>> 2 >>>> >>> F >>>> >>> >>>cgi%2F&data=05%7C01%7Cschakrabarti%40microsoft.com%7Ce1cfc3aff7f94 >>>> >>> >>>d5c2ca408db45d9c4c5%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0 >% >>7 >>>C6 >>>> >>> >>>38180573867118793%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwM >D >>Ai >>>LCJQ >>>> >>> >>>IjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata= >k >>>> >>> >>>MntMYzegWgF%2B90vq6FERBLjwfM%2FVb1GUNwUP9pv90Q%3D&reserved= >0 >>>> >>> t >>>> >>> >>>> >>>>>.freebsd.org%2Fsrc%2Fcommit%2F%3Fid%3D9729f076e4d93c5a37e78d427 >b >>f >>>e >>>> >0 >>>> >>f1a >>>> >>> >>>> >>>>>b99bbcc6&data=05%7C01%7Cschakrabarti%40microsoft.com%7C393e855f1 >3 >>c >>>6 >>>> >>49a >>>> >>> >>>> >>>>>8883708db4097211e%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0 >>% >>>7 >>>> >C >>>> >>6381747 >>>> >>> >>>> >>>>>90106786449%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJ >>QI >>>j >>>> >oi >>>> >>V2luMz >>>> >>> >>>> >>>>>IiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=huiaOBS >B >>2 >>>Z >>>> >>QDn5Q >>>> >>> cw%2FLWnmI%2FqbbriQd7HkYtpxnGLGE%3D&reserved=0 >>>> >>> >>>> >>> commit 9729f076e4d93c5a37e78d427bfe0f1ab99bbcc6 >>>> >>> Author: Souradeep Chakrabarti <schakrabarti@microsoft.com> >>>> >>> AuthorDate: 2022-10-27 13:46:08 +0000 >>>> >>> Commit: Wei Hu <whu@FreeBSD.org> >>>> >>> CommitDate: 2022-10-27 13:53:22 +0000 >>>> >>> >>>> >>> arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, >>>> >>> final) >>>> >>> >>>> >>> This is the last part for ARM64 Hyper-V enablement. This includes >>>> >>> commone files and make file changes to enable the ARM64 FreeBSD >>>> >>> guest on Hyper-V. With this patch, it should be able to build >>>> >>> the ARM64 image and install it on Hyper-V. >>>> >>> >>>> >> >>>> >>Hi, >>>> >> >>>> >>First off- thanks for doing this work! I can't seem to boot a >>>> >>-CURRENT image under Hyper-V on a Volterra machine, seemingly due >>>> >>to vmbus. It stalls right >>>> >after "vmbus: >>>> >>the irq 18" but before emitting a version number, I have some >>>> >>other comments here... >>>> >> >>>> >>> [... snip ...] >>>> >>> diff --git a/sys/dev/hyperv/vmbus/vmbus.c >>>> >>> b/sys/dev/hyperv/vmbus/vmbus.c index b0cd750b26c8..f370f2a75b99 >>>> >>> 100644 >>>> >>> --- a/sys/dev/hyperv/vmbus/vmbus.c >>>> >>> +++ b/sys/dev/hyperv/vmbus/vmbus.c >>>> >>> [... snip ...] >>>> >>> @@ -107,7 +113,7 @@ static uint32_t >>>> >>vmbus_get_vcpu_id_method(device_t bus, >>>> >>> device_t dev, int cpu); >>>> >>> static struct taskqueue *vmbus_get_eventtq_method(device_t, >>>device_t, >>>> >>> int); -#ifdef >>>> >>> EARLY_AP_STARTUP >>>> >>> +#if defined(EARLY_AP_STARTUP) || defined(__aarch64__) >>>> >>> static void vmbus_intrhook(void *); >>>> >>> #endif >>>> >>> >>>> >> >>>> >>My gut reaction to this is that this is a red flag. >>>> >>EARLY_AP_STARTUP implies characteristics that aarch64 doesn't >>>> >>exhibit; it's hard to see why this conditional is OK, or whether >>>> >>it's testing EARLY_AP_STARTUP as a bad way to write __i386__ || >__amd64__. >>>> >[Souradeep] We had or'd the aarch64 here, as we wanted to defer the >>>> >vmbus attachment after gic and arm generic timer is attached. As we >>>> >need PAUSE and DELAY for vmbus and hyper-v drivers. For that we >>>> >wanted to use vmbus_intrhook(). >>>> >> >>>> >>> [... snip ...] >>>> >>> @@ -760,7 +736,7 @@ vmbus_synic_setup(void *xsc) >>>> >>> >>>> >>> if (hyperv_features & CPUID_HV_MSR_VP_INDEX) { >>>> >>> /* Save virtual processor id. */ >>>> >>> - VMBUS_PCPU_GET(sc, vcpuid, cpu) = rdmsr(MSR_HV_VP_INDEX); >>>> >>> + VMBUS_PCPU_GET(sc, vcpuid, cpu) = >>>> >>> + RDMSR(MSR_HV_VP_INDEX); >>>> >>> } else { >>>> >>> /* Set virtual processor id to 0 for compatibility. */ >>>> >>> VMBUS_PCPU_GET(sc, vcpuid, cpu) = 0; >>>> >> >>>> >>This one, vmbus_synic_setup(), is invoked via vmbus_intrhook -> >>>> >>vmbus_doattach - >>>> >>> smp_rendezvous. On !EARLY_AP_STARTUP (e.g., aarch64), SMP isn't >>>> >>> functional in >>>> >>intrhooks and smp_rendezvous() will just call vmbus_synic_setup() >>>> >>on the boot >>>> >AP. >>>> >>There's nothing that will initialize the pcpu data on every other AP, AFAICT. >>>> >> >>>> >>That said, the !EARLY_AP_STARTUP path is also wrong. Quoting lines >>>> >>that weren't in this e-mail from vmbus_attach(): >>>> >[Souradeep] Thanks! This is a really good catch, and yes, we are >>>> >missing the smp initialization during vmbus synic setup. >>>> >> >>>> >>1527 #else /* !EARLY_AP_STARTUP */ >>>> >>1528 /* >>>> >>1529 * If the system has already booted and thread >>>> >>1530 * scheduling is possible indicated by the global >>>> >>1531 * cold set to zero, we just call the driver >>>> >>1532 * initialization directly. >>>> >>1533 */ >>>> >>1534 if (!cold) >>>> >>1535 vmbus_doattach(vmbus_sc); >>>> >>1536 #endif /* EARLY_AP_STARTUP and aarch64 */ >>>> >> >>>> >>The two immediate issues I see is that in a device_attach, SMP >>>> >>won't be started. It's also going to be before SI_SUB_CONFIGURE, >>>> >>so `cold` will never be 0 here -- this is effectively dead code, >>>> >>and if it weren't dead code it would suffer the same problem as >>>> >>above where other APs aren't started yet so we won't set their pcpu data. >>>> >>This is OK, though, because then we have this sysinit later on >>>> >>that does the same thing for !EARLY_AP_STARTUP && !__aarch64__`, >>>> >>but that should really just be `!EARLY_AP_STARTUP` as aarch64 also >>>> >>needs to invoke >>>> >>vmbus_doattach() at SI_SUB_SMP (much later than SI_SUB_DRIVERS). >>>> >[Souradeep] I have tried to use the suggestion but in multi >>>> >processor system the boot is getting stuck at the end and "vmbus0: >>>> >device scan, probe and >>>attach done" >>>> >is not happening. >>>> >I am trying to figure out what is going wrong here, as the same >>>> >change is working with single cpu system. Any help or pointer will >>>> >be really >>>helpful. >>>> [Souradeep] I can see the problem the task vmbus_scandone_task is >>>> getting enqueued but it is not getting executed. >>>> >>>> 525 static void >>>> 526 vmbus_scan_done(struct vmbus_softc *sc, >>>> 527 const struct vmbus_message *msg __unused) >>>> 528 { >>>> 529 device_printf(sc->vmbus_dev, " vmbus_scan_done is called\n"); >>>> 530 >>>> 531 taskqueue_enqueue(sc->vmbus_devtq, &sc->vmbus_scandone_task); >>>> 532 } >>>> vmbus scan done is getting called. >>>> >>>> But >>>> 514 static void >>>> 515 vmbus_scan_done_task(void *xsc, int pending __unused) >>>> 516 { >>>> 517 struct vmbus_softc *sc = xsc; >>>> 518 device_printf(sc->vmbus_dev, " vmbus_scan_done_task is called\n"); >>>> 519 bus_topo_lock(); >>>> 520 sc->vmbus_scandone = true; >>>> 521 bus_topo_unlock(); >>>> 522 wakeup(&sc->vmbus_scandone); >>>> 523 } >>>> vmbus_scan_done_task is not getting called the task . >>>> >>>> But the same is happening when single cpu is running, for multi cpu >>>> it is getting >>>stuck. >>> >>>Hi, >>> >>>That seems odd. What happens if you bump the SYSINIT up to SI_SUB_SMP >>>+ 1, SI_ORDER_FIRST? We don't know for a fact that all APs are ready >>>for scheduling until after smp_after_idle_runnable(), which is also at >>>SI_ORDER_ANY >>>-- maybe there's just something going horribly wrong. >>>That would perhaps explain why it's fine on a single processor system, >>>which won't do anything useful (at least in later parts of SI_SUB_SMP). >>[Souradeep] >>In ARM64 SMP(VM with two cpu), storvsc attach is happening two times >>for single scsi controller. >>But for intel similar VM (two cpu), it is happening once. >>For the dummy/fake storvsc in arm64, we are getting stuck at device_attach. >> >>Details: >> >>vmbus_scan_done(), not getting invoked because vmbus_add_child() is not >>complete for a channel 15, because of which vmbus_devtq is having one >>task pending. >> >>Now >>By passing NMI in the hung system, after examining all threads: >> >>sched_switch() at sched_switch+0x4dc >>mi_switch() at mi_switch+0x194 >>sleepq_switch() at sleepq_switch+0xfc >>_cv_wait() at _cv_wait+0x160 >>_sema_wait() at _sema_wait+0x50 >>storvsc_attach() at storvsc_attach+0x610 >>device_attach() at device_attach+0x3f8 >>device_probe_and_attach() at device_probe_and_attach+0x7c >>vmbus_add_child() at vmbus_add_child+0x64 >> >>Now , >> >>It is stuck at waiting on sema_wait() on request->synch_sema in >>hv_storvsc_channel_init() because >>sema_post() on request->synch_sema is not getting invoked. Which unlocks it. >>This is because we are waiting on sema_wait on synch_sema >>hv_storvsc_channel_init(), for storvsc1 , but there is no storvsc1 >>device. So not getting a callback called for storvsc1. >> >>From ARM64 debug log: >>If you see at line 545 again SCI device got detected. >> >> Line 370: storvsc0: Enlightened SCSI device detected >> Line 371: storvsc0: <Hyper-V SCSI> on vmbus0 >> Line 406: (probe0:storvsc0:0:0:0): storvsc scsi_status = 2, srb_status = 6 >> Line 421: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device >> Line 436: da0: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device >> Line 443: pass1: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 >>SCSI device >> Line 447: cd0: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 >SCSI >>device >> Line 545: storvsc1: Enlightened SCSI device detected >> Line 547: storvsc1: Enlightened SCSI device detected >> Line 549: storvsc1: <Hyper-V SCSI>hv_storvsc_on_channel_callback is >>called >> >>From Log: >> >>unknown: device_add_child for chan15 >>storvsc1: Enlightened SCSI device detected >>storvsc1: Enlightened SCSI device detected >>storvsc1: <Hyper-V SCSI> on vmbus0 >>storvsc ringbuffer size: 262144, max_io: 512 >>storvsc1: chan15 assigned to cpu1 [vcpu1] >>hn0: link state changed to UP >>vmbus0: vmbus_chanmsg_handle type 0xa >>storvsc1: gpadl_conn(chan15) succeeded >>vmbus0: vmbus_chanmsg_handle type 0x6 >>storvsc1: chan15 opened >>waiting on sema wait synch_sema hv_storvsc_channel_init >[Souradeep] The fix is working, the test bed had an issue, after fixing that the fix is >working. >I will share the fix by this week. [Souradeep] Small update, the problem happens only if there is an extra SCSI controller on the system. Then it fails to attach storvsc for that SCSI. >>> >>>Thanks, >>> >>>Kyle Evans