RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)
- Reply: Souradeep Chakrabarti : "RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)"
- In reply to: Kyle Evans : "Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Wed, 26 Apr 2023 13:56:01 UTC
>-----Original Message----- >From: Kyle Evans <kevans@freebsd.org> >Sent: Wednesday, April 26, 2023 3:39 AM >To: Souradeep Chakrabarti <schakrabarti@microsoft.com> >Cc: Kyle Evans <kevans@freebsd.org>; Wei Hu <whu@freebsd.org>; src- >committers@freebsd.org; dev-commits-src-all@freebsd.org; dev-commits-src- >main@freebsd.org >Subject: Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: >enablement for ARM64 in Hyper-V (Part 3, final) > >On Tue, Apr 25, 2023 at 2:42 AM Souradeep Chakrabarti ><schakrabarti@microsoft.com> wrote: >> >> >> >> >> >-----Original Message----- >> >From: Souradeep Chakrabarti >> >Sent: Monday, April 24, 2023 6:24 PM >> >To: Kyle Evans <kevans@freebsd.org>; Wei Hu <whu@freebsd.org> >> >Cc: src-committers@freebsd.org; dev-commits-src-all@freebsd.org; >> >dev-commits- src-main@freebsd.org >> >Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: >> >enablement for ARM64 in Hyper-V (Part 3, final) >> > >> > >> > >> > >> >>-----Original Message----- >> >>From: Kyle Evans <kevans@freebsd.org> >> >>Sent: Wednesday, April 19, 2023 11:00 AM >> >>To: Wei Hu <whu@freebsd.org>; Souradeep Chakrabarti >> >><schakrabarti@microsoft.com> >> >>Cc: src-committers@freebsd.org; dev-commits-src-all@freebsd.org; >> >>dev-commits- src-main@freebsd.org >> >>Subject: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: >> >>enablement for ARM64 in Hyper-V (Part 3, final) >> >> >> >>On Thu, Oct 27, 2022 at 8:54 AM Wei Hu <whu@freebsd.org> wrote: >> >>> >> >>> The branch main has been updated by whu: >> >>> >> >>> URL: >> >>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2F >> >>> >cgi%2F&data=05%7C01%7Cschakrabarti%40microsoft.com%7Ce1cfc3aff7f94 >> >>> >d5c2ca408db45d9c4c5%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7 >C6 >> >>> >38180573867118793%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAi >LCJQ >> >>> >IjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=k >> >>> >MntMYzegWgF%2B90vq6FERBLjwfM%2FVb1GUNwUP9pv90Q%3D&reserved=0 >> >>> t >> >>> >> >>>.freebsd.org%2Fsrc%2Fcommit%2F%3Fid%3D9729f076e4d93c5a37e78d427bf >e >> >0 >> >>f1a >> >>> >> >>>b99bbcc6&data=05%7C01%7Cschakrabarti%40microsoft.com%7C393e855f13c >6 >> >>49a >> >>> >> >>>8883708db4097211e%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0% >7 >> >C >> >>6381747 >> >>> >> >>>90106786449%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQI >j >> >oi >> >>V2luMz >> >>> >> >>>IiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=huiaOBSB2 >Z >> >>QDn5Q >> >>> cw%2FLWnmI%2FqbbriQd7HkYtpxnGLGE%3D&reserved=0 >> >>> >> >>> commit 9729f076e4d93c5a37e78d427bfe0f1ab99bbcc6 >> >>> Author: Souradeep Chakrabarti <schakrabarti@microsoft.com> >> >>> AuthorDate: 2022-10-27 13:46:08 +0000 >> >>> Commit: Wei Hu <whu@FreeBSD.org> >> >>> CommitDate: 2022-10-27 13:53:22 +0000 >> >>> >> >>> arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, >> >>> final) >> >>> >> >>> This is the last part for ARM64 Hyper-V enablement. This includes >> >>> commone files and make file changes to enable the ARM64 FreeBSD >> >>> guest on Hyper-V. With this patch, it should be able to build >> >>> the ARM64 image and install it on Hyper-V. >> >>> >> >> >> >>Hi, >> >> >> >>First off- thanks for doing this work! I can't seem to boot a >> >>-CURRENT image under Hyper-V on a Volterra machine, seemingly due to >> >>vmbus. It stalls right >> >after "vmbus: >> >>the irq 18" but before emitting a version number, I have some other >> >>comments here... >> >> >> >>> [... snip ...] >> >>> diff --git a/sys/dev/hyperv/vmbus/vmbus.c >> >>> b/sys/dev/hyperv/vmbus/vmbus.c index b0cd750b26c8..f370f2a75b99 >> >>> 100644 >> >>> --- a/sys/dev/hyperv/vmbus/vmbus.c >> >>> +++ b/sys/dev/hyperv/vmbus/vmbus.c >> >>> [... snip ...] >> >>> @@ -107,7 +113,7 @@ static uint32_t >> >>vmbus_get_vcpu_id_method(device_t bus, >> >>> device_t dev, int cpu); >> >>> static struct taskqueue *vmbus_get_eventtq_method(device_t, >device_t, >> >>> int); -#ifdef EARLY_AP_STARTUP >> >>> +#if defined(EARLY_AP_STARTUP) || defined(__aarch64__) >> >>> static void vmbus_intrhook(void *); >> >>> #endif >> >>> >> >> >> >>My gut reaction to this is that this is a red flag. EARLY_AP_STARTUP >> >>implies characteristics that aarch64 doesn't exhibit; it's hard to >> >>see why this conditional is OK, or whether it's testing >> >>EARLY_AP_STARTUP as a bad way to write __i386__ || __amd64__. >> >[Souradeep] We had or'd the aarch64 here, as we wanted to defer the >> >vmbus attachment after gic and arm generic timer is attached. As we >> >need PAUSE and DELAY for vmbus and hyper-v drivers. For that we >> >wanted to use vmbus_intrhook(). >> >> >> >>> [... snip ...] >> >>> @@ -760,7 +736,7 @@ vmbus_synic_setup(void *xsc) >> >>> >> >>> if (hyperv_features & CPUID_HV_MSR_VP_INDEX) { >> >>> /* Save virtual processor id. */ >> >>> - VMBUS_PCPU_GET(sc, vcpuid, cpu) = rdmsr(MSR_HV_VP_INDEX); >> >>> + VMBUS_PCPU_GET(sc, vcpuid, cpu) = >> >>> + RDMSR(MSR_HV_VP_INDEX); >> >>> } else { >> >>> /* Set virtual processor id to 0 for compatibility. */ >> >>> VMBUS_PCPU_GET(sc, vcpuid, cpu) = 0; >> >> >> >>This one, vmbus_synic_setup(), is invoked via vmbus_intrhook -> >> >>vmbus_doattach - >> >>> smp_rendezvous. On !EARLY_AP_STARTUP (e.g., aarch64), SMP isn't >> >>> functional in >> >>intrhooks and smp_rendezvous() will just call vmbus_synic_setup() on >> >>the boot >> >AP. >> >>There's nothing that will initialize the pcpu data on every other AP, AFAICT. >> >> >> >>That said, the !EARLY_AP_STARTUP path is also wrong. Quoting lines >> >>that weren't in this e-mail from vmbus_attach(): >> >[Souradeep] Thanks! This is a really good catch, and yes, we are >> >missing the smp initialization during vmbus synic setup. >> >> >> >>1527 #else /* !EARLY_AP_STARTUP */ >> >>1528 /* >> >>1529 * If the system has already booted and thread >> >>1530 * scheduling is possible indicated by the global >> >>1531 * cold set to zero, we just call the driver >> >>1532 * initialization directly. >> >>1533 */ >> >>1534 if (!cold) >> >>1535 vmbus_doattach(vmbus_sc); >> >>1536 #endif /* EARLY_AP_STARTUP and aarch64 */ >> >> >> >>The two immediate issues I see is that in a device_attach, SMP won't >> >>be started. It's also going to be before SI_SUB_CONFIGURE, so `cold` >> >>will never be 0 here -- this is effectively dead code, and if it >> >>weren't dead code it would suffer the same problem as above where >> >>other APs aren't started yet so we won't set their pcpu data. This >> >>is OK, though, because then we have this sysinit later on that does >> >>the same thing for !EARLY_AP_STARTUP && !__aarch64__`, but that >> >>should really just be `!EARLY_AP_STARTUP` as aarch64 also needs to >> >>invoke >> >>vmbus_doattach() at SI_SUB_SMP (much later than SI_SUB_DRIVERS). >> >[Souradeep] I have tried to use the suggestion but in multi processor >> >system the boot is getting stuck at the end and "vmbus0: device scan, probe and >attach done" >> >is not happening. >> >I am trying to figure out what is going wrong here, as the same >> >change is working with single cpu system. Any help or pointer will be really >helpful. >> [Souradeep] I can see the problem the task vmbus_scandone_task is >> getting enqueued but it is not getting executed. >> >> 525 static void >> 526 vmbus_scan_done(struct vmbus_softc *sc, >> 527 const struct vmbus_message *msg __unused) >> 528 { >> 529 device_printf(sc->vmbus_dev, " vmbus_scan_done is called\n"); >> 530 >> 531 taskqueue_enqueue(sc->vmbus_devtq, &sc->vmbus_scandone_task); >> 532 } >> vmbus scan done is getting called. >> >> But >> 514 static void >> 515 vmbus_scan_done_task(void *xsc, int pending __unused) >> 516 { >> 517 struct vmbus_softc *sc = xsc; >> 518 device_printf(sc->vmbus_dev, " vmbus_scan_done_task is called\n"); >> 519 bus_topo_lock(); >> 520 sc->vmbus_scandone = true; >> 521 bus_topo_unlock(); >> 522 wakeup(&sc->vmbus_scandone); >> 523 } >> vmbus_scan_done_task is not getting called the task . >> >> But the same is happening when single cpu is running, for multi cpu it is getting >stuck. > >Hi, > >That seems odd. What happens if you bump the SYSINIT up to SI_SUB_SMP >+ 1, SI_ORDER_FIRST? We don't know for a fact that all APs are ready >for scheduling until after smp_after_idle_runnable(), which is also at SI_ORDER_ANY >-- maybe there's just something going horribly wrong. >That would perhaps explain why it's fine on a single processor system, which won't >do anything useful (at least in later parts of SI_SUB_SMP). [Souradeep] In ARM64 SMP(VM with two cpu), storvsc attach is happening two times for single scsi controller. But for intel similar VM (two cpu), it is happening once. For the dummy/fake storvsc in arm64, we are getting stuck at device_attach. Details: vmbus_scan_done(), not getting invoked because vmbus_add_child() is not complete for a channel 15, because of which vmbus_devtq is having one task pending. Now By passing NMI in the hung system, after examining all threads: sched_switch() at sched_switch+0x4dc mi_switch() at mi_switch+0x194 sleepq_switch() at sleepq_switch+0xfc _cv_wait() at _cv_wait+0x160 _sema_wait() at _sema_wait+0x50 storvsc_attach() at storvsc_attach+0x610 device_attach() at device_attach+0x3f8 device_probe_and_attach() at device_probe_and_attach+0x7c vmbus_add_child() at vmbus_add_child+0x64 Now , It is stuck at waiting on sema_wait() on request->synch_sema in hv_storvsc_channel_init() because sema_post() on request->synch_sema is not getting invoked. Which unlocks it. This is because we are waiting on sema_wait on synch_sema hv_storvsc_channel_init(), for storvsc1 , but there is no storvsc1 device. So not getting a callback called for storvsc1. From ARM64 debug log: If you see at line 545 again SCI device got detected. Line 370: storvsc0: Enlightened SCSI device detected Line 371: storvsc0: <Hyper-V SCSI> on vmbus0 Line 406: (probe0:storvsc0:0:0:0): storvsc scsi_status = 2, srb_status = 6 Line 421: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device Line 436: da0: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device Line 443: pass1: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI device Line 447: cd0: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI device Line 545: storvsc1: Enlightened SCSI device detected Line 547: storvsc1: Enlightened SCSI device detected Line 549: storvsc1: <Hyper-V SCSI>hv_storvsc_on_channel_callback is called From Log: unknown: device_add_child for chan15 storvsc1: Enlightened SCSI device detected storvsc1: Enlightened SCSI device detected storvsc1: <Hyper-V SCSI> on vmbus0 storvsc ringbuffer size: 262144, max_io: 512 storvsc1: chan15 assigned to cpu1 [vcpu1] hn0: link state changed to UP vmbus0: vmbus_chanmsg_handle type 0xa storvsc1: gpadl_conn(chan15) succeeded vmbus0: vmbus_chanmsg_handle type 0x6 storvsc1: chan15 opened waiting on sema wait synch_sema hv_storvsc_channel_init > >Thanks, > >Kyle Evans