RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V: enablement for ARM64 in Hyper-V (Part 3, final)
Date: Mon, 08 May 2023 12:08:19 UTC
>-----Original Message-----
>From: Kyle Evans <kevans@freebsd.org>
>Sent: Thursday, April 27, 2023 5:54 AM
>To: Souradeep Chakrabarti <schakrabarti@microsoft.com>
>Cc: Wei Hu <whu@freebsd.org>; src-committers@freebsd.org;
>dev-commits-src-all@freebsd.org; dev-commits-src-main@freebsd.org
>Subject: Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>enablement for ARM64 in Hyper-V (Part 3, final)
>
>On Wed, Apr 26, 2023 at 4:37 PM Souradeep Chakrabarti
><schakrabarti@microsoft.com> wrote:
>>
>> >-----Original Message-----
>> >From: Souradeep Chakrabarti
>> >Sent: Thursday, April 27, 2023 2:01 AM
>> >To: 'Kyle Evans' <kevans@freebsd.org>
>> >Cc: 'Wei Hu' <whu@freebsd.org>; 'src-committers@freebsd.org'
>> ><src-committers@freebsd.org>; 'dev-commits-src-all@freebsd.org'
>> ><dev-commits-src-all@freebsd.org>;
>> >'dev-commits-src-main@freebsd.org' <dev-commits-src-main@freebsd.org>
>> >Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>> >enablement for ARM64 in Hyper-V (Part 3, final)
>> >
>> >>-----Original Message-----
>> >>From: Souradeep Chakrabarti
>> >>Sent: Wednesday, April 26, 2023 7:26 PM
>> >>To: Kyle Evans <kevans@freebsd.org>
>> >>Cc: Wei Hu <whu@freebsd.org>; src-committers@freebsd.org;
>> >>dev-commits-src-all@freebsd.org; dev-commits-src-main@freebsd.org
>> >>Subject: RE: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>> >>enablement for ARM64 in Hyper-V (Part 3, final)
>> >>
>> >>>-----Original Message-----
>> >>>From: Kyle Evans <kevans@freebsd.org>
>> >>>Sent: Wednesday, April 26, 2023 3:39 AM
>> >>>To: Souradeep Chakrabarti <schakrabarti@microsoft.com>
>> >>>Cc: Kyle Evans <kevans@freebsd.org>; Wei Hu <whu@freebsd.org>;
>> >>>src-committers@freebsd.org; dev-commits-src-all@freebsd.org;
>> >>>dev-commits-src-main@freebsd.org
>> >>>Subject: Re: [EXTERNAL] Re: git: 9729f076e4d9 - main - arm64: Hyper-V:
>> >>>enablement for ARM64 in Hyper-V (Part 3, final)
>>
[... snip ...]
>> >>>Hi,
>> >>>
>> >>>That seems odd. What happens if you bump the SYSINIT up to
>> >>>SI_SUB_SMP + 1, SI_ORDER_FIRST? We don't know for a fact that all
>> >>>APs are ready for scheduling until after smp_after_idle_runnable(),
>> >>>which is also at SI_ORDER_ANY -- maybe there's just something going
>> >>>horribly wrong.
>> >>>That would perhaps explain why it's fine on a single processor
>> >>>system, which won't do anything useful (at least in later parts of
>> >>>SI_SUB_SMP).
>> >>[Souradeep]
>> >>On ARM64 SMP (a VM with two CPUs), storvsc attach happens twice for
>> >>a single SCSI controller.
>> >>On a similar Intel VM (two CPUs), it happens once.
>> >>For the dummy/fake storvsc on arm64, we get stuck in device_attach.
>> >>
>> >>Details:
>> >>
>> >>vmbus_scan_done() is not invoked because vmbus_add_child() has not
>> >>completed for channel 15, so vmbus_devtq still has one task pending.
>> >>
>> >>After sending an NMI to the hung system and examining all threads:
>> >>
>> >>sched_switch() at sched_switch+0x4dc
>> >>mi_switch() at mi_switch+0x194
>> >>sleepq_switch() at sleepq_switch+0xfc
>> >>_cv_wait() at _cv_wait+0x160
>> >>_sema_wait() at _sema_wait+0x50
>> >>storvsc_attach() at storvsc_attach+0x610
>> >>device_attach() at device_attach+0x3f8
>> >>device_probe_and_attach() at device_probe_and_attach+0x7c
>> >>vmbus_add_child() at vmbus_add_child+0x64
>> >>
>> >>It is stuck in sema_wait() on request->synch_sema in
>> >>hv_storvsc_channel_init(), because the sema_post() on
>> >>request->synch_sema that would unblock it is never invoked.
>> >>We are waiting on synch_sema in hv_storvsc_channel_init() for
>> >>storvsc1, but there is no storvsc1 device, so no callback is ever
>> >>called for storvsc1.
>> >>
>> >>From the ARM64 debug log:
>> >>Note that at line 545 a SCSI device is detected again.
>> >>
>> >> Line 370: storvsc0: Enlightened SCSI device detected
>> >> Line 371: storvsc0: <Hyper-V SCSI> on vmbus0
>> >> Line 406: (probe0:storvsc0:0:0:0): storvsc scsi_status = 2, srb_status = 6
>> >> Line 421: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
>> >> Line 436: da0: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
>> >> Line 443: pass1: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI device
>> >> Line 447: cd0: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI device
>> >> Line 545: storvsc1: Enlightened SCSI device detected
>> >> Line 547: storvsc1: Enlightened SCSI device detected
>> >> Line 549: storvsc1: <Hyper-V SCSI>hv_storvsc_on_channel_callback is called
>> >>
>> >>From Log:
>> >>
>> >>unknown: device_add_child for chan15
>> >>storvsc1: Enlightened SCSI device detected
>> >>storvsc1: Enlightened SCSI device detected
>> >>storvsc1: <Hyper-V SCSI> on vmbus0
>> >>storvsc ringbuffer size: 262144, max_io: 512
>> >>storvsc1: chan15 assigned to cpu1 [vcpu1]
>> >>hn0: link state changed to UP
>> >>vmbus0: vmbus_chanmsg_handle type 0xa
>> >>storvsc1: gpadl_conn(chan15) succeeded
>> >>vmbus0: vmbus_chanmsg_handle type 0x6
>> >>storvsc1: chan15 opened
>> >>waiting on sema wait synch_sema hv_storvsc_channel_init
>> >[Souradeep] The fix is working, the test bed had an issue, after
>> >fixing that the fix is working.
>> >I will share the fix by this week.
>> [Souradeep] Small update, the problem happens only if there is an
>> extra SCSI controller on the system. Then it fails to attach storvsc
>> for that SCSI.
>>
>
>Excellent! Knowing now what configuration causes it; does it reproduce
>on x86 as well with an extra SCSI controller? I'd expect so, but maybe
[Souradeep] In x86 this problem is not seen.
After more detailed debugging, it looks like interrupts destined for
CPU1 are either not being handled, or CPU1 is not receiving the
interrupt on IRQ 18, which Hyper-V uses to notify the guest of an
incoming message on any channel.
I checked vmstat -i, and vmbus appears to use gic0,p2 on arm64.
It looks to me like the vmbus interrupt handler is not called on CPU1
when IRQ 18 is raised on CPU1.
Do we have anything in FreeBSD arm64 to enable an IRQ on every CPU?

# vmstat -i
interrupt                          total       rate
gic0,p2: vmbus0                     3820         49
gic0,p4:-ric_timer0                 2643         34
gic0,s1: uart0                      2990         38
cpu0:ast                               1          0
cpu1:ast                               4          0
cpu0:preempt                        3913         50
cpu1:preempt                        4890         62
cpu0:rendezvous                        2          0
cpu1:rendezvous                        4          0
cpu0:hardclock                         1          0
Total                              18268        232

>not-
>
>Thanks,
>
>Kyle Evans