Re: Running Mezzano in bhyve

From: Vasily Postnicov <shamaz.mazum_at_gmail.com>
Date: Tue, 15 Oct 2024 05:15:32 UTC
Regarding items 3) and 4):

3) Indeed, bhyve does not explicitly forbid writing to 0x3c. What I meant
is the following. The interrupt line is set in pci_emul.c in bhyve:
 pci_set_cfgdata8(pi, PCIR_INTLINE, pirq_irq(ii->ii_pirq_pin));
Bhyve asserts interrupts with pci_irq_assert in amd64/pci_irq.c. We need
this line: vm_isa_assert_irq(pi->pi_vmctx, pirq->reg & PIRQ_IRQ,
pi->pi_lintr.ioapic_irq);
pirq->reg & PIRQ_IRQ is literally the same as pirq_irq(ii->ii_pirq_pin).
Now, if something (e.g. UEFI firmware or a bootloader) writes to
PCIR_INTLINE, bhyve will still send interrupts with the number that was
there before the write, while the OS will expect an interrupt with the new
number. I treat this as a bug in bhyve (but it affects nobody, because
newer OSes do not use the 8259 interrupt controller).
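
To make the mismatch concrete, here is a minimal toy model in C. It is not
bhyve code: struct toy_dev and all toy_* functions are hypothetical names
made up for illustration, and the only real constant is PCIR_INTLINE (0x3c).
It assumes, as described above, that the IRQ used for delivery is taken from
the routing state chosen at init time and is never re-read from config space.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PCIR_INTLINE 0x3c	/* PCI config offset of the Interrupt Line register */

/* Hypothetical, simplified stand-in for a PCI device model. */
struct toy_dev {
	uint8_t cfg[256];	/* config space as the guest sees it */
	uint8_t routed_irq;	/* IRQ chosen by the hypervisor at init time */
};

/* Init: the hypervisor picks an IRQ and also publishes it in INTLINE. */
void toy_init(struct toy_dev *d, uint8_t irq)
{
	memset(d->cfg, 0, sizeof(d->cfg));
	d->routed_irq = irq;
	d->cfg[PCIR_INTLINE] = irq;
}

/* Guest (or firmware) config write: only the config byte changes. */
void toy_cfgwrite8(struct toy_dev *d, unsigned off, uint8_t val)
{
	assert(off < sizeof(d->cfg));
	d->cfg[off] = val;
}

/* The IRQ the guest OS will install its handler on. */
uint8_t toy_guest_expected_irq(const struct toy_dev *d)
{
	return d->cfg[PCIR_INTLINE];
}

/* The IRQ actually asserted: taken from the routing state, not INTLINE. */
uint8_t toy_asserted_irq(const struct toy_dev *d)
{
	return d->routed_irq;
}
```

After toy_init(&d, 5) both functions return 5. After
toy_cfgwrite8(&d, PCIR_INTLINE, 11) the guest expects IRQ 11 while the
model still asserts IRQ 5, which is the situation I am describing.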

4) It is commenting out the lock that makes the difference. I commented out
pci_generate_msi just in case: it is not needed for Mezzano, and it would
otherwise run without the mutex that used to protect it.
Here are a backtrace and a thread list from a bhyve hang when the mutex is
not commented out:

(lldb) bt
* thread #1, name = 'mevent', stop reason = signal SIGSTOP
  * frame #0: 0x000011adeaa37e2a libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38
    frame #1: 0x000011adeaa479c0
libthr.so.3`__thr_umutex_lock(mtx=0x0000378ecca00888, id=101223) at
thr_umtx.c:79:3
    frame #2: 0x000011adeaa40eea
libthr.so.3`mutex_lock_sleep(curthread=0x0000378ecc412000,
m=0x0000378ecca00888, abstime=0x0000000000000000) at thr_mutex.c:699:9
    frame #3: 0x000011adeaa3ed8f libthr.so.3`__Tthr_mutex_lock [inlined]
mutex_lock_common(m=0x0000378ecca00888, abstime=0x0000000000000000,
cvattach=false, rb_onlist=false) at thr_mutex.c:733:9
    frame #4: 0x000011adeaa3ed4d
libthr.so.3`__Tthr_mutex_lock(mutex=<unavailable>) at thr_mutex.c:752:9
    frame #5: 0x000011a5c43e7b06 bhyve`vi_interrupt(vs=0x0000378ecc4b8000,
isr='\x01', msix_idx=65535) at virtio.h:358:3
    frame #6: 0x000011a5c43e6c86 bhyve`vq_interrupt(vs=0x0000378ecc4b8000,
vq=0x0000378ecc4b8038) at virtio.h:376:2
    frame #7: 0x000011a5c43e6c44 bhyve`vq_endchains(vq=0x0000378ecc4b8038,
used_all_avail=0) at virtio.c:512:3
    frame #8: 0x000011a5c43db348 bhyve`pci_vtnet_rx(sc=0x0000378ecc4b8000)
at pci_virtio_net.c:271:4
    frame #9: 0x000011a5c43dab53 bhyve`pci_vtnet_rx_callback(fd=6,
type=EVF_READ, param=0x0000378ecc4b8000) at pci_virtio_net.c:403:2
    frame #10: 0x000011a5c43bb9f8
bhyve`mevent_handle(kev=0x000011ade4451200, numev=1) at mevent.c:273:3
    frame #11: 0x000011a5c43bb5d7 bhyve`mevent_dispatch at mevent.c:549:3
    frame #12: 0x000011a5c43aed4b bhyve`main(argc=1,
argv=0x000011ade4453418) at bhyverun.c:1052:2
    frame #13: 0x000011adec6c1a6a libc.so.7`__libc_start1(argc=24,
argv=0x000011ade4453360, env=0x000011ade4453428, cleanup=<unavailable>,
mainX=(bhyve`main at bhyverun.c:694)) at libc_start1.c:157:7
    frame #14: 0x000011a5c43a80cd bhyve`_start at crt1_s.S:83

(lldb) frame select 5
frame #5: 0x000011a5c43e7b06 bhyve`vi_interrupt(vs=0x0000378ecc4b8000,
isr='\x01', msix_idx=65535) at virtio.h:358:3
   355 if (pci_msix_enabled(vs->vs_pi))
   356 pci_generate_msix(vs->vs_pi, msix_idx);
   357 else {
-> 358 VS_LOCK(vs);
   359 vs->vs_isr |= isr;
   360 pci_generate_msi(vs->vs_pi, 0);
   361 #ifdef __amd64__

(lldb) thread list
Process 3185 stopped
* thread #1: tid = 101223, 0x000011adeaa37e2a libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'mevent', stop reason = signal SIGSTOP
  thread #2: tid = 101868, 0x000011adeaa37e2c libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'blk-3:0-0', stop reason = signal SIGSTOP
  thread #3: tid = 101869, 0x000011adeaa37e2c libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'blk-3:0-1', stop reason = signal SIGSTOP
  thread #4: tid = 101870, 0x000011adeaa37e2c libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'blk-3:0-2', stop reason = signal SIGSTOP
  thread #5: tid = 101871, 0x000011adeaa37e2c libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'blk-3:0-3', stop reason = signal SIGSTOP
  thread #6: tid = 101872, 0x000011adeaa37e2c libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'blk-3:0-4', stop reason = signal SIGSTOP
  thread #7: tid = 101873, 0x000011adeaa37e2c libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'blk-3:0-5', stop reason = signal SIGSTOP
  thread #8: tid = 101874, 0x000011adeaa37e2c libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'blk-3:0-6', stop reason = signal SIGSTOP
  thread #9: tid = 101875, 0x000011adeaa37e2c libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'blk-3:0-7', stop reason = signal SIGSTOP
  thread #10: tid = 101876, 0x000011adeaa37e2c libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'vtnet-5:0 tx', stop reason = signal SIGSTOP
  thread #11: tid = 101877, 0x000011adeaa37e2c libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'hda-audio-output', stop reason = signal SIGSTOP
  thread #12: tid = 101878, 0x000011adec7752ea libc.so.7`__sys_accept at
_accept.S:4, name = 'rfb', stop reason = signal SIGSTOP
  thread #13: tid = 101879, 0x000011adec7726aa libc.so.7`__sys_ioctl at
ioctl.S:4, name = 'vcpu 0', stop reason = signal SIGSTOP
  thread #14: tid = 101880, 0x000011adeaa37e2c libthr.so.3`_umtx_op_err at
_umtx_op_err.S:38, name = 'vcpu 1', stop reason = signal SIGSTOP

I think implementing IOAPIC support in Mezzano is indeed the best option,
but I have little experience with it. I'll see what I can do.

On Mon, 14 Oct 2024 at 22:52, Peter Grehan <grehan@freebsd.org> wrote:

> > 1) The problem with PIT. Can be solved as you proposed or by
> > patching Mezzano.
>
>   The bhyve patch would be the best option for that: it's useful for
> other older o/s's (DOS).
>
> > 2) Mezzano assumes that Intel AHCI controllers report no more than 6
> > ports. Can be solved by patching Mezzano or defining MAX_PORTS to be
> > 6 in usr.sbin/bhyve/pci_ahci.c
>
>   A Mezzano patch would be best for that. The bhyve man page has an
> example with 8 disks attached so reducing the limit to 6 could hit
> existing users.
>
> > 3) According to
> > https://wiki.osdev.org/PCI#Message_Signaled_Interrupts, interrupt
> > line config register must be RW. Bhyve does not support writing to
> > it. I do not know a correct fix, this [1] workaround helps, however.
>
>   Bhyve does support writing to that - your patch disables that, and my
> guess is that when Mezzano sees this as zero (ie invalid) it then looks
> for the irq line via the ACPI MADT (or other means).
>
>   A quick look at Mezzano shows that it is still using the 8259 PIC for
> interrupts. At the minimum it should be using the IOAPIC, or excessive
> interrupt sharing will result, and possibly incorrect behaviour when
> this happens. I think IOAPIC support could be added without a large
> amount of effort, compared to e.g. MSI/MSI-x.
>
> > 4) Finally, I had a random deadlock in interrupt handling for the
> > virtio-net device. Likewise, I do not know how to fix it correctly,
> > but this [2] patch helped.
>
>   Hmmm that seems strange: MSI interrupts aren't generated if they
> haven't been setup/enabled by a guest. Commenting out the lock/unlock
> code would seem to indicate a larger bug in play. Would it be possible to
> get some tracing on that segment of code e.g. a dtrace log ?
>
> > Do you have any ideas how to make proper patches for bhyve from
> > these workarounds?
>
>   The first one can be put in a phab diff, which I'll do. I think there's
> still some more work involved for the others.
>
> later,
>
> Peter.