Re: Running Mezzano in bhyve
- Reply: Peter Grehan : "Re: Running Mezzano in bhyve"
- In reply to: Vasily Postnicov : "Re: Running Mezzano in bhyve"
Date: Thu, 10 Oct 2024 19:43:28 UTC
I suspect PCI interrupts are not functioning correctly. Look at this code:

;; Attach interrupt handler.
(sup:debug-print-line "Handler: " (ahci-irq-handler ahci))
(sup:irq-attach (sup:platform-irq (pci:pci-intr-line location))
                (ahci-irq-handler-function ahci)
                ahci)

and this:

(defun pci-intr-line (device)
  (pci-config/8 device +pci-config-intr-line+))
;; comment by me: the constant is #x3c

I found that PCI config offset 0x3c is the Interrupt Line register, which is
used for legacy pin-based (INTx) interrupts. AFAIK, those are not supported by
bhyve, is that correct? If so, I need either to teach bhyve to deal with
legacy interrupts or to teach Mezzano to understand MSI. Which would be
easier, in your opinion?

On Thu, 10 Oct 2024 at 17:12, Vasily Postnicov <shamaz.mazum@gmail.com> wrote:

> I was able to fix panics in both virtio and AHCI. This is what I found:
>
> 1) Virtio had a stupid bug: Mezzano looks up an accessor for some I/O port
> at runtime, doing something like (funcall (intern (format nil "~a-~a"
> bus-name slot-name)) ...). The author evidently misspelled the name of one
> of the accessors, so FUNCALL tried to call a symbol with no function
> definition, hence the page fault.
> 2) AHCI had the following code:
>
> ;; Magic hacks for Intel devices?
> ;; Set port enable bits in Port Control and Status on Intel controllers.
> (when (eql (pci:pci-config/16 location pci:+pci-config-vendorid+) #x8086)
>   (let* ((n-ports (1+ (ldb (byte +ahci-CAP-NP-size+ +ahci-CAP-NP-position+)
>                            (ahci-global-register ahci +ahci-register-CAP+))))
>          (pcs (pci:pci-config/16 location #x92)))
>     (setf (pci:pci-config/16 location #x92)
>           (logior pcs (ash #xFF (- (- 8 n-ports)))))))
>
> I checked the value of N-PORTS: it's 20, so (ash #xff (- (- 8 n-ports)))
> is 1044480, which is bigger than 2^16-1. I recompiled bhyve with
> MAX_PORTS = 6 in bhyve/pci_ahci.c and the panic disappeared.
> Now I have this output:
>
> Detected AHCI ABAR at C1002000
> AHCI IRQ is B
> Host Capabilities FF30FF25
> Global Host Control 80000000
> Interrupt Status 0
> Ports Implemented 1
> Version 10300
> Command Completion Coalescing Control 0
> Command Completion Coalescing Ports 0
> Enclosure Management Location 0
> Enclosure Management Control 0
> Host Capabilities Extended 4
> BIOS/OS Handoff Control and Status 0
> AHCI HBA version 1.300
> Handler: 0
> Config register: 17
> Port 0
> Waiting for CR/FR to stop.
> Allocated port data at 105C33000
> Command List at 105C33000
> Received FIS at 105C33400
> Command Tabl at 105C33500
> Initializing device on port 0
> Command List Base Address 5C33000
> Command List Base Address Upper 32-bits 1
> FIS Base Address 5C33400
> FIS Base Address Upper 32-bits 1
> Interrupt Status 0
> Interrupt Enable 7D80003F
> Command and Status 1C017
> Task File Data 50
> Signature 101
> SATA Status (SCR0: SStatus) 133
> SATA Control (SCR2: SControl) 300
> SATA Error (SCR1: SError) 0
> SATA Active (SCR3: SActive) 0
> Command Issue 0
> SATA Notification (SCR4: SNotification) 0
> FIS-based Switching Control 0
> *** AHCI-RUN-COMMAND TIMEOUT EXPIRED! ***
> Command completed.
> 105C33600: 28A20040 100000 0 3F
> 105C33610: 0 59564248 4644452D 2D413239
> 105C33620: 382D4136 39433646 0 30300000
> 105C33630: 20203120 42482020 45205956 54415341
> 105C33640: 49532044 20204B20 20202020 20202020
> 105C33650: 20202020 20202020 20202020 80802020
> 105C33660: B000000 4000 60000 0
> 105C33670: 0 0 A00000 70000
> 105C33680: 780003 780078 40200078 0
> 105C33690: 0 1F0000 40010E 0
> 105C336A0: 2803F0 74004068 40684000 4000B400
> 105C336B0: 7F 0 0 0
> 105C336C0: 0 0 A00000 0
> 105C336D0: 10000 6008 0 0
> 105C336E0: 0 0 0 40080000
> 105C336F0: 4008 0 0 0
> 105C33700: 0 0 0 0
> 105C33710: 0 0 0 0
> 105C33720: 0 0 0 0
> 105C33730: 0 0 0 0
> 105C33740: 0 0 0 0
> 105C33750: 10000 0 0 0
> 105C33760: 0 0 0 0
> 105C33770: 0 0 0 0
> 105C33780: 0 0 0 0
> 105C33790: 0 0 0 0
> 105C337A0: 40000000 0 0 0
> 105C337B0: 0 0 0 1020
> 105C337C0: 0 0 0 0
> 105C337D0: 0 0 0 0
> 105C337E0: 0 0 0 0
> 105C337F0: 0 0 0 78A50000
> Features (83): 7400
> Sector size: 200
> Sector count: A00000
> Serial: BHYVE-FD29-AA68-6F9C
> Model: BHYVE SATA DISK
> Registered new R/W disk #<149CAC9> sectors:A00000
> Host Capabilities FF30FF25
> Global Host Control 80000002
> Interrupt Status 1
> Ports Implemented 1
> Version 10300
> Command Completion Coalescing Control 0
> Command Completion Coalescing Ports 0
> Enclosure Management Location 0
> Enclosure Management Control 0
> Host Capabilities Extended 4
> BIOS/OS Handoff Control and Status 0
> PCI:0:0:0 1022:7432 NIL - NIL 6:0:0 rid: 0 hdr: 0 intr: FF
> 40: Unknown capability 10
> *** AHCI-RUN-COMMAND TIMEOUT EXPIRED! ***
> *** AHCI-RUN-COMMAND TIMEOUT EXPIRED! ***
> Detected MBR style parition table on disk #<149CAC9>
> Detected partition 0 on disk #<149CAC9>. Start: 800 size: 800
> Registered new R/W disk #<149CCD9> sectors:800
> Detected partition 1 on disk #<149CAC9>. Start: 1000 size: 800
> Registered new R/W disk #<149CD89> sectors:800
> Detected partition 2 on disk #<149CAC9>. Start: 2000 size: 9FE000
> Registered new R/W disk #<149CE39> sectors:9FE000
> Looking for paging disk with UUID 5C:F6:EE:79:2C:DF:45:E1:BA:2B:63:25:C4:1A:5F:10
> *** AHCI-RUN-COMMAND TIMEOUT EXPIRED! ***
> Found image with UUID 5C:F6:EE:79:2C:DF:45:E1:BA:2B:63:25:C4:1A:5F:10 on disk #<149CE39>
> Found boot image on disk #<149CE39>!
> BML4 at -7FFFFFEFD000
> Store freelist block is 2
>
> It seems to be booting, but very slowly, with those "TIMEOUT EXPIRED"
> messages. For virtio-blk it's almost the same, except that it hangs
> completely. I'll try to investigate further. Meanwhile, do you have any
> idea why those magic Intel AHCI controller hacks are required, and why
> sc->ports can get bigger than DEF_PORTS in pci_ahci_init in bhyve?
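
For reference, the arithmetic behind the quoted Intel PCS hack can be sketched in Python. This is only an illustration, not Mezzano or bhyve code. The function names below are made up for the example, and the NP field layout (5 bits at bit 0 of CAP, zero-based) is taken from the AHCI specification, since the values of +ahci-CAP-NP-size+ and +ahci-CAP-NP-position+ are not shown in this thread.

```python
def np_from_cap(cap: int) -> int:
    """Number of ports from an AHCI CAP value.

    Assumption: NP occupies bits 4:0 and is zero-based (per the AHCI spec).
    """
    return (cap & 0x1F) + 1


def ash(value: int, count: int) -> int:
    """Mimic Common Lisp ASH: left shift for count >= 0, right shift otherwise."""
    return value << count if count >= 0 else value >> -count


def pcs_enable_mask(n_ports: int) -> int:
    """Same expression as the quoted Lisp: (ash #xFF (- (- 8 n-ports)))."""
    return ash(0xFF, -(8 - n_ports))


def fits_in_16_bits(value: int) -> bool:
    """True if the value can be written with a 16-bit config-space access."""
    return 0 <= value <= 0xFFFF


if __name__ == "__main__":
    for n in (6, 20):
        mask = pcs_enable_mask(n)
        print(f"n-ports={n:2d} mask={mask:#x} ({mask}) "
              f"fits in 16 bits: {fits_in_16_bits(mask)}")
    print("CAP FF30FF25 ->", np_from_cap(0xFF30FF25), "ports")
```

For up to 8 ports the shift count is zero or negative and the mask has one bit per port (0x3F for 6 ports). Above 8 ports the shift becomes a left shift, which is how n-ports = 20 yields 1044480 (0xFF000) and overflows the 16-bit write. The CAP value FF30FF25 in the log decodes to 6 ports, consistent with the rebuild using MAX_PORTS = 6.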