Re: bhyve/passthru for Intel dGPU (ARC A380)?

From: Corvin Köhne <corvink_at_FreeBSD.org>
Date: Mon, 06 Jan 2025 14:40:59 UTC

On Mon, 2025-01-06 at 13:56 +0000, Peter Wood wrote:
> Thanks for the feedback Corvin, and thank you for the hard work you've been
> putting into GPU passthru.
> 
> I'm reaching the end of my limited knowledge here, and I have no expectation
> of any further assistance - but as a status update, BIOS in CSM (with legacy
> video op rom, so I see the console).
> 
> > If you want to pass the option rom to the guest, you can use the rom option
> > of passthru devices:
> > 
> > -s 1/2/3,passthru,1/2/3,rom=/path/to/rom
> > 
> 
> 
> I extracted the option ROM using linux, I was able to use the
> /sys/devices/pci*/rom route to extract it, it seems valid at a glance (768k
> dump) - but no idea how to really tell.
> 
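
On telling whether the dump is valid: a PCI expansion ROM has a fixed header
layout that can be checked offline. Each image starts with the bytes 0x55 0xAA,
holds a 16-bit pointer at offset 0x18 to a "PCIR" data structure, and that
structure carries the vendor/device IDs, the image length in 512-byte units and
a "last image" flag. The following is a minimal sketch under those assumptions;
the function name and the returned dict keys are made up for illustration, and
it only checks structure, not whether the firmware content is intact.

```python
import struct
import sys


def parse_option_rom(data: bytes) -> list:
    """Walk the image chain of a PCI expansion ROM dump.

    Returns one dict per image; raises ValueError on a malformed header.
    """
    images = []
    offset = 0
    while offset + 0x1A <= len(data):
        # Every image must start with the 0x55 0xAA signature.
        if data[offset:offset + 2] != b"\x55\xaa":
            raise ValueError(f"no 0x55AA signature at offset {offset:#x}")
        # Little-endian pointer (relative to the image) to the PCIR structure.
        (pcir_ptr,) = struct.unpack_from("<H", data, offset + 0x18)
        pcir = offset + pcir_ptr
        if pcir + 0x16 > len(data) or data[pcir:pcir + 4] != b"PCIR":
            raise ValueError(f"no PCIR structure at offset {pcir:#x}")
        vendor_id, device_id = struct.unpack_from("<HH", data, pcir + 0x04)
        (length_units,) = struct.unpack_from("<H", data, pcir + 0x10)
        code_type = data[pcir + 0x14]   # 0x00 = x86 legacy, 0x03 = EFI
        indicator = data[pcir + 0x15]   # bit 7 set = last image in the ROM
        images.append({
            "offset": offset,
            "vendor_id": vendor_id,
            "device_id": device_id,
            "length": length_units * 512,
            "code_type": code_type,
            "last": bool(indicator & 0x80),
        })
        if indicator & 0x80 or length_units == 0:
            break
        offset += length_units * 512
    return images


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        for image in parse_option_rom(fh.read()):
            print(image)
```

Run against the dump (e.g. /mnt/vm/intel-arc-a380.bin), the vendor ID of each
image should be 0x8086 and the device ID should match what pciconf or lspci
reports for the card. A dump of all 0x00 or 0xFF bytes fails on the first
signature check.
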
> Using the patched bhyve executable to bypass gvt-d: -s
> 4/0/0,passthru,4/0/0,rom=/mnt/vm/intel-arc-a380.bin -s 5/0/0,passthru,5/0/0
> (5/0/0 is a separate audio device exposing the audio channels of the HDMI
> ports).
> 
> Sadly initialization of the GPU in the linux (Ubuntu 24.04 / linux 6.8.0) VM
> still fails:
> [    2.508656] i915 0000:00:04.0: enabling device (0000 -> 0002)
> [    2.520226] i915 0000:00:04.0: [drm] Local memory IO size:
> 0x000000017c800000
> [    2.520232] i915 0000:00:04.0: [drm] Local memory available:
> 0x000000017c800000
> [    2.540148] i915 0000:00:04.0: vgaarb: VGA decodes changed:
> olddecodes=io+mem,decodes=none:owns=none
> [    2.550829] i915 0000:00:04.0: [drm] Finished loading DMC firmware
> i915/dg2_dmc_ver2_08.bin (v2.8)
> [    2.564885] i915 0000:00:04.0: [drm] GT0: GUC: ADS capture alloc size
> changed from 32768 to 36864
> [    2.565855] i915 0000:00:04.0: [drm] GT0: GuC firmware i915/dg2_guc_70.bin
> version 70.20.0
> [    2.565859] i915 0000:00:04.0: [drm] GT0: HuC firmware i915/dg2_huc_gsc.bin
> version 7.10.3
> [    2.565979] i915 0000:00:04.0: [drm] GT0: GUC: ADS capture alloc size
> changed from 32768 to 36864
> [    2.567001] i915 0000:00:04.0: [drm] GT0: GUC: load failed: status =
> 0x40000056, time = 0ms, freq = 2300MHz, ret = 0
> [    2.567006] i915 0000:00:04.0: [drm] GT0: GUC: load failed: status: Reset =
> 0, BootROM = 0x2B, UKernel = 0x00, MIA = 0x00, Auth = 0x01
> [    2.567009] i915 0000:00:04.0: [drm] GT0: GUC: firmware production part
> check failure
> [    2.567077] i915 0000:00:04.0: [drm] *ERROR* GT0: GuC initialization failed
> -ENOEXEC
> [    2.567610] i915 0000:00:04.0: [drm] *ERROR* GT0: Enabling uc failed (-5)
> [    2.567949] i915 0000:00:04.0: [drm] *ERROR* GT0: Failed to initialize GPU,
> declaring it wedged!
> [    2.570106] i915 0000:00:04.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9
> by intel_gt_set_wedged_on_init+0x34/0x50 [i915]
> [    2.587048] [drm] Initialized i915 1.6.0 20230929 for 0000:00:04.0 on minor
> 1
> 
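
For reference, the fields in the second "load failed" line can be recomputed
from the raw status word in the first one. The sketch below assumes the bit
layout I believe the i915 driver uses for the GuC status register (reset in
bit 0, BootROM in bits 7:1, UKernel in bits 15:8, MIA in bits 18:16, Auth in
bits 31:30); treat that layout as an assumption. It does reproduce the values
in the log above, so the status word itself was read back consistently.

```python
def decode_guc_status(status: int) -> dict:
    """Split a raw GuC status word into the fields printed by the driver.

    The bit positions are an assumption, cross-checked only against the
    log lines quoted above.
    """
    return {
        "reset": status & 0x1,             # bit 0
        "bootrom": (status >> 1) & 0x7F,   # bits 7:1
        "ukernel": (status >> 8) & 0xFF,   # bits 15:8
        "mia": (status >> 16) & 0x07,      # bits 18:16
        "auth": (status >> 30) & 0x03,     # bits 31:30
    }


if __name__ == "__main__":
    for name, value in decode_guc_status(0x40000056).items():
        print(f"{name} = {value:#04x}")
```
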

If possible, it would be worth checking whether the card works properly on a
Linux host with QEMU. If it does, we can check whether QEMU has any special
quirks for these devices (I don't see one yet).

> Interestingly intel_gpu_top will interact with the card to a degree, it shows
> a render utilization (of 0%), but none of the other card capabilities. There
> are some very similar errors in Google which may suggest it may not be a
> bhyve/passthru issue, though it could be. I need to spin up a new VM with more
> bleeding edge linux (or maybe even Win11) to see if it can talk to the card.
> 
> https://github.com/intel-analytics/ipex-llm/issues/12122
> 
> I'll post if I get any further, but I suspect this is the end for now.
> 

Hmm, the issue you've posted is related to resizable BARs. I'm not familiar
with them but afaik, bhyve isn't able to emulate resizable BARs yet.
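
To see whether the guest is even offered the capability, one option is to walk
the PCIe extended capability list in the device's config space and look for
the Resizable BAR capability (extended capability ID 0x0015). This is a
minimal sketch: the helper name is made up, and the sysfs path in the usage
note is only assumed from the 0000:00:04.0 address in the log. On Linux,
reading past the standard header of the config file normally requires root.

```python
import struct
import sys

REBAR_EXT_CAP_ID = 0x0015  # PCIe Resizable BAR extended capability


def find_ext_capability(config: bytes, cap_id: int):
    """Return the config-space offset of an extended capability, or None."""
    offset = 0x100  # extended capabilities start right after legacy space
    seen = set()    # guards against a looping capability list
    while 0x100 <= offset <= len(config) - 4 and offset not in seen:
        seen.add(offset)
        (header,) = struct.unpack_from("<I", config, offset)
        if header in (0, 0xFFFFFFFF):
            return None  # no (further) extended capabilities
        if header & 0xFFFF == cap_id:
            return offset
        # Bits 31:20 hold the next offset; the low two bits are reserved.
        offset = (header >> 20) & 0xFFC
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        found = find_ext_capability(fh.read(), REBAR_EXT_CAP_ID)
    print("Resizable BAR capability:",
          "not found" if found is None else hex(found))
```

In the guest this would be run as root with
/sys/bus/pci/devices/0000:00:04.0/config as the argument. If only the first
256 bytes are readable, the function returns None, which then says nothing
about the device.
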

Btw, QEMU has some support for resizable BARs, so it might be worth giving it
a try:

https://gitlab.com/qemu-project/qemu/-/commit/b5048a4cbfa0362abc720b5198fe9a35441bf5fe

> Peter.
> -- 
> Peter Wood
> peter@alastria.net
> 

-- 
Kind regards,
Corvin