[Bug 270966] PCI passthru stops working after ~30 guest reboots (ivhd, ILLEGAL CMD, IO_PAGE_FAULT)
Date: Sun, 27 Aug 2023 15:48:50 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270966

Santiago Martinez <sm@codenetworks.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sm@codenetworks.net

--- Comment #16 from Santiago Martinez <sm@codenetworks.net> ---
Hi Raul,

I'm seeing the same issue on an AMD EPYC processor. Checking kernel.org
(Linux), it seems that they also had issues with AMD-Vi. In the Linux world,
many people use iommu=pt to work around this. It is also a known bug in the
Red Hat KB.

I'm running a script similar to yours, and the server behaves quite
erratically. My script does the following:

- Starts and stops a VM with a PCI passthrough device 200 times (in this case
  the device is an SR-IOV VF, but the behaviour is the same without SR-IOV,
  or with any other device, including non-network devices).
- After those 200 iterations, it reboots the server.
- When the server comes back up, it runs the script again.

Sometimes the script can start and stop the VM 200 times, even though I see
ivhd errors (command not completed or cmd error). Other times it can only
start and stop the VM once, and the server reboots after a few IO_PAGE_FAULT
events (something gets corrupted, the NVMe stops responding, and the machine
reboots after the command retry timeout).

The server showing the issue is a Supermicro H12SSW-NT.

- AMD EPYC 7552 48-Core Processor

I have updated the BIOS to the latest release, because on the Linux forum
they mentioned issues with the SP3.

Michael Dexter and I also tried to replicate it on other AMD processors,
without any success:

- AMD EPYC 7702P 64-Core Processor
- AMD Ryzen 7 3700X 8-Core Processor
- Ryzen 6800H

-- 
You are receiving this mail because:
You are the assignee for the bug.
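
For readers who want to try a similar reproduction, the start/stop loop described in comment #16 might be sketched roughly as below. This is not the script used in the report, which was not posted. The function names `command_runner` and `stress_guest` are hypothetical, and the actual commands that start and stop the guest are left to the caller because the report does not give them.

```python
import subprocess
from typing import Callable, Sequence


def command_runner(argv: Sequence[str]) -> Callable[[], bool]:
    """Wrap a command line in a callable that returns True on exit status 0.

    The command itself is supplied by the caller; nothing here assumes a
    particular hypervisor invocation.
    """
    def run() -> bool:
        return subprocess.run(list(argv)).returncode == 0
    return run


def stress_guest(start_vm: Callable[[], bool],
                 stop_vm: Callable[[], bool],
                 iterations: int = 200) -> int:
    """Start and stop a guest up to `iterations` times.

    Returns the number of complete start/stop cycles. The loop ends early
    as soon as either step reports failure, so a return value below
    `iterations` marks the cycle at which the guest stopped behaving.
    """
    completed = 0
    for _ in range(iterations):
        if not start_vm():
            break
        if not stop_vm():
            break
        completed += 1
    return completed
```

A caller could pass something like `command_runner(["./start-guest.sh"])` and `command_runner(["./stop-guest.sh"])`, where both script names are placeholders, then log the returned count and reboot the host to mirror the last two steps of the procedure. Keeping the start and stop steps as plain callables also allows the loop to be exercised without any passthrough hardware.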