git: 55c3348ed78f - main - acpi_pci: Add quirk for DELAY-after-EJ0

From: Colin Percival <cperciva_at_FreeBSD.org>
Date: Fri, 14 Mar 2025 18:45:05 UTC
The branch main has been updated by cperciva:

URL: https://cgit.FreeBSD.org/src/commit/?id=55c3348ed78fb1d0891e8bb51a8948f95da3560b

commit 55c3348ed78fb1d0891e8bb51a8948f95da3560b
Author:     Colin Percival <cperciva@FreeBSD.org>
AuthorDate: 2025-03-06 05:22:33 +0000
Commit:     Colin Percival <cperciva@FreeBSD.org>
CommitDate: 2025-03-14 18:35:35 +0000

    acpi_pci: Add quirk for DELAY-after-EJ0
    
    On some EC2 instances, there is a race between removing a device from
    the system and making the PCI bus stop reporting the presence of the
    device.  As a result, a PCI BUS_RESCAN performed immediately after
    the _EJ0 method returns "sees" the device which is being ejected, which
    then causes problems later (e.g. we won't recognize a new device being
    plugged into that slot because we never knew it was vacant).
    
    On other operating systems the bus is synchronously marked as needing
    to be rescanned but the rescan does not occur until O(1) seconds later.
    
    Create a new ACPI_Q_DELAY_BEFORE_EJECT_RESCAN quirk and set it in EC2
    AMIs, and add a 10 ms DELAY between _EJ0 and BUS_RESCAN when that quirk
    is set.
    
    Reviewed by:    jhb
    MFC after:      1 month
    Sponsored by:   Amazon
    Differential Revision:  https://reviews.freebsd.org/D49252
---
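Note on the loader.conf value: debug.acpi.quirks is a bitmask, so the change
from "24" to "56" in ec2.conf appears to be the two previously set bits plus
the new one. The sketch below checks that arithmetic using only the quirk
definitions visible in the acpivar.h hunk. The helper names ec2_quirks_old()
and ec2_quirks_new() are hypothetical and exist only for illustration; they
are not part of the tree.

```c
#include <assert.h>

/* Quirk bits as they appear in the acpivar.h hunk of this commit. */
#define ACPI_Q_AEI_NOPULL			(1 << 3)
#define ACPI_Q_CLEAR_PME_ON_DETACH		(1 << 4)
#define ACPI_Q_DELAY_BEFORE_EJECT_RESCAN	(1 << 5)

/* Hypothetical helper: the mask ec2.conf wrote before this commit. */
static int
ec2_quirks_old(void)
{
	return (ACPI_Q_AEI_NOPULL | ACPI_Q_CLEAR_PME_ON_DETACH);
}

/* Hypothetical helper: the old mask with the new delay quirk added. */
static int
ec2_quirks_new(void)
{
	return (ec2_quirks_old() | ACPI_Q_DELAY_BEFORE_EJECT_RESCAN);
}

/* Compile-time checks against the decimal values used in ec2.conf. */
static_assert((ACPI_Q_AEI_NOPULL | ACPI_Q_CLEAR_PME_ON_DETACH) == 24,
    "old EC2 quirk mask should be 24");
static_assert((ACPI_Q_AEI_NOPULL | ACPI_Q_CLEAR_PME_ON_DETACH |
    ACPI_Q_DELAY_BEFORE_EJECT_RESCAN) == 56,
    "new EC2 quirk mask should be 56");
```

A system that wants only the new behavior could presumably set
debug.acpi.quirks="32" (bit 5 alone), assuming the tunable accepts any
combination of these bits.
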
 release/tools/ec2.conf    | 5 +++--
 sys/dev/acpica/acpi_pci.c | 2 ++
 sys/dev/acpica/acpivar.h  | 3 +++
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/release/tools/ec2.conf b/release/tools/ec2.conf
index a8fc3854a0e2..4f78e5913e56 100644
--- a/release/tools/ec2.conf
+++ b/release/tools/ec2.conf
@@ -75,8 +75,9 @@ ec2_common() {
 	# (in fact the PL061 has no pullup/pulldown resistors).  Graviton 1
 	# through Graviton 3 have non-functional PCI _EJ0 and need a value
 	# written to the PCI power status register in order to eject a
-	# device.
-	echo 'debug.acpi.quirks="24"' >> ${DESTDIR}/boot/loader.conf
+	# device.  EC2 instances with PCI (not PCIe) buses need a short
+	# delay before rescanning upon device detach.
+	echo 'debug.acpi.quirks="56"' >> ${DESTDIR}/boot/loader.conf
 
 	# Load the kernel module for the Amazon "Elastic Network Adapter"
 	echo 'if_ena_load="YES"' >> ${DESTDIR}/boot/loader.conf
diff --git a/sys/dev/acpica/acpi_pci.c b/sys/dev/acpica/acpi_pci.c
index 97704111839b..b7a2bf70b4e0 100644
--- a/sys/dev/acpica/acpi_pci.c
+++ b/sys/dev/acpica/acpi_pci.c
@@ -432,6 +432,8 @@ acpi_pci_device_notify_handler(ACPI_HANDLE h, UINT32 notify, void *context)
 			    acpi_name(h), AcpiFormatException(status));
 			return;
 		}
+		if (acpi_quirks & ACPI_Q_DELAY_BEFORE_EJECT_RESCAN)
+			DELAY(10 * 1000);
 		BUS_RESCAN(dev);
 		bus_topo_unlock();
 		break;
diff --git a/sys/dev/acpica/acpivar.h b/sys/dev/acpica/acpivar.h
index 830764434f48..d35504127c9c 100644
--- a/sys/dev/acpica/acpivar.h
+++ b/sys/dev/acpica/acpivar.h
@@ -232,6 +232,8 @@ extern struct mtx			acpi_mutex;
  *	as "PullUp" and they should be treated as "NoPull" instead.
  * ACPI_Q_CLEAR_PME_ON_DETACH: Specifies that PCIM_PSTAT_(PME & ~PMEENABLE)
  *	should be written to the power status register as part of ACPI Eject.
+ * ACPI_Q_DELAY_BEFORE_EJECT_RESCAN: Specifies that we need a short (10ms)
+ *	delay after _EJ0 returns before rescanning the PCI bus.
  */
 extern int	acpi_quirks;
 #define ACPI_Q_OK		0
@@ -240,6 +242,7 @@ extern int	acpi_quirks;
 #define ACPI_Q_MADT_IRQ0	(1 << 2)
 #define ACPI_Q_AEI_NOPULL	(1 << 3)
 #define ACPI_Q_CLEAR_PME_ON_DETACH	(1 << 4)
+#define ACPI_Q_DELAY_BEFORE_EJECT_RESCAN	(1 << 5)
 
 #if defined(__amd64__) || defined(__i386__)
 /*