git: ee373c1234d3 - stable/14 - acpi_ged: Handle events directly

From: Andrew Gallatin <gallatin_at_FreeBSD.org>
Date: Mon, 28 Oct 2024 23:03:24 UTC
The branch stable/14 has been updated by gallatin:

URL: https://cgit.FreeBSD.org/src/commit/?id=ee373c1234d33d68e91fbe51f846e265cbce9e72

commit ee373c1234d33d68e91fbe51f846e265cbce9e72
Author:     Andrew Gallatin <gallatin@FreeBSD.org>
AuthorDate: 2023-10-12 15:15:06 +0000
Commit:     Andrew Gallatin <gallatin@FreeBSD.org>
CommitDate: 2024-10-28 23:02:24 +0000

    acpi_ged:  Handle events directly
    
    Handle ged interrupts directly from the interrupt handler,
    while the interrupt source is masked, so as to conform
    with the acpi spec, and avoid spurious interrupts and
    lockups on boot.
    
    When an acpi ged interrupt is encountered, the spec requires
    the os (as stated in 5.6.4: General Purpose Event Handling)
    to leave the interrupt source masked until it runs the
    EOI handler.  This is not a good fit for our method of
    queuing the work (including the EOI ack of the interrupt),
    via the AcpiOsExecute() taskqueue mechanism.
    
    Note this fixes a bug where an arm64 server could lock up if
    it encountered a ged interrupt at boot.  The lockup was
    due to running on a single core (due to arm64 not using
    EARLY_AP_STARTUP), and due to that core encountering a
    new interrupt each time the interrupt handler unmasked
    the interrupt source, and having the EOI queued on a taskqueue
    which never got a chance to run. This is also possible
    on any platform when using just a single processor.
    The symptom of this is a lockup at boot, with:
    "AcpiOsExecute: failed to enqueue task, consider
    increasing the debug.acpi.max_tasks tunable" scrolling
    on console.
    
    Similarly, spurious interrupts would occur when running
    with multiple cores, because it was likely that the
    interrupt would fire again immediately, before the
    ged task could be run, and before an EOI could be sent
    to lower the interrupt line.  I would typically see
    3-5 copies of every ged event due to this issue.
    
    This adds a tunable, debug.acpi.ged_defer, which can be
    set to 1 to restore the old behavior.  This was done
    because acpi is a complex system, and it may be
    theoretically possible something the ged handler does
    may sleep (though I cannot easily find anthing by inspection).
    
    MFC after: 1 month
    Reviewed by: andrew, jhb, imp
    Sponsored by: Netflix
    Differential Revision: https://reviews.freebsd.org/D42158
    
    (cherry picked from commit be91b4797e2c8f3440f6fe3aac7e246886f9ebca)
---
 sys/dev/acpica/acpi_ged.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/sys/dev/acpica/acpi_ged.c b/sys/dev/acpica/acpi_ged.c
index e7dcc1ffb0ac..23e125f277c5 100644
--- a/sys/dev/acpica/acpi_ged.c
+++ b/sys/dev/acpica/acpi_ged.c
@@ -81,6 +81,11 @@ static driver_t acpi_ged_driver = {
 DRIVER_MODULE(acpi_ged, acpi, acpi_ged_driver, 0, 0);
 MODULE_DEPEND(acpi_ged, acpi, 1, 1, 1);
 
+static int acpi_ged_defer;
+SYSCTL_INT(_debug_acpi, OID_AUTO, ged_defer, CTLFLAG_RWTUN,
+    &acpi_ged_defer, 0,
+    "Handle ACPI GED via a task, rather than in the ISR");
+
 static void
 acpi_ged_evt(void *arg)
 {
@@ -92,7 +97,12 @@ acpi_ged_evt(void *arg)
 static void
 acpi_ged_intr(void *arg)
 {
-	AcpiOsExecute(OSL_GPE_HANDLER, acpi_ged_evt, arg);
+	struct acpi_ged_event *evt = arg;
+
+	if (acpi_ged_defer)
+		AcpiOsExecute(OSL_GPE_HANDLER, acpi_ged_evt, arg);
+	else
+		AcpiEvaluateObject(evt->ah, NULL, &evt->args, NULL);
 }
 static int
 acpi_ged_probe(device_t dev)