i915: RCS timing out when being idled

From: obiwac <obiwac_at_gmail.com>
Date: Sat, 31 Dec 2022 17:57:51 UTC

Hey,

I didn't find a more appropriate mailing list to post to, so here it goes:

I'm running what is essentially FreeBSD-CURRENT on an Asus C300 Chromebook
(iGPU, gen 7).

Building branch 5.10-lts (or HEAD on main for that matter) of drm-kmod and
loading i915kms results in a wedged GPU, after a call to
intel_gt_wait_for_idle fails in __engines_record_defaults in
drivers/gpu/drm/i915/gt/intel_gt.c.

It fails while waiting for the RCS0 engine to idle. Skipping RCS0 when
loading the new contexts (i.e. adding a few 'if (id == RCS0) continue;'
lines to ignore it) allows the GPU to continue initialisation without
wedging. This isn't ideal though, because the RCS ends up unhappy after a
bit of load and the GPU hangs (the RCS engine crashes) :P
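
For illustration, the workaround amounts to something like the sketch below.
It's a standalone toy rather than the driver code itself: the engine list,
the function name and the 'touched' array are all made up for the example,
and only the 'if (id == RCS0) continue;' line mirrors the actual change.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical engine IDs for this sketch; the real driver has its own. */
enum engine_id { RCS0, BCS0, VCS0, NUM_ENGINES };

/*
 * Stand-in for a per-engine setup loop (assumed shape, not driver code).
 * Marks each engine that gets set up in touched[] and returns how many
 * were set up. With skip_rcs0 set, RCS0 is left alone.
 */
static size_t record_defaults_sketch(bool skip_rcs0, bool touched[NUM_ENGINES])
{
	size_t count = 0;

	for (int id = 0; id < NUM_ENGINES; id++) {
		touched[id] = false;

		if (skip_rcs0 && id == RCS0)
			continue; /* the workaround: ignore the render engine */

		touched[id] = true; /* real code would load a context here */
		count++;
	}

	return count;
}
```

With skip_rcs0 set, only the non-render engines are set up, which is why
nothing ever has to wait for RCS0 to idle during initialisation.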

I guess the two issues (the wedging and the hang) could be unrelated, but I
have a strong suspicion that they are related.

I've been trying to understand how the whole i915 stuff is architected for
a couple of days now (Intel's vocabulary is very confusing ngl), but there
are a few things I can't really wrap my head around, which leads me to ask
for help debugging here 😄

Is there anything else I can try in terms of troubleshooting, or anyone
else I can contact for help? If not, I wouldn't really mind attempting to
understand everything through and through and fixing the issue myself, if
there were someone I could ask for a couple of short explanations on bits
of the driver ;)

Last I checked, everything was working well with drm-tip on Linux.

Kind regards and a happy new year,
Aymeric 🥂