git: b339ab149105 - main - ciss: Don't panic on null CR ciss_dequeue_notify
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Mon, 14 Oct 2024 05:41:02 UTC
The branch main has been updated by imp: URL: https://cgit.FreeBSD.org/src/commit/?id=b339ab1491055d89415f85b6d1a03423193178f9 commit b339ab1491055d89415f85b6d1a03423193178f9 Author: Peter Eriksson <pen@lysator.liu.se> AuthorDate: 2024-10-14 04:01:33 +0000 Commit: Warner Losh <imp@FreeBSD.org> CommitDate: 2024-10-14 05:37:46 +0000 ciss: Don't panic on null CR ciss_dequeue_notify Apparently, sometimes on hot plug/unplug, a null cr comes back from ciss_dequeue_notify. This is clearly a bug, and by ignoring it we're papering over that bug. We only ever wake the thread after enqueing a notification or setting a bit about killing the thread, so once we check the bit isn't the cause, cr can't be NULL unless something else has dequeued it. Ideally, this would be fixed, rather than papered over, but this makes a very old card somewhat more useable for external enclosures. I suspect it's a race when we set CISS_THREAD_SHUT and another flag (the latter w/o ciss_mtx held), but I don't see it and w/o hardware to reproduce it would be hard to know for sure. PR: 246279 Reviewed by: imp Tested by: Marek Zarychta Differential Revision: https://reviews.freebsd.org/D25155 --- sys/dev/ciss/ciss.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/sys/dev/ciss/ciss.c b/sys/dev/ciss/ciss.c index d4786302d928..d4ede91f6b35 100644 --- a/sys/dev/ciss/ciss.c +++ b/sys/dev/ciss/ciss.c @@ -4207,8 +4207,20 @@ ciss_notify_thread(void *arg) cr = ciss_dequeue_notify(sc); - if (cr == NULL) - panic("cr null"); + if (cr == NULL) { + /* + * We get a NULL message sometimes when unplugging/replugging + * stuff But this indicates a bug, since we only wake this thread + * when we (a) set the THREAD_SHUT flag, or (b) we have enqueued + * something. Since it's reported around errors, it may be a + * locking bug related to ciss_flags being modified in multiple + * threads some without ciss_mtx held. Or there's some other + * way we either fail to sleep or corrupt the ciss_flags. + */ + ciss_printf(sc, "Driver bug: NULL notify event received\n"); + continue; + } + cn = (struct ciss_notify *)cr->cr_data; switch (cn->class) {