git: 893be9d8ac16 - main - sleepqueue: Address a lock order reversal
Date: Mon, 14 Feb 2022 15:07:02 UTC
The branch main has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=893be9d8ac161c4cc96e9f3f12f1260355dd123b

commit 893be9d8ac161c4cc96e9f3f12f1260355dd123b
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-02-14 14:38:53 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-02-14 15:06:47 +0000

    sleepqueue: Address a lock order reversal

    After commit 74cf7cae4d22 ("softclock: Use dedicated ithreads for
    running callouts."), there is a lock order reversal between the
    per-CPU callout lock and the scheduler lock.  softclock_thread()
    locks the callout lock and then the scheduler lock when preparing
    to switch off-CPU, while sleepq_remove_thread() stops the timed
    sleep callout while potentially holding a scheduler lock.  In the
    latter case, it is the thread itself that is locked: if the thread
    is sleeping, its lock will be a sleepqueue lock, but if it is still
    in the process of going to sleep, it will be a scheduler lock.

    We could perhaps change softclock_thread() to try to acquire locks
    in the opposite order, but that would require dropping and
    re-acquiring the callout lock, which seems expensive for an
    operation that happens quite frequently.  Instead, we can avoid
    stopping the td_slpcallout callout if the thread is still going to
    sleep, which is what this patch does.  This will result in a
    spurious call to sleepq_timeout(), but some counters suggest that
    this is very rare.

    PR:             261198
    Fixes:          74cf7cae4d22 ("softclock: Use dedicated ithreads for running callouts.")
    Reported and tested by: thj
    Reviewed by:    kib
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D34204
---
 sys/kern/subr_sleepqueue.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/sys/kern/subr_sleepqueue.c b/sys/kern/subr_sleepqueue.c
index 36832ef96ba4..af5a001b46fb 100644
--- a/sys/kern/subr_sleepqueue.c
+++ b/sys/kern/subr_sleepqueue.c
@@ -833,7 +833,8 @@ sleepq_remove_thread(struct sleepqueue *sq, struct thread *td)
 	td->td_sleepqueue = LIST_FIRST(&sq->sq_free);
 	LIST_REMOVE(td->td_sleepqueue, sq_hash);
 
-	if ((td->td_flags & TDF_TIMEOUT) == 0 && td->td_sleeptimo != 0)
+	if ((td->td_flags & TDF_TIMEOUT) == 0 && td->td_sleeptimo != 0 &&
+	    td->td_lock == &sc->sc_lock) {
 		/*
 		 * We ignore the situation where timeout subsystem was
 		 * unable to stop our callout.  The struct thread is
@@ -843,8 +844,16 @@ sleepq_remove_thread(struct sleepqueue *sq, struct thread *td)
 		 * sleepq_timeout() ensure that the thread does not
 		 * get spurious wakeups, even if the callout was reset
 		 * or thread reused.
+		 *
+		 * We also cannot safely stop the callout if a scheduler
+		 * lock is held since softclock_thread() forces a lock
+		 * order of callout lock -> scheduler lock.  The thread
+		 * lock will be a scheduler lock only if the thread is
+		 * preparing to go to sleep, so this is hopefully a rare
+		 * scenario.
 		 */
 		callout_stop(&td->td_slpcallout);
+	}
 
 	td->td_wmesg = NULL;
 	td->td_wchan = NULL;
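For readers less familiar with this class of bug, the sketch below is a minimal, self-contained userspace analogy of the reversal described above; it is not kernel code. The mutexes callout_lock and scheduler_lock and the functions softclock_analogue() and wakeup_analogue() are invented stand-ins for the per-CPU callout lock, a thread's scheduler lock, softclock_thread(), and sleepq_remove_thread(). Because the two threads take the same pair of locks in opposite orders, they eventually deadlock, which is the hazard a lock order reversal warns about.

/*
 * Userspace analogy of the lock order reversal (illustrative names).
 * Thread A takes callout_lock -> scheduler_lock, as softclock_thread()
 * does; thread B takes scheduler_lock -> callout_lock, as
 * sleepq_remove_thread() could before this change.  Run long enough,
 * the two threads deadlock.  Build with: cc lor.c -lpthread
 */
#include <pthread.h>

static pthread_mutex_t callout_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t scheduler_lock = PTHREAD_MUTEX_INITIALIZER;

/* Analogue of softclock_thread(): callout lock, then scheduler lock. */
static void *
softclock_analogue(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_mutex_lock(&callout_lock);
		pthread_mutex_lock(&scheduler_lock);	/* switch off-CPU */
		pthread_mutex_unlock(&scheduler_lock);
		pthread_mutex_unlock(&callout_lock);
	}
}

/*
 * Analogue of sleepq_remove_thread() acting on a thread that is still
 * going to sleep: the thread lock (a scheduler lock here) is held, and
 * stopping the sleep callout then requires the callout lock.
 */
static void *
wakeup_analogue(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_mutex_lock(&scheduler_lock);
		pthread_mutex_lock(&callout_lock);	/* callout_stop() */
		pthread_mutex_unlock(&callout_lock);
		pthread_mutex_unlock(&scheduler_lock);
	}
}

int
main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, softclock_analogue, NULL);
	pthread_create(&t2, NULL, wakeup_analogue, NULL);
	pthread_join(t1, NULL);	/* never returns once the threads deadlock */
	return (0);
}

In this analogy, the committed fix corresponds to wakeup_analogue() skipping the callout_lock acquisition entirely when it already holds scheduler_lock, accepting that the (here imaginary) timeout may fire spuriously; that is the trade-off the commit message describes.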