svn commit: r355862 - stable/12/sys/cam/scsi
Kenneth D. Merry
ken at FreeBSD.org
Tue Dec 17 20:29:47 UTC 2019
Author: ken
Date: Tue Dec 17 20:29:47 2019
New Revision: 355862
URL: https://svnweb.freebsd.org/changeset/base/355862
Log:
MFC r355299:
------------------------------------------------------------------------
r355299 | ken | 2019-12-02 14:57:39 -0500 (Mon, 02 Dec 2019) | 52 lines
Fix a hang introduced in r351599.
My changes in 351599 (kindly committed by avg) made the cd(4) media check
asynchronous to avoid a sleep while holding a mutex.
There was a difficult to reproduce bug with those changes that caused a
hang on boot on some single processor machines/VMs. Leandro Lupori
managed to reproduce the bug, diagnose it, and supplied a patch! Here is
his analysis, from the PR:
======
I was able to reproduce the problem described in comment#14.
Actually, I wasn't trying to reproduce it, I just started seeing it a few
weeks ago, in CURRENT.
I can reproduce it consistently, by using QEMU to run a PowerPC64 VM with a
single core/thread (-smp 1).
It happens only when there is no media in the emulated CD-ROM, a device
that QEMU adds by default, unless -nodefaults is specified in command line.
I've debugged it and this is what I've found:
1- After the CD probe is successful, GEOM will try to open the device,
which will end up calling cdcheckmedia(), that sets CD state to
CD_STATE_MEDIA_PREVENT.
2- Next, scsi_prevent() is executed and succeeds, the CD_FLAG_DISC_LOCKED
flag is set and CD state moves to CD_STATE_MEDIA_SIZE.
3- Next, scsi_read_capacity() is executed and fails, state is set to
CD_STATE_MEDIA_ALLOW, cdmediaprobedone() is called and wakes up
cdcheckmedia().
4- Then, when cdstart() is invoked to process CD_STATE_MEDIA_ALLOW, it
first checks if CD_FLAG_DISC_LOCKED is set, and if so skips directly to
CD_STATE_MEDIA_SIZE state. This will repeat the steps of bullet 3, entering
an infinite MEDIA_SIZE command loop.
When there is a least another core/thread, the GEOM thread that performed
the initial cdopen() will get scheduled again, closing the CD device, that
will call cdprevent(PR_ALLOW) that clears the CD_FLAG_DISC_LOCKED flag and
breaks the loop.
So, apparently, the problem is CD_STATE_MEDIA_ALLOW being skipped when
CD_FLAG_DISC_LOCKED is set. If I understand correctly, in this case, the
state should be advanced to CD_STATE_MEDIA size only when the current state
is CD_STATE_MEDIA_PREVENT.
=====
------------------------------------------------------------------------
PR: kern/219857
Submitted by: Leandro Lupori <leandro.lupori at gmail.com>
Modified:
stable/12/sys/cam/scsi/scsi_cd.c
Directory Properties:
stable/12/ (props changed)
Modified: stable/12/sys/cam/scsi/scsi_cd.c
==============================================================================
--- stable/12/sys/cam/scsi/scsi_cd.c Tue Dec 17 19:01:09 2019 (r355861)
+++ stable/12/sys/cam/scsi/scsi_cd.c Tue Dec 17 20:29:47 2019 (r355862)
@@ -1030,7 +1030,8 @@ cdstart(struct cam_periph *periph, union ccb *start_cc
* If the CD is already locked, we don't need to do this.
* Move on to the capacity check.
*/
- if ((softc->flags & CD_FLAG_DISC_LOCKED) != 0) {
+ if (softc->state == CD_STATE_MEDIA_PREVENT
+ && (softc->flags & CD_FLAG_DISC_LOCKED) != 0) {
softc->state = CD_STATE_MEDIA_SIZE;
xpt_release_ccb(start_ccb);
xpt_schedule(periph, CAM_PRIORITY_NORMAL);
More information about the svn-src-stable-12
mailing list