cam_periph, and locking?
Matthew Jacob
mj at feral.com
Sat Apr 10 16:30:48 UTC 2010
This subject seems to have petered out a bit.....
Where are we on the locking for the list? I personally like Alexander's
unit_lock change.
On my own front, some work priorities shifted, so I haven't (yet)
finished a lot of the test-to-destruction work, but I do have some
findings and some (partial, incomplete) remedies.
Here are my notes from the other day on this. Bear with me: they aren't
the most polished, and this is still WIP. Comments welcome.
A) Four basic problems
+ Periph invalidation can occur after a periph_find. Not all calls are
protected by a sim lock.
+ The probe state machine can (sometimes) continue despite a failure
that caused a periph invalidation.
+ Some of the periph driver callbacks (dasysctlinit, some side effects
of disk_create) are not cognizant of periph invalidation and blindly use
pointers, etc.
+ Periph invalidation *during* probe can lead to a reference after free
or a bad reference (panics).
Note that some of this stuff is not really affected by locking.
(minor addendum- cam_periph_release_locked can cause the ref count to go
negative)
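
To illustrate the addendum only: the notes above don't say which path
drives the count negative, so this is a generic userland sketch, not the
kernel code. struct model_ref and both release functions are hypothetical
names made up for this example. It contrasts a blind decrement with one
that refuses to drop below zero and reports the imbalance to the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object; not struct cam_periph. */
struct model_ref {
	int refcount;
};

/* Unguarded release: an extra call silently drives the count negative. */
static void
model_release_unguarded(struct model_ref *r)
{
	r->refcount--;
}

/*
 * Guarded release: returns false instead of decrementing when there is
 * no reference left to drop, so an unbalanced release becomes visible.
 */
static bool
model_release_guarded(struct model_ref *r)
{
	if (r->refcount <= 0)
		return (false);
	r->refcount--;
	return (true);
}
```

In a kernel this check would more likely be an assertion than a return
value; the point is only that the imbalance gets caught at the release
that causes it rather than at some later dereference.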
B) Remedies
=> periph_find bumps a refcount (this has obvious MFC and other
implications, as the caller has to remember to release).
=> the probe periph driver should do a periph_hold so that the periph
doesn't disappear until the periph driver explicitly unholds it.
=> periph drivers can't use callbacks that just have pointers to an
unheld periph structure.
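
As a rough sketch of the first remedy (find hands back a counted
reference, and the structure is freed only when the last reference is
dropped), here is a single-threaded userland model. Everything prefixed
model_ is hypothetical and invented for this example; it is not the CAM
API, it ignores locking entirely, and it assumes the periph list itself
owns one reference until invalidation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_MAX_PERIPHS	8
#define MODEL_FLAG_INVALID	0x01

/* Hypothetical stand-in for struct cam_periph; not the kernel layout. */
struct model_periph {
	char	name[16];
	int	unit;
	int	refcount;
	int	flags;
};

static struct model_periph *model_list[MODEL_MAX_PERIPHS];
static int model_frees;		/* counts frees so callers can observe them */

/* Allocate a periph; the list holds the initial reference. */
static struct model_periph *
model_periph_alloc(const char *name, int unit)
{
	for (int i = 0; i < MODEL_MAX_PERIPHS; i++) {
		if (model_list[i] != NULL)
			continue;
		struct model_periph *p = calloc(1, sizeof(*p));
		if (p == NULL)
			return (NULL);
		strncpy(p->name, name, sizeof(p->name) - 1);
		p->unit = unit;
		p->refcount = 1;
		model_list[i] = p;
		return (p);
	}
	return (NULL);
}

/* Drop one reference; free the periph when the last one goes away. */
static void
model_periph_release(struct model_periph *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0) {
		free(p);
		model_frees++;
	}
}

/* Find by name/unit and take a reference the caller must release. */
static struct model_periph *
model_periph_find(const char *name, int unit)
{
	for (int i = 0; i < MODEL_MAX_PERIPHS; i++) {
		struct model_periph *p = model_list[i];
		if (p != NULL && p->unit == unit && strcmp(p->name, name) == 0) {
			p->refcount++;
			return (p);
		}
	}
	return (NULL);
}

/* Mark invalid, unlink from the list, and drop the list's reference. */
static void
model_periph_invalidate(struct model_periph *p)
{
	if (p->flags & MODEL_FLAG_INVALID)
		return;
	p->flags |= MODEL_FLAG_INVALID;
	for (int i = 0; i < MODEL_MAX_PERIPHS; i++) {
		if (model_list[i] == p)
			model_list[i] = NULL;
	}
	model_periph_release(p);
}
```

In this model a caller that did model_periph_find() before an
invalidation still has valid memory afterwards and can notice
MODEL_FLAG_INVALID; the free happens only at its own
model_periph_release(). The cost is the one already noted: every find
now needs a matching release.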
With these changes in place, my simulated unit test ran much better. I
still ended up with a bug where cam_periph_runccb never came back, but
at least I wasn't instantly stuck in panics and references after free
like I was before.