cam_periph, and locking?

Matthew Jacob mj at feral.com
Sat Apr 10 16:30:48 UTC 2010


This subject seems to have petered out a bit.....

Where are we on the locking for the list? I personally like Alexander's 
unit_lock change.

On my own front, some work priorities shifted, so I haven't (yet) 
finished a lot of the test-to-destruction stuff, but I do have some 
findings and some (partial, incomplete) remedies.

Here are my notes from the other day on this. Bear with me: they 
aren't the most polished, as this is WIP. Comments welcome.


A) Four basic problems

+ Periph invalidation can occur after a periph_find. Not all calls are 
protected by a sim lock.

+ The probe state machine can (sometimes) continue despite a failure 
that caused a periph invalidation.

+ Some of the periph driver callbacks (dasysctlinit, some side effects 
of disk_create) are not cognizant of periph invalidation and blindly use 
pointers, etc.

+ Periph invalidation *during* probe can lead to a reference after free 
or a bad reference (panics).

Note that some of this stuff is not really affected by locking.

(Minor addendum: cam_periph_release_locked can cause the ref count to go 
negative.)
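
To make the addendum concrete, here is a minimal user-space sketch of a 
release path that refuses to drop the count below zero. Everything in it 
(struct toy_periph, toy_release) is a hypothetical stand-in, not the 
actual cam_periph code; in the kernel this would run under the 
appropriate lock and might well assert rather than return an error.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for struct cam_periph; only the refcount matters. */
struct toy_periph {
	int refcount;
};

/*
 * Drop one reference. Returns 0 on success, or -1 if the caller tried
 * to release a reference it never held, which would otherwise drive
 * the count negative.
 */
static int
toy_release(struct toy_periph *p)
{
	assert(p != NULL);
	if (p->refcount <= 0) {
		fprintf(stderr, "toy_release: refcount already %d\n",
		    p->refcount);
		return (-1);
	}
	p->refcount--;
	return (0);
}
```

The point is only that an unbalanced release gets caught at the call 
that causes it, rather than showing up later as a bogus count.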

B) Remedies

=> periph_find bumps a refcount (this has obvious MFC and other 
implications, as you have to have the caller remember to release)

=> the probe periph driver should do a periph_hold so that the periph 
doesn't disappear until the periph driver explicitly unholds it

=> periph drivers must not use callbacks that just have pointers to an 
unheld periph structure.
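
The three remedies above can be sketched together in user-space C. This 
is a toy model under stated assumptions, not the CAM implementation: 
struct demo_periph, DEMO_INVALID, and the demo_* functions are all 
hypothetical names, locking is omitted, and whether a find should skip 
already-invalidated entries is my assumption for the sketch.

```c
#include <assert.h>
#include <string.h>

#define DEMO_INVALID	0x01	/* hypothetical "periph invalidated" flag */

/* Hypothetical stand-in for struct cam_periph. */
struct demo_periph {
	const char *name;
	int unit;
	int refcount;
	int flags;
	int freed;	/* set instead of calling free(), so it can be inspected */
	struct demo_periph *next;
};

static struct demo_periph *demo_list;

/* Look up by name/unit. On success the caller owns one reference. */
static struct demo_periph *
demo_find_ref(const char *name, int unit)
{
	struct demo_periph *p;

	for (p = demo_list; p != NULL; p = p->next) {
		if (p->unit == unit && strcmp(p->name, name) == 0 &&
		    (p->flags & DEMO_INVALID) == 0) {
			p->refcount++;
			return (p);
		}
	}
	return (NULL);
}

/* Mark invalid; teardown is deferred until the last reference drops. */
static void
demo_invalidate(struct demo_periph *p)
{
	p->flags |= DEMO_INVALID;
	if (p->refcount == 0)
		p->freed = 1;
}

/* Drop a reference taken by demo_find_ref (or by a hold during probe). */
static void
demo_release(struct demo_periph *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0 && (p->flags & DEMO_INVALID) != 0)
		p->freed = 1;
}

/* A callback that holds a reference and checks validity before use. */
static int
demo_callback(struct demo_periph *p)
{
	if ((p->flags & DEMO_INVALID) != 0)
		return (-1);	/* periph went away; touch nothing else */
	return (0);
}
```

With this shape, an invalidation that lands between the find and the 
callback leaves the structure allocated until the caller releases it, 
and the callback backs out instead of blindly using stale pointers.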

With these changes in place, my simulated unit test ran much better. It 
still ended up with a bug where cam_periph_runccb never came back, but 
at least I wasn't instantly stuck in panics and refs after free like I 
was before.


