cpuset_setithread and PROC_LOCK vs. thread_lock usage: Is this correct?
Konstantin Belousov
kostikbel at gmail.com
Tue May 30 09:20:21 UTC 2017
On Mon, May 29, 2017 at 06:48:28PM -0700, Mark Millard wrote:
> [FYI: The source referenced is from head
> -r317820.]
>
> In the process of trying to get evidence
> related to a periodic/random kernel panic
> for TARGET_ARCH=powerpc I ran into what
> follows. I've no evidence that it is tied
> to the panics. It is just something that
> I read that looked a little odd to me.
>
> But it looked like a good thing to ask
> about, at least based on my level of
> ignorance of such things for the kernel.
> (The more I learn the better.)
>
> I normally see the likes of:
>
> thread_lock(td);
> tdset = td->td_cpuset;
>
> and:
>
> thread_lock(td);
> error = cpuset_shadow(td->td_cpuset, nset, mask);
>
> but in cpuset_setithread the comments read like
> a PROC_LOCK is in use instead:
>
> int
> cpuset_setithread(lwpid_t id, int cpu)
> {
> . . .
> struct proc *p;
> . . .
> error = cpuset_which(CPU_WHICH_TID, id, &p, &td, &old_set);
> if (error != 0 || ((cs_id = alloc_unr(cpuset_unr)) == CPUSET_INVALID))
> goto out;
>
> /* cpuset_which() returns with PROC_LOCK held. */
> old_set = td->td_cpuset;
> . . .
>
> That seems true for !error but is a different
> lock than used elsewhere (those thread_lock's).
>
> cpuset_which does seem to end up with the PROC_LOCK
> as indicated:
>
> int
> cpuset_which(cpuwhich_t which, id_t id, struct proc **pp, struct thread **tdp,
> struct cpuset **setp)
> {
> . . .
> case CPU_WHICH_TID:
> if (id == -1) {
> PROC_LOCK(curproc);
> p = curproc;
> td = curthread;
> break;
> }
> td = tdfind(id, -1);
> if (td == NULL)
> return (ESRCH);
> p = td->td_proc;
> break;
> . . .
> error = p_cansched(curthread, p);
> if (error) {
> PROC_UNLOCK(p);
> return (error);
> }
> if (td == NULL)
> td = FIRST_THREAD_IN_PROC(p);
> *pp = p;
> *tdp = td;
> return (0);
> }
>
> (tdfind does rw_rlock on tidhash_lock,
> PROC_LOCK on td->td_proc. It only
> PROC_UNLOCK's for PRS_NEW and
> is comments to return with the proc
> lock held. No thread_lock's involved.)
>
> So it appears to me that in cpuset_setithread :
>
> /* cpuset_which() returns with PROC_LOCK held. */
> old_set = td->td_cpuset;
>
> might happen with no thread_lock protecting the
> td use (say to avoid reading during updates). (This
> might also apply to what old_set points to.)
>
> Similarly for:
>
> cpuset_setithread(lwpid_t id, int cpu)
> {
> . . .
> struct cpuset *parent, *old_set;
> . . .
> error = cpuset_shadow(parent, nset, &mask);
> . . .
>
> it does not appear that a thread_lock is involved
> around the cpuset_shadow operation, unlike
> elsewhere.
>
> Is this lack of thread_lock involvement in fact
> okay for some reason in each case?
This is a comment noting the lock which is held there. The pattern where
some function is entered unlocked and returns with some lock held is not
uncommon but also not used too often, so the note to reader helps.
On the other hand, the note does not imply that the action of reading
td_cpuset is protected by the proc lock, it happens accidently under it.
Read of the pointer is atomic on all supported architectures, so taking
the thread lock around the read is useless, unless some consistency
between the pointer value and some other values are needed.
The thread lock only protects updates to the td_cpuset thread member,
it does not protect the dereferenced structure. Look for the comment in
sys/cpuset.h for explanation of the locking model for struct cpuset.
In this case, I suspect that what is needed is a ref count on the
dereferenced cpuset, at least it seems so according to the locking
annotations on struct cpuset.
But if the td_cpuset for ithread is only manipulated by the
cpuset_setithread(), then the process lock exclusion might provide the
sufficient protection against parallel updates (preventing leaks), and
even more important, prevents sudden deconstruction of old_set while
cpuset_setithread() uses it to set a new one.
More information about the freebsd-ppc
mailing list