Re: git: 5844bd058aed - main - jobc: rework detection of orphaned groups.
Date: Mon, 01 May 2023 04:01:44 UTC
On Sat, Jan 9, 2021 at 8:43 PM Konstantin Belousov <kib@freebsd.org> wrote: > > The branch main has been updated by kib: > > URL: https://cgit.FreeBSD.org/src/commit/?id=5844bd058aed6f3d0c8cbbddd6aa95993ece0189 > > commit 5844bd058aed6f3d0c8cbbddd6aa95993ece0189 > Author: Konstantin Belousov <kib@FreeBSD.org> > AuthorDate: 2020-12-29 00:41:56 +0000 > Commit: Konstantin Belousov <kib@FreeBSD.org> > CommitDate: 2021-01-10 02:41:20 +0000 > > jobc: rework detection of orphaned groups. > > Instead of trying to maintain pg_jobc counter on each process group > update (and sometimes before), just calculate the counter when needed. > Still, for the benefit of the signal delivery code, explicitly mark > orphaned groups as such with the new process group flag. > > This way we prevent bugs in the corner cases where updates to the counter > were missed due to complicated configuration of p_pptr/p_opptr/real_parent > (debugger). > > Since we need to iterate over all children of the process on exit, this > change mostly affects the process group entry and leave, where we need > to iterate all process group members to detect orpaned status. > Hi, I have a question about the locking here... > [... snip ...] > @@ -678,43 +677,40 @@ jobc_reaper(struct proc *p) > } > > static struct proc * > -jobc_parent(struct proc *p) > +jobc_parent(struct proc *p, struct proc *p_exiting) > { > struct proc *pp; > > - sx_assert(&proctree_lock, SX_LOCKED); > + sx_assert(&proctree_lock, SA_LOCKED); > > pp = proc_realparent(p); > - if (pp->p_pptr == NULL || > + if (pp->p_pptr == NULL || pp == p_exiting || > (pp->p_treeflag & P_TREE_GRPEXITED) == 0) > return (pp); > return (jobc_reaper(pp)); > } > > -#ifdef INVARIANTS > -static void > -check_pgrp_jobc(struct pgrp *pgrp) > +static int > +pgrp_calc_jobc(struct pgrp *pgrp) > { > struct proc *q; > int cnt; > > - sx_assert(&proctree_lock, SX_LOCKED); > - PGRP_LOCK_ASSERT(pgrp, MA_NOTOWNED); > +#ifdef INVARIANTS > + if (!mtx_owned(&pgrp->pg_mtx)) > + sx_assert(&proctree_lock, SA_LOCKED); > +#endif > > cnt = 0; > - PGRP_LOCK(pgrp); > LIST_FOREACH(q, &pgrp->pg_members, p_pglist) { > if ((q->p_treeflag & P_TREE_GRPEXITED) != 0 || > q->p_pptr == NULL) > continue; > - if (isjobproc(jobc_parent(q), pgrp)) > + if (isjobproc(jobc_parent(q, NULL), pgrp)) > cnt++; > } > - KASSERT(pgrp->pg_jobc == cnt, ("pgrp %d %p pg_jobc %d cnt %d", > - pgrp->pg_id, pgrp, pgrp->pg_jobc, cnt)); > - PGRP_UNLOCK(pgrp); > + return (cnt); > } > -#endif > > /* > * Move p to a process group > [... snip ...] So the proctree lock is sufficient for pgrp_calc_jobc() to provide a stable answer needed for everything fixjobc_kill() uses it for... > @@ -934,35 +827,46 @@ fixjobc_kill(struct proc *p) > p->p_treeflag |= P_TREE_GRPEXITED; > > /* > - * Check p's parent to see whether p qualifies its own process > - * group; if so, adjust count for p's process group. > + * Check if exiting p orphans its own group. > */ > - if (isjobproc(jobc_parent(p), pgrp)) > - pgadjustjobc(pgrp, false); > + pgrp = p->p_pgrp; > + if (isjobproc(jobc_parent(p, NULL), pgrp)) { > + PGRP_LOCK(pgrp); > + if (pgrp_calc_jobc(pgrp) == 0) > + orphanpg(pgrp); > + PGRP_UNLOCK(pgrp); > + } > > /* > * Check this process' children to see whether they qualify > - * their process groups after reparenting to reaper. If so, > - * adjust counts for children's process groups. > + * their process groups after reparenting to reaper. > */ > LIST_FOREACH(q, &p->p_children, p_sibling) { > - if ((q->p_treeflag & P_TREE_ORPHANED) != 0) > - continue; > - fixjobc_kill_q(p, q, true); > + pgrp = q->p_pgrp; > + PGRP_LOCK(pgrp); > + if (pgrp_calc_jobc(pgrp) == 0) { > + /* > + * We want to handle exactly the children that > + * has p as realparent. Then, when calculating > + * jobc_parent for children, we should ignore > + * P_TREE_GRPEXITED flag already set on p. > + */ > + if (jobc_parent(q, p) == p && isjobproc(p, pgrp)) > + orphanpg(pgrp); > + } else > + pgrp->pg_flags &= ~PGRP_ORPHANED; > + PGRP_UNLOCK(pgrp); > } > - LIST_FOREACH(q, &p->p_orphans, p_orphan) > - fixjobc_kill_q(p, q, true); > - LIST_FOREACH(q, &p->p_children, p_sibling) { > - if ((q->p_treeflag & P_TREE_ORPHANED) != 0) > - continue; > - fixjobc_kill_q(p, q, false); > + LIST_FOREACH(q, &p->p_orphans, p_orphan) { > + pgrp = q->p_pgrp; > + PGRP_LOCK(pgrp); > + if (pgrp_calc_jobc(pgrp) == 0) { > + if (isjobproc(p, pgrp)) > + orphanpg(pgrp); > + } else > + pgrp->pg_flags &= ~PGRP_ORPHANED; > + PGRP_UNLOCK(pgrp); > } > - LIST_FOREACH(q, &p->p_orphans, p_orphan) > - fixjobc_kill_q(p, q, false); > - > -#ifdef INVARIANTS > - check_pgrp_jobc(pgrp); > -#endif > } > > void ... and I would imagine the proctree lock is sufficient for isjobproc/jobc_parent as well- is there any reason we can't/shouldn't just use atomic(9) for operations with pgrp->pg_flags and push all of the extra lock acquisitions into the orphanpg() cases? It seems like we could avoid taking any pgrp lock in the common case and at least mitigate that additional overhead from the exit() path. Thanks, Kyle Evans