From nobody Wed May 03 06:46:25 2023 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4QB6tM32nlz490v8; Wed, 3 May 2023 06:46:35 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4QB6tK554Dz4RWQ; Wed, 3 May 2023 06:46:33 +0000 (UTC) (envelope-from kostikbel@gmail.com) Authentication-Results: mx1.freebsd.org; dkim=none; spf=softfail (mx1.freebsd.org: 2001:470:d5e7:1::1 is neither permitted nor denied by domain of kostikbel@gmail.com) smtp.mailfrom=kostikbel@gmail.com; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=gmail.com (policy=none) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.17.1/8.17.1) with ESMTPS id 3436kPoJ052568 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Wed, 3 May 2023 09:46:28 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua 3436kPoJ052568 Received: (from kostik@localhost) by tom.home (8.17.1/8.17.1/Submit) id 3436kPEN052567; Wed, 3 May 2023 09:46:25 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 3 May 2023 09:46:25 +0300 From: Konstantin Belousov To: Kyle Evans Cc: src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Subject: Re: git: 5844bd058aed - main - jobc: rework detection of orphaned groups. Message-ID: References: <202101100241.10A2fvtm057663@gitrepo.freebsd.org> List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-all@freebsd.org X-BeenThere: dev-commits-src-all@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FROM, NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=4.0.0 X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-14) on tom.home X-Spamd-Result: default: False [-1.17 / 15.00]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_MEDIUM(-1.00)[-0.999]; NEURAL_SPAM_SHORT(0.83)[0.833]; MIME_GOOD(-0.10)[text/plain]; DMARC_POLICY_SOFTFAIL(0.10)[gmail.com : No valid SPF, No valid DKIM,none]; MLMMJ_DEST(0.00)[dev-commits-src-all@freebsd.org,dev-commits-src-main@freebsd.org]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[gmail.com]; FROM_EQ_ENVFROM(0.00)[]; ASN(0.00)[asn:6939, ipnet:2001:470::/32, country:US]; R_DKIM_NA(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; RCVD_TLS_LAST(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; FREEMAIL_FROM(0.00)[gmail.com]; R_SPF_SOFTFAIL(0.00)[~all:c]; HAS_XAW(0.00)[]; ARC_NA(0.00)[] X-Rspamd-Queue-Id: 4QB6tK554Dz4RWQ X-Spamd-Bar: - X-ThisMailContainsUnwantedMimeParts: N On Sun, Apr 30, 2023 at 11:01:44PM -0500, Kyle Evans wrote: > On Sat, Jan 9, 2021 at 8:43 PM Konstantin Belousov wrote: > > > > The branch main has been updated by kib: > > > > URL: https://cgit.FreeBSD.org/src/commit/?id=5844bd058aed6f3d0c8cbbddd6aa95993ece0189 > > > > commit 5844bd058aed6f3d0c8cbbddd6aa95993ece0189 > > Author: Konstantin Belousov > > AuthorDate: 2020-12-29 00:41:56 +0000 > > Commit: Konstantin Belousov > > CommitDate: 2021-01-10 02:41:20 +0000 > > > > jobc: rework detection of orphaned groups. > > > > Instead of trying to maintain pg_jobc counter on each process group > > update (and sometimes before), just calculate the counter when needed. > > Still, for the benefit of the signal delivery code, explicitly mark > > orphaned groups as such with the new process group flag. > > > > This way we prevent bugs in the corner cases where updates to the counter > > were missed due to complicated configuration of p_pptr/p_opptr/real_parent > > (debugger). > > > > Since we need to iterate over all children of the process on exit, this > > change mostly affects the process group entry and leave, where we need > > to iterate all process group members to detect orpaned status. > > > > Hi, > > I have a question about the locking here... > > > [... snip ...] > > @@ -678,43 +677,40 @@ jobc_reaper(struct proc *p) > > } > > > > static struct proc * > > -jobc_parent(struct proc *p) > > +jobc_parent(struct proc *p, struct proc *p_exiting) > > { > > struct proc *pp; > > > > - sx_assert(&proctree_lock, SX_LOCKED); > > + sx_assert(&proctree_lock, SA_LOCKED); > > > > pp = proc_realparent(p); > > - if (pp->p_pptr == NULL || > > + if (pp->p_pptr == NULL || pp == p_exiting || > > (pp->p_treeflag & P_TREE_GRPEXITED) == 0) > > return (pp); > > return (jobc_reaper(pp)); > > } > > > > -#ifdef INVARIANTS > > -static void > > -check_pgrp_jobc(struct pgrp *pgrp) > > +static int > > +pgrp_calc_jobc(struct pgrp *pgrp) > > { > > struct proc *q; > > int cnt; > > > > - sx_assert(&proctree_lock, SX_LOCKED); > > - PGRP_LOCK_ASSERT(pgrp, MA_NOTOWNED); > > +#ifdef INVARIANTS > > + if (!mtx_owned(&pgrp->pg_mtx)) > > + sx_assert(&proctree_lock, SA_LOCKED); > > +#endif > > > > cnt = 0; > > - PGRP_LOCK(pgrp); > > LIST_FOREACH(q, &pgrp->pg_members, p_pglist) { > > if ((q->p_treeflag & P_TREE_GRPEXITED) != 0 || > > q->p_pptr == NULL) > > continue; > > - if (isjobproc(jobc_parent(q), pgrp)) > > + if (isjobproc(jobc_parent(q, NULL), pgrp)) > > cnt++; > > } > > - KASSERT(pgrp->pg_jobc == cnt, ("pgrp %d %p pg_jobc %d cnt %d", > > - pgrp->pg_id, pgrp, pgrp->pg_jobc, cnt)); > > - PGRP_UNLOCK(pgrp); > > + return (cnt); > > } > > -#endif > > > > /* > > * Move p to a process group > > [... snip ...] > > So the proctree lock is sufficient for pgrp_calc_jobc() to provide a > stable answer needed for everything fixjobc_kill() uses it for... > > > @@ -934,35 +827,46 @@ fixjobc_kill(struct proc *p) > > p->p_treeflag |= P_TREE_GRPEXITED; > > > > /* > > - * Check p's parent to see whether p qualifies its own process > > - * group; if so, adjust count for p's process group. > > + * Check if exiting p orphans its own group. > > */ > > - if (isjobproc(jobc_parent(p), pgrp)) > > - pgadjustjobc(pgrp, false); > > + pgrp = p->p_pgrp; > > + if (isjobproc(jobc_parent(p, NULL), pgrp)) { > > + PGRP_LOCK(pgrp); > > + if (pgrp_calc_jobc(pgrp) == 0) > > + orphanpg(pgrp); > > + PGRP_UNLOCK(pgrp); > > + } > > > > /* > > * Check this process' children to see whether they qualify > > - * their process groups after reparenting to reaper. If so, > > - * adjust counts for children's process groups. > > + * their process groups after reparenting to reaper. > > */ > > LIST_FOREACH(q, &p->p_children, p_sibling) { > > - if ((q->p_treeflag & P_TREE_ORPHANED) != 0) > > - continue; > > - fixjobc_kill_q(p, q, true); > > + pgrp = q->p_pgrp; > > + PGRP_LOCK(pgrp); > > + if (pgrp_calc_jobc(pgrp) == 0) { > > + /* > > + * We want to handle exactly the children that > > + * has p as realparent. Then, when calculating > > + * jobc_parent for children, we should ignore > > + * P_TREE_GRPEXITED flag already set on p. > > + */ > > + if (jobc_parent(q, p) == p && isjobproc(p, pgrp)) > > + orphanpg(pgrp); > > + } else > > + pgrp->pg_flags &= ~PGRP_ORPHANED; > > + PGRP_UNLOCK(pgrp); > > } > > - LIST_FOREACH(q, &p->p_orphans, p_orphan) > > - fixjobc_kill_q(p, q, true); > > - LIST_FOREACH(q, &p->p_children, p_sibling) { > > - if ((q->p_treeflag & P_TREE_ORPHANED) != 0) > > - continue; > > - fixjobc_kill_q(p, q, false); > > + LIST_FOREACH(q, &p->p_orphans, p_orphan) { > > + pgrp = q->p_pgrp; > > + PGRP_LOCK(pgrp); > > + if (pgrp_calc_jobc(pgrp) == 0) { > > + if (isjobproc(p, pgrp)) > > + orphanpg(pgrp); > > + } else > > + pgrp->pg_flags &= ~PGRP_ORPHANED; > > + PGRP_UNLOCK(pgrp); > > } > > - LIST_FOREACH(q, &p->p_orphans, p_orphan) > > - fixjobc_kill_q(p, q, false); > > - > > -#ifdef INVARIANTS > > - check_pgrp_jobc(pgrp); > > -#endif > > } > > > > void > > ... and I would imagine the proctree lock is sufficient for > isjobproc/jobc_parent as well- is there any reason we can't/shouldn't > just use atomic(9) for operations with pgrp->pg_flags and push all of > the extra lock acquisitions into the orphanpg() cases? It seems like > we could avoid taking any pgrp lock in the common case and at least > mitigate that additional overhead from the exit() path. My reading is that you are proposing change pgrp locks into proctree_lock. Flag (PGRP_ORPHANED) needs to be consistent with the group membership, so it cannot be atomic, it would need to be under the same lock. If this works, I do not object, but I did not looked carefully. The most obvious problem could be that it is not easy to take sx proctree_lock in some places that only need pgrp mutex right now.