Re: git: 69413598d266 - main - signal: use proc_iterate to save on work

From: Mateusz Guzik <mjguzik_at_gmail.com>
Date: Wed, 07 Sep 2022 09:36:03 UTC
On 9/7/22, Konstantin Belousov <kostikbel@gmail.com> wrote:
> On Mon, Sep 05, 2022 at 06:07:08PM +0200, Mateusz Guzik wrote:
>> On 9/5/22, Konstantin Belousov <kostikbel@gmail.com> wrote:
>> > On Mon, Sep 05, 2022 at 11:56:15AM +0000, Mateusz Guzik wrote:
>> >> The branch main has been updated by mjg:
>> >>
>> >> URL:
>> >> https://cgit.FreeBSD.org/src/commit/?id=69413598d2660054e29cac9454fe18c08e3bf36d
>> >>
>> >> commit 69413598d2660054e29cac9454fe18c08e3bf36d
>> >> Author:     Mateusz Guzik <mjg@FreeBSD.org>
>> >> AuthorDate: 2022-03-10 18:58:12 +0000
>> >> Commit:     Mateusz Guzik <mjg@FreeBSD.org>
>> >> CommitDate: 2022-09-05 11:54:47 +0000
>> >>
>> >>     signal: use proc_iterate to save on work
>> >>
>> >>     Most notably poudriere performs kill -9 -1 in jails for each port
>> >>     being built. This reduces the scan from hundrends of processes to
>> >>     literally 1.
>> >>
>> >>     Reviewed by:    jamie, markj
>> >>     Differential Revision:  https://reviews.freebsd.org/D34522
>> >> ---
>> >>  sys/kern/kern_sig.c | 39 ++++++++++++++++++++++++++++-----------
>> >>  1 file changed, 28 insertions(+), 11 deletions(-)
>> >>
>> >> diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c
>> >> index 6fd3eca0a14e..c50a37de07e6 100644
>> >> --- a/sys/kern/kern_sig.c
>> >> +++ b/sys/kern/kern_sig.c
>> >> @@ -1776,18 +1776,13 @@ struct killpg1_ctx {
>> >>  };
>> >>
>> >>  static void
>> >> -killpg1_sendsig(struct proc *p, bool notself, struct killpg1_ctx
>> >> *arg)
>> >> +killpg1_sendsig_locked(struct proc *p, struct killpg1_ctx *arg)
>> >>  {
>> >>  	int err;
>> >>
>> >> -	if (p->p_pid <= 1 || (p->p_flag & P_SYSTEM) != 0 ||
>> >> -	    (notself && p == arg->td->td_proc) || p->p_state == PRS_NEW)
>> >> -		return;
>> >> -	PROC_LOCK(p);
>> >>  	err = p_cansignal(arg->td, p, arg->sig);
>> >>  	if (err == 0 && arg->sig != 0)
>> >>  		pksignal(p, arg->sig, arg->ksi);
>> >> -	PROC_UNLOCK(p);
>> >>  	if (err != ESRCH)
>> >>  		arg->found = true;
>> >>  	if (err == 0)
>> >> @@ -1796,6 +1791,31 @@ killpg1_sendsig(struct proc *p, bool notself,
>> >> struct killpg1_ctx *arg)
>> >>  		arg->ret = err;
>> >>  }
>> >>
>> >> +static void
>> >> +killpg1_sendsig(struct proc *p, bool notself, struct killpg1_ctx
>> >> *arg)
>> >> +{
>> >> +
>> >> +	if (p->p_pid <= 1 || (p->p_flag & P_SYSTEM) != 0 ||
>> >> +	    (notself && p == arg->td->td_proc) || p->p_state == PRS_NEW)
>> >> +		return;
>> >> +
>> >> +	PROC_LOCK(p);
>> >> +	killpg1_sendsig_locked(p, arg);
>> >> +	PROC_UNLOCK(p);
>> >> +}
>> >> +
>> >> +static void
>> >> +kill_processes_prison_cb(struct proc *p, void *arg)
>> >> +{
>> >> +	struct killpg1_ctx *ctx = arg;
>> >> +
>> >> +	if (p->p_pid <= 1 || (p->p_flag & P_SYSTEM) != 0 ||
>> >> +	    (p == ctx->td->td_proc) || p->p_state == PRS_NEW)
>> > Extra ()
>> >
>> >> +		return;
>> >> +
>> >> +	killpg1_sendsig_locked(p, ctx);
>> >> +}
>> >> +
>> >>  /*
>> >>   * Common code for kill process group/broadcast kill.
>> >>   * cp is calling process.
>> >> @@ -1817,11 +1837,8 @@ killpg1(struct thread *td, int sig, int pgid,
>> >> int
>> >> all, ksiginfo_t *ksi)
>> >>  		/*
>> >>  		 * broadcast
>> >>  		 */
>> >> -		sx_slock(&allproc_lock);
>> >> -		FOREACH_PROC_IN_SYSTEM(p) {
>> >> -			killpg1_sendsig(p, true, &arg);
>> >> -		}
>> >> -		sx_sunlock(&allproc_lock);
>> >> +		prison_proc_iterate(td->td_ucred->cr_prison,
>> >> +		    kill_processes_prison_cb, &arg);
>> >>  	} else {
>> >>  		sx_slock(&proctree_lock);
>> >>  		if (pgid == 0) {
>> >
>> > I believe before your change, kill(-1) would kill all processes in the
>> > jail, which includes all processes in the nested jails.  Now, it seems
>> > that linkage prevents iterating over the nested jails, am I missing it?
>> >
>>
>> See the pr_childcount check. If any jails pop up, there is a full scan
>> like right now. And if the count transitions 0 -> 1 -> 0 during the
>> iteration of the loop, there is nobody left to signal.
> I see.
>
>>
>> All previously existing races remain unaffected.
> No, now you are potentially signalling some processes more than once.
> Sending the signal is not idempotent operation.
>

You have a point, I was pretty fixated on just kill -9 -1.

I'm going going to have to ponder a little bit, perhaps it is time to
finally fix the races.

-- 
Mateusz Guzik <mjguzik gmail.com>