Re: git: 12fb39ec3e6b - main - proc: Relax proc_rwmem()'s assertion on the process hold count

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Wed, 06 Apr 2022 14:38:04 UTC

On Wed, Apr 06, 2022 at 10:24:18AM -0400, Mark Johnston wrote:
> On Tue, Apr 05, 2022 at 02:03:14AM +0300, Konstantin Belousov wrote:
> > On Tue, Apr 05, 2022 at 02:01:04AM +0300, Konstantin Belousov wrote:
> > > On Tue, Mar 01, 2022 at 05:41:55PM +0000, Mark Johnston wrote:
> > > > The branch main has been updated by markj:
> > > > 
> > > > URL: https://cgit.FreeBSD.org/src/commit/?id=12fb39ec3e6bc529feff3ba2862c6a4a30bd54eb
> > > > 
> > > > commit 12fb39ec3e6bc529feff3ba2862c6a4a30bd54eb
> > > > Author:     Mark Johnston <markj@FreeBSD.org>
> > > > AuthorDate: 2022-03-01 16:48:39 +0000
> > > > Commit:     Mark Johnston <markj@FreeBSD.org>
> > > > CommitDate: 2022-03-01 17:40:35 +0000
> > > > 
> > > >     proc: Relax proc_rwmem()'s assertion on the process hold count
> > > >     
> > > >     This reference ensures that the process and its associated vmspace will
> > > >     not be destroyed while proc_rwmem() is executing.  If, however, the
> > > >     calling thread belongs to the target process, then it is unnecessary to
> > > >     hold the process.  In particular, fasttrap - a module which enables
> > > >     userspace dtrace - may frequently call proc_rwmem(), and we'd prefer to
> > > >     avoid the overhead of locking and bumping the hold count when possible.
> > > In fact I am not sure it makes much sense to disable swap out for a remote
> > > process as well.  With the current definition of p_hold, it only prevents
> > > reuse of kstack pages, which should be irrelevant for proc_rwmem().
> > 
> > What probably should be done is referencing the target process vmspace,
> > instead.
> 
> You mean, callers should use something like this, with the proc locked:
> 
> 	bool
> 	vmspace_reference_live(struct vmspace *vm)
> 	{
> 		return (refcount_acquire_if_not_zero(&vm->vm_refcnt));
> 	}
> 
> ?
Do you use the conditional acquire to prevent reviving a dying vmspace?
IMO it is an unneeded complication; the dance in vmspace_exit() should
be enough to avoid this, but I do not object.
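
For illustration, the conditional acquire can be modeled in userspace with
C11 atomics. This is only a sketch of the idea, not the kernel's refcount
implementation: struct model_vmspace, model_acquire_if_not_zero() and
model_release() are hypothetical names, and only the reference count of the
vmspace is modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct vmspace; only the count is modeled. */
struct model_vmspace {
	atomic_uint refcnt;
};

/*
 * Take a reference only if the count is still nonzero, so that a count
 * which has already dropped to zero is never revived.
 */
bool
model_acquire_if_not_zero(struct model_vmspace *vm)
{
	unsigned int old;

	old = atomic_load(&vm->refcnt);
	while (old != 0) {
		/* On failure, old is updated to the current value. */
		if (atomic_compare_exchange_weak(&vm->refcnt, &old, old + 1))
			return (true);
	}
	return (false);
}

/* Drop a reference; returns true if this was the last one. */
bool
model_release(struct model_vmspace *vm)
{
	unsigned int old;

	old = atomic_fetch_sub(&vm->refcnt, 1);
	assert(old > 0);
	return (old == 1);
}
```

An unconditional increment would bump the count from 0 back to 1, which is
the "reviving" case in question; the loop above refuses to do that.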

> 
> Yes, I think that'd be sufficient.  Though, in practice most callers of
> proc_rwmem() really do need the proc to be held for other purposes, so
> the existing assertion serves as a close enough approximation.
By "in practice", you mean that the typical caller comes through
kern_ptrace(), which ensures the hold, right?

> 
> We could maybe just relax the assertion in proc_rwmem() to
> 
> 	MPASS(refcount_load(&p->p_vmspace->vm_refcnt) > 0);
> 
> since this is implied by p_hold > 0.
Maybe.

I looked at all callers of proc_rwmem(), and I think that e.g. the
cuse use of PROC_HOLD is gratuitous. Either the target process must be
prevented from reuse by other means, or PROC_HOLD() does nothing and
effectively just hides a 'security' hole. If the process is already kept
from being recycled, then the only function of this hold is to add two
excess mtx_lock/unlock calls.
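
To make the cost concrete, here is a single-threaded userspace model of a
hold/release pair. It assumes that the hold and the release each take and
drop the process lock around the counter update, as the PHOLD()/PRELE()
macros do. All names (struct model_proc, model_proc_hold(),
model_proc_rele()) are hypothetical, and the "lock" is a plain flag that
only counts calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct proc. */
struct model_proc {
	bool locked;			/* models the process mutex */
	unsigned int hold;		/* models p_hold */
	unsigned int lock_calls;	/* lock operations seen so far */
	unsigned int unlock_calls;	/* unlock operations seen so far */
};

static void
model_lock(struct model_proc *p)
{
	assert(!p->locked);
	p->locked = true;
	p->lock_calls++;
}

static void
model_unlock(struct model_proc *p)
{
	assert(p->locked);
	p->locked = false;
	p->unlock_calls++;
}

/* Bump the hold count under the lock. */
void
model_proc_hold(struct model_proc *p)
{
	model_lock(p);
	p->hold++;
	model_unlock(p);
}

/* Drop the hold count under the lock. */
void
model_proc_rele(struct model_proc *p)
{
	model_lock(p);
	assert(p->hold > 0);
	p->hold--;
	model_unlock(p);
}
```

One hold followed by one release leaves the count where it started and costs
two lock and two unlock operations in this model.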

P.S. I do not 'dislike' your commit; my concern is rather that the
use of p_hold has completely changed from what it was used for when we
had a swappable process user area.

> 
> > > >     Thus, make the assertion conditional on "p != curproc".  Also assert
> > > >     that the process is not already exiting.  No functional change intended.
> > > >     
> > > >     MFC after:      2 weeks
> > > >     Sponsored by:   The FreeBSD Foundation
> > > > ---
> > > >  sys/kern/sys_process.c | 9 +++++----
> > > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/sys/kern/sys_process.c b/sys/kern/sys_process.c
> > > > index 582bff962f1a..8d8c5a1d34ff 100644
> > > > --- a/sys/kern/sys_process.c
> > > > +++ b/sys/kern/sys_process.c
> > > > @@ -336,11 +336,12 @@ proc_rwmem(struct proc *p, struct uio *uio)
> > > >  	int error, fault_flags, page_offset, writing;
> > > >  
> > > >  	/*
> > > > -	 * Assert that someone has locked this vmspace.  (Should be
> > > > -	 * curthread but we can't assert that.)  This keeps the process
> > > > -	 * from exiting out from under us until this operation completes.
> > > > +	 * Make sure that the process' vmspace remains live.
> > > >  	 */
> > > > -	PROC_ASSERT_HELD(p);
> > > > +	if (p != curproc)
> > > > +		PROC_ASSERT_HELD(p);
> > > > +	KASSERT((p->p_flag & P_WEXIT) == 0,
> > > > +	    ("%s: process %p is exiting", __func__, p));
> > > >  	PROC_LOCK_ASSERT(p, MA_NOTOWNED);
> > > >  
> > > >  	/*