[PATCH 2/2] fork: plug a use after free of the returned process pointer

Thu Feb 4 09:35:20 UTC 2016

On Wed, Feb 03, 2016 at 04:13:29PM +0200, Konstantin Belousov wrote:
> On Wed, Feb 03, 2016 at 09:05:15AM +0100, Mateusz Guzik wrote:
> > > > CPU0 is executing fork1. p2 is not traced.
> > > > 
> > > > CPU0				CPU1
> > > > p2 and td2 created
> > > > td2 is marked runnable
> > > > 				td2 is scheduled here
> > > > 				td2 does not have TDB_STOPATFORK set
> > > > 				td2 exits
> > > > 				p2 is autoreaped
> > > > 				td2's space is reused
> > > > 				td2 gets linked into p3
> > > > 				td2 gets TDB_STOPATFORK
> > > > PROC_LOCK(p2);
> > > > TDB_STOPATFORK test on td2
> > > > cv_wait(&p2->p_dbgwait, ..);
> > > > 
> > > > So at this point p2 has no linked threads and is free. td2 belongs to
> > > > p3 and p2 is waiting for a wakeup which can't happen.
> I am convinced about this.  I thought that the fork_return() guarantee
> that the child cannot exit if TDB_STOPATFORK is set is enough, but the
> issue is other way around.
> 
> > > > 
> > > > Now that I look at it this may be broken in an additonal way, which is
> > > > not fixed by the patch: what if td2 spawns a new thread and thr_exits?
> > > > In this case testing td2 is still invalid.  Maybe I'm just getting
> > > > paranoid here, I don't have time to properly test this right now. In
> > > > worst case should be fixable well enough with FIRST_THREAD_IN_PROC.
> > > > 

I committed previous 2 patches.

Stuff below is just speculation.

So the remaining problem, after we know the process has to survive, is
survival of the thread and its relationship with the process.

The problem stems from not having the proc lock over the entire time
from the moment the thread is marked as runnable to the moment where the
code is done with it.

Race 1:

CPU0				CPU1
p1: p2 and td2 created
td2: marked runnable
				td2: scheduled here
				td2: does not have TDB_STOPATFORK set
				td2: calls thr_new
				td2: calls thr_exit
				td2: reused and linked into p3
				td2: gets TDB_STOPATFORK
p1: PROC_LOCK(p2);
p1: TDB_STOPATFORK test on td2
p1: cv_wait(&p2->p_dbgwait, ..);

p2 is the process we want, but td2 now belongs to a different thread.

Race 2:

However, seems to be even more buggy. To quote:

	while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
		cv_wait(&p2->p_dbgwait, &p2->p_mtx);

The check is done in a loop which drops the proc lock. This makes me
wonder about the following additional race:

p2 is traced, TDB_STOPATFORK is set on td2.

CPU0				CPU1
p1: PROC_LOCK(p2);
p1: TDB_STOPATFORK test on td2
p1: cv_wait(&p2->p_dbgwait, ..);
				td2: is scheduled here
				td2: clears TDB_STOPATFORK
				td2: cv_broadcast(&p2->p_dbgwait)
p1: not scheduled yet
				td2: calls thr_new
				td2: calls thr_exit
				td2: is reused and linked into p3
				td2: gets TDB_STOPATFORK
p1: scheduled here
p1: internal PROC_LOCK(p2); 
p1: TDB_STOPATFORK test on td2

But td2 now belongs to p3.

I think the patch below deals with race 1 just fine.

For race 2, it is unclear to me if the while loop is justified. If a
single 'if' statement was sufficient, there would be no problem since
unlock + lock would be avoided guaranteeting the consistency.

I was pondering borrowing fork_return's logic to check if tracing is
enabled before testing TDB_STOPATFORK. However, tracing state could have
changed several times invalidating the result. Maybe refreshing the
pointer to th first thread would do the trick, but imho the lock
dropping business is extremely fishy and will have to be dealt with at
some point.

I have other stuff I want to do before 11.0, so I may drop this for the
time being.

The patch for race 1:

diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c
index baee954..1fea651 100644
--- a/sys/kern/kern_fork.c
+++ b/sys/kern/kern_fork.c
@@ -717,11 +717,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
 	if ((fr->fr_flags & RFMEM) == 0 && dtrace_fasttrap_fork)
 		dtrace_fasttrap_fork(p1, p2);
 #endif
-	/*
-	 * Hold the process so that it cannot exit after we make it runnable,
-	 * but before we wait for the debugger.
-	 */
-	_PHOLD(p2);
 	if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED |
 	    P_FOLLOWFORK)) {
 		/*
@@ -758,6 +753,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
 		fdrop(fp_procdesc, td);
 	}
 
+	PROC_LOCK(p2);
 	if ((fr->fr_flags & RFSTOPPED) == 0) {
 		/*
 		 * If RFSTOPPED not requested, make child runnable and
@@ -773,13 +769,11 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
 		*fr->fr_procp = p2;
 	}
 
-	PROC_LOCK(p2);
 	/*
 	 * Wait until debugger is attached to child.
 	 */
 	while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
 		cv_wait(&p2->p_dbgwait, &p2->p_mtx);
-	_PRELE(p2);
 	racct_proc_fork_done(p2);
 	PROC_UNLOCK(p2);
 }

-- 
Mateusz Guzik <mjguzik gmail.com>