git: e8955bd643ee - main - pipe: Use a distinct wait channel for I/O serialization

From: Mark Johnston <markj_at_FreeBSD.org>
Date: Tue, 14 Jun 2022 16:01:10 UTC
The branch main has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=e8955bd643ee852d70a0b065f2a0d1bb3fa99df2

commit e8955bd643ee852d70a0b065f2a0d1bb3fa99df2
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2022-06-14 14:52:03 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2022-06-14 16:00:59 +0000

    pipe: Use a distinct wait channel for I/O serialization
    
    Suppose a thread tries to read from an empty pipe.  pipe_read() does the
    following:
    
    1. pipelock(), possibly sleeping
    2. check for buffered data
    3. pipeunlock()
    4. set PIPE_WANTR and sleep
    5. goto 1
    
    pipelock() is an open-coded mutex; if a thread blocks in pipelock(), it
    sleeps until the lock holder calls pipeunlock().
    
    Both sleeps use the same wait channel.  So if there are multiple threads
    in pipe_read(), a thread T1 in step 3 can wake up a thread T2 sleeping
    in step 4.  Then T1 goes to sleep in step 4, and T2 acquires and
    releases the pipelock, waking up T1 again.  This can go on indefinitely,
    livelocking the process (and potentially starving a would-be writer).
    
    Fix the problem by using a separate wait channel for pipelock().
    
    Reported by:    Paul Floyd <paulf2718@gmail.com>
    Reviewed by:    mjg, kib
    PR:             264441
    MFC after:      1 month
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D35415
---
 sys/kern/sys_pipe.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
index 32ed17c0fe23..ed93a2b7f094 100644
--- a/sys/kern/sys_pipe.c
+++ b/sys/kern/sys_pipe.c
@@ -651,8 +651,8 @@ pipelock(struct pipe *cpipe, int catch)
 		    ("%s: bad waiter count %d", __func__,
 		    cpipe->pipe_waiters));
 		cpipe->pipe_waiters++;
-		error = msleep(cpipe, PIPE_MTX(cpipe),
-		    prio, "pipelk", 0);
+		error = msleep(&cpipe->pipe_waiters, PIPE_MTX(cpipe), prio,
+		    "pipelk", 0);
 		cpipe->pipe_waiters--;
 		if (error != 0)
 			return (error);
@@ -675,9 +675,8 @@ pipeunlock(struct pipe *cpipe)
 	    ("%s: bad waiter count %d", __func__,
 	    cpipe->pipe_waiters));
 	cpipe->pipe_state &= ~PIPE_LOCKFL;
-	if (cpipe->pipe_waiters > 0) {
-		wakeup_one(cpipe);
-	}
+	if (cpipe->pipe_waiters > 0)
+		wakeup_one(&cpipe->pipe_waiters);
 }
 
 void