git: 574816356834 - main - socket: Fix a race in the SO_SPLICE state machine

From: Mark Johnston <markj_at_FreeBSD.org>
Date: Sun, 23 Mar 2025 11:59:56 UTC

The branch main has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=574816356834cb99295b124be0ec34bd9e0b9c72

commit 574816356834cb99295b124be0ec34bd9e0b9c72
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2025-03-23 11:55:56 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-03-23 11:55:56 +0000

    socket: Fix a race in the SO_SPLICE state machine
    
    When so_splice() links two sockets together, it first attaches the
    splice control structure to the source socket; at that point, the splice
    is in the idle state.  After that point, a socket wakeup will queue up
    work for a splice worker thread: in particular, so_splice_dispatch()
    only queues work if the splice is idle.
    
    Meanwhile, so_splice() continues initializing the splice, and finally
    calls so_splice_xfer() to transfer any already buffered data.  This
    assumes that the splice is still idle, but that's not true if some async
    work was already dispatched.
    
    Solve the problem by introducing an initial "under construction" state
    for the splice control structure, such that wakeups won't queue any work
    until so_splice() has finished.
    
    While here, remove an outdated comment from the beginning of
    so_splice_xfer().
    
    Reported by:    syzkaller
    Reviewed by:    gallatin
    Fixes:          a1da7dc1cdad ("socket: Implement SO_SPLICE")
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D49437
---
 sys/kern/uipc_socket.c | 7 +------
 sys/sys/socketvar.h    | 1 +
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/sys/kern/uipc_socket.c b/sys/kern/uipc_socket.c
index 03bfea721dd2..c27b007cafc6 100644
--- a/sys/kern/uipc_socket.c
+++ b/sys/kern/uipc_socket.c
@@ -592,11 +592,6 @@ so_splice_xfer_data(struct socket *so_src, struct socket *so_dst, off_t max,
 
 /*
  * Transfer data from the source to the sink.
- *
- * If "direct" is true, the transfer is done in the context of whichever thread
- * is operating on one of the socket buffers.  We do not know which locks are
- * held, so we can only trylock the socket buffers; if this fails, we fall back
- * to the worker thread, which invokes this routine with "direct" set to false.
  */
 static void
 so_splice_xfer(struct so_splice *sp)
@@ -1638,7 +1633,7 @@ so_splice_alloc(off_t max)
 		sp->wq_index = atomic_fetchadd_32(&splice_index, 1) %
 		    (mp_maxid + 1);
 	} while (CPU_ABSENT(sp->wq_index));
-	sp->state = SPLICE_IDLE;
+	sp->state = SPLICE_INIT;
 	TIMEOUT_TASK_INIT(taskqueue_thread, &sp->timeout, 0, so_splice_timeout,
 	    sp);
 	return (sp);
diff --git a/sys/sys/socketvar.h b/sys/sys/socketvar.h
index 735ff49062de..02d0ca139fa4 100644
--- a/sys/sys/socketvar.h
+++ b/sys/sys/socketvar.h
@@ -80,6 +80,7 @@ struct so_splice {
 	struct mtx mtx;
 	unsigned int wq_index;
 	enum so_splice_state {
+		SPLICE_INIT,	/* embryonic state, don't queue work yet */
 		SPLICE_IDLE,	/* waiting for work to arrive */
 		SPLICE_QUEUED,	/* a wakeup has queued some work */
 		SPLICE_RUNNING,	/* currently transferring data */
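
The gating described in the commit message can be illustrated with a small userspace model. This is only a sketch and not the kernel code: every model_* and MODEL_* name is hypothetical, the model is single-threaded and omits the locking the kernel uses, and the point at which the real code moves a splice out of the initial state is not visible in this diff, so model_finish_setup() is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; all model_* and MODEL_* names are hypothetical. */
enum model_state {
	MODEL_INIT,	/* under construction, wakeups must not queue work */
	MODEL_IDLE,	/* waiting for work to arrive */
	MODEL_QUEUED,	/* a wakeup has queued some work */
	MODEL_RUNNING,	/* currently transferring data */
};

struct model_splice {
	enum model_state state;
	int queued;	/* how many times work was handed to a worker */
};

/* Mirrors the allocation change in the diff: start in the initial state. */
static void
model_alloc(struct model_splice *sp)
{
	sp->state = MODEL_INIT;
	sp->queued = 0;
}

/*
 * Wakeup path: queue work only if the splice is idle.
 * Returns true if work was queued.
 */
static bool
model_dispatch(struct model_splice *sp)
{
	if (sp->state != MODEL_IDLE)
		return (false);
	sp->state = MODEL_QUEUED;
	sp->queued++;
	return (true);
}

/*
 * Assumed end of setup: only now may wakeups queue work.  Where the
 * kernel performs this transition is not shown in the diff above.
 */
static void
model_finish_setup(struct model_splice *sp)
{
	assert(sp->state == MODEL_INIT);
	sp->state = MODEL_IDLE;
}
```

In this model, a call to model_dispatch() made between model_alloc() and model_finish_setup() returns false and leaves the state untouched. Had model_alloc() set MODEL_IDLE instead, as the code did before this change, the same early wakeup would have queued work while setup was still in progress.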