Date: Fri, 24 Jun 2022 16:10:35 GMT
Message-Id: <202206241610.25OGAZAm006216@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Gleb Smirnoff
Subject: git: a4fc41423f7d - main - sockets: enable protocol specific socket buffers
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@freebsd.org
X-Git-Committer: glebius
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: a4fc41423f7d6e43287822212f0e9db7aab83d39
Auto-Submitted: auto-generated

The branch main has been updated by glebius:

URL: https://cgit.FreeBSD.org/src/commit/?id=a4fc41423f7d6e43287822212f0e9db7aab83d39

commit a4fc41423f7d6e43287822212f0e9db7aab83d39
Author:     Gleb Smirnoff
AuthorDate: 2022-06-24 16:09:10 +0000
Commit:     Gleb Smirnoff
CommitDate: 2022-06-24 16:09:10 +0000

    sockets: enable protocol specific socket buffers

    Split struct sockbuf into common shared fields and protocol specific
    union, where protocols are free to implement whatever buffer they
    want.  Such protocols should mark themselves with PR_SOCKBUF and are
    expected to initialize their buffers in their pr_attach and tear
    them down in pr_detach.

    Reviewed by:            markj
    Differential revision:  https://reviews.freebsd.org/D35299
---
 sys/kern/uipc_socket.c | 12 +++++--
 sys/sys/protosw.h      |  3 ++
 sys/sys/sockbuf.h      | 86 ++++++++++++++++++++++++++++++++------------------
 3 files changed, 67 insertions(+), 34 deletions(-)

diff --git a/sys/kern/uipc_socket.c b/sys/kern/uipc_socket.c
index a2241464e35b..ffaf5acdd05d 100644
--- a/sys/kern/uipc_socket.c
+++ b/sys/kern/uipc_socket.c
@@ -418,8 +418,6 @@ soalloc(struct vnet *vnet)
 	 * a feature to change class of an existing lock, so we use DUPOK.
 	 */
 	mtx_init(&so->so_lock, "socket", NULL, MTX_DEF | MTX_DUPOK);
-	so->so_snd.sb_mtx = &so->so_snd_mtx;
-	so->so_rcv.sb_mtx = &so->so_rcv_mtx;
 	mtx_init(&so->so_snd_mtx, "so_snd", NULL, MTX_DEF);
 	mtx_init(&so->so_rcv_mtx, "so_rcv", NULL, MTX_DEF);
 	so->so_rcv.sb_sel = &so->so_rdsel;
@@ -557,6 +555,10 @@ socreate(int dom, struct socket **aso, int type, int proto,
 	    so_rdknl_assert_lock);
 	knlist_init(&so->so_wrsel.si_note, so, so_wrknl_lock, so_wrknl_unlock,
 	    so_wrknl_assert_lock);
+	if ((prp->pr_flags & PR_SOCKBUF) == 0) {
+		so->so_snd.sb_mtx = &so->so_snd_mtx;
+		so->so_rcv.sb_mtx = &so->so_rcv_mtx;
+	}
 	/*
 	 * Auto-sizing of socket buffers is managed by the protocols and
 	 * the appropriate flags must be set in the pru_attach function.
@@ -756,6 +758,10 @@ sonewconn(struct socket *head, int connstatus)
 		    __func__, head->so_pcb);
 		return (NULL);
 	}
+	if ((so->so_proto->pr_flags & PR_SOCKBUF) == 0) {
+		so->so_snd.sb_mtx = &so->so_snd_mtx;
+		so->so_rcv.sb_mtx = &so->so_rcv_mtx;
+	}
 	if ((*so->so_proto->pr_usrreqs->pru_attach)(so, 0, NULL)) {
 		sodealloc(so);
 		log(LOG_DEBUG, "%s: pcb %p: pru_attach() failed\n",
@@ -1207,7 +1213,7 @@ sofree(struct socket *so)
 	 * socket exist anywhere else in the stack.  Therefore, no locks need
 	 * to be acquired or held.
 	 */
-	if (!SOLISTENING(so)) {
+	if (!(pr->pr_flags & PR_SOCKBUF) && !SOLISTENING(so)) {
 		sbdestroy(so, SO_SND);
 		sbdestroy(so, SO_RCV);
 	}
diff --git a/sys/sys/protosw.h b/sys/sys/protosw.h
index dc550d42f1fd..26cd1bc3fc16 100644
--- a/sys/sys/protosw.h
+++ b/sys/sys/protosw.h
@@ -114,6 +114,8 @@ struct protosw {
  * and the protocol understands the MSG_EOF flag.  The first property is
  * is only relevant if PR_CONNREQUIRED is set (otherwise sendto is allowed
  * anyhow).
+ * PR_SOCKBUF requires protocol to initialize and destroy its socket buffers
+ * in its pr_attach and pr_detach.
  */
 #define	PR_ATOMIC	0x01		/* exchange atomic messages only */
 #define	PR_ADDR		0x02		/* addresses given with messages */
@@ -123,6 +125,7 @@ struct protosw {
 #define	PR_IMPLOPCL	0x20		/* implied open/close */
 #define	PR_LASTHDR	0x40		/* enforce ipsec policy; last header */
 #define	PR_CAPATTACH	0x80		/* socket can attach in cap mode */
+#define	PR_SOCKBUF	0x100		/* private implementation of buffers */
 
 /*
  * In earlier BSD network stacks, a single pr_usrreq() function pointer was
diff --git a/sys/sys/sockbuf.h b/sys/sys/sockbuf.h
index 31c351860a94..7800b2790c04 100644
--- a/sys/sys/sockbuf.h
+++ b/sys/sys/sockbuf.h
@@ -75,41 +75,65 @@ struct thread;
 struct selinfo;
 
 /*
- * Variables for socket buffering.
+ * Socket buffer
  *
- * Locking key to struct sockbuf:
- * (a) locked by SOCKBUF_LOCK().
+ * A buffer starts with the fields that are accessed by I/O multiplexing
+ * APIs like select(2), kevent(2) or AIO and thus are shared between different
+ * buffer implementations.  They are protected by the SOCK_RECVBUF_LOCK()
+ * or SOCK_SENDBUF_LOCK() of the owning socket.
+ *
+ * XXX: sb_acc, sb_ccc and sb_mbcnt shall become implementation specific
+ * methods.
+ *
+ * Protocol specific implementations follow in a union.
  */
 struct	sockbuf {
-	struct	mtx *sb_mtx;		/* sockbuf lock */
 	struct	selinfo *sb_sel;	/* process selecting read/write */
-	short	sb_state;	/* (a) socket state on sockbuf */
-	short	sb_flags;	/* (a) flags, see above */
-	struct	mbuf *sb_mb;	/* (a) the mbuf chain */
-	struct	mbuf *sb_mbtail; /* (a) the last mbuf in the chain */
-	struct	mbuf *sb_lastrecord;	/* (a) first mbuf of last
-					 * record in socket buffer */
-	struct	mbuf *sb_sndptr; /* (a) pointer into mbuf chain */
-	struct	mbuf *sb_fnrdy;	/* (a) pointer to first not ready buffer */
-	u_int	sb_sndptroff;	/* (a) byte offset of ptr into chain */
-	u_int	sb_acc;		/* (a) available chars in buffer */
-	u_int	sb_ccc;		/* (a) claimed chars in buffer */
-	u_int	sb_hiwat;	/* (a) max actual char count */
-	u_int	sb_mbcnt;	/* (a) chars of mbufs used */
-	u_int	sb_mbmax;	/* (a) max chars of mbufs to use */
-	u_int	sb_ctl;		/* (a) non-data chars in buffer */
-	u_int	sb_tlscc;	/* (a) TLS chain characters */
-	u_int	sb_tlsdcc;	/* (a) TLS characters being decrypted */
-	int	sb_lowat;	/* (a) low water mark */
-	sbintime_t	sb_timeo;	/* (a) timeout for read/write */
-	struct	mbuf *sb_mtls;	/* (a) TLS mbuf chain */
-	struct	mbuf *sb_mtlstail; /* (a) last mbuf in TLS chain */
-	int	(*sb_upcall)(struct socket *, void *, int); /* (a) */
-	void	*sb_upcallarg;	/* (a) */
-	uint64_t sb_tls_seqno;	/* (a) TLS seqno */
-	struct	ktls_session *sb_tls_info; /* (a + b) TLS state */
-	TAILQ_HEAD(, kaiocb) sb_aiojobq; /* (a) pending AIO ops */
-	struct	task sb_aiotask; /* AIO task */
+	short	sb_state;		/* socket state on sockbuf */
+	short	sb_flags;		/* flags, see above */
+	u_int	sb_acc;			/* available chars in buffer */
+	u_int	sb_ccc;			/* claimed chars in buffer */
+	u_int	sb_mbcnt;		/* chars of mbufs used */
+	u_int	sb_ctl;			/* non-data chars in buffer */
+	u_int	sb_hiwat;		/* max actual char count */
+	u_int	sb_lowat;		/* low water mark */
+	u_int	sb_mbmax;		/* max chars of mbufs to use */
+	sbintime_t sb_timeo;		/* timeout for read/write */
+	int	(*sb_upcall)(struct socket *, void *, int);
+	void	*sb_upcallarg;
+	TAILQ_HEAD(, kaiocb) sb_aiojobq;	/* pending AIO ops */
+	struct	task sb_aiotask;		/* AIO task */
+	union {
+		/*
+		 * Classic BSD one-size-fits-all socket buffer, capable of
+		 * doing streams and datagrams. The stream part is able
+		 * to perform special features:
+		 * - not ready data (sendfile)
+		 * - TLS
+		 */
+		struct {
+			/* compat: sockbuf lock pointer */
+			struct	mtx *sb_mtx;
+			/* first and last mbufs in the chain */
+			struct	mbuf *sb_mb;
+			struct	mbuf *sb_mbtail;
+			/* first mbuf of last record in socket buffer */
+			struct	mbuf *sb_lastrecord;
+			/* pointer to data to send next (TCP */
+			struct	mbuf *sb_sndptr;
+			/* pointer to first not ready buffer */
+			struct	mbuf *sb_fnrdy;
+			/* byte offset of ptr into chain, used with sb_sndptr */
+			u_int	sb_sndptroff;
+			/* TLS */
+			u_int	sb_tlscc;	/* TLS chain characters */
+			u_int	sb_tlsdcc;	/* characters being decrypted */
+			struct	mbuf *sb_mtls;	/* TLS mbuf chain */
+			struct	mbuf *sb_mtlstail; /* last mbuf in TLS chain */
+			uint64_t sb_tls_seqno;	/* TLS seqno */
+			struct	ktls_session *sb_tls_info; /* TLS state */
+		};
+	};
 };
 
 #endif /* defined(_KERNEL) || defined(_WANT_SOCKET) */
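
The layout this commit introduces (shared fields first, then a union of per-protocol layouts, with generic code skipping setup and teardown when PR_SOCKBUF is set) can be illustrated with a small userspace sketch. This is not kernel code and not part of the commit: every toy_ name, the dg member, and the helper functions are hypothetical and exist only for illustration. Only the flag value 0x100 and the shape of the two checks are taken from the diff above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PR_SOCKBUF	0x100	/* same value as PR_SOCKBUF in the diff */

struct toy_lock { int held; };	/* hypothetical stand-in for struct mtx */

/* Shared fields first, then a union of implementation-specific layouts. */
struct toy_sockbuf {
	unsigned int sb_hiwat;	/* common to every implementation */
	unsigned int sb_lowat;
	union {
		struct {	/* "classic" layout: lock pointer and chain */
			struct toy_lock *sb_mtx;
			void *sb_mb;
		};
		struct {	/* hypothetical private layout */
			unsigned int dg_count;
			unsigned int dg_bytes;
		} dg;
	};
};

struct toy_socket {
	int pr_flags;
	struct toy_lock so_snd_mtx;
	struct toy_lock so_rcv_mtx;
	struct toy_sockbuf so_snd;
	struct toy_sockbuf so_rcv;
};

/* Mirrors the check added to socreate() and sonewconn(). */
static void
toy_init_buffers(struct toy_socket *so)
{
	assert(so != NULL);
	if ((so->pr_flags & TOY_PR_SOCKBUF) == 0) {
		so->so_snd.sb_mtx = &so->so_snd_mtx;
		so->so_rcv.sb_mtx = &so->so_rcv_mtx;
	}
}

/* Returns 1 when generic code should destroy the buffers, as in sofree(). */
static int
toy_generic_teardown(const struct toy_socket *so, int listening)
{
	return (!(so->pr_flags & TOY_PR_SOCKBUF) && !listening);
}
```

In this sketch a socket without the flag gets its classic sb_mtx pointers wired up by generic code, while a socket with the flag is left untouched so that its protocol can set up the union member it owns.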