kern/185813: SOCK_SEQPACKET AF_UNIX sockets with asymmetrical buffers drop packets
Alan Somers
asomers at freebsd.org
Fri Jan 24 00:31:07 UTC 2014
On Thu, Jan 23, 2014 at 11:26 AM, Adrian Chadd <adrian at freebsd.org> wrote:
> Well, shouldn't we fix the API/code so it doesn't drop packets,
> regardless of the sensibility or non-sensibility of different
> transmit/receive buffer sizes?
That would be nice, but it may be beyond my ability to do so. The
relevant code is very complicated, and most of it is in
domain-agnostic code where we can't introduce AF_UNIX specific special
cases.
It may be possible to change the single-buffer optimization to use the
receiving sockbuf's size for space calculations in uipc_send() instead
of the transmitting sockbuf's size. I could try to do that, though it
may cause existing programs to fail if they're depending on
setsockopt(s, SOL_SOCKET, SO_SNDBUF, ...) to have an effect.
-Alan
>
>
> -a
>
>
> On 23 January 2014 10:02, Alan Somers <asomers at freebsd.org> wrote:
>> There is a buffer space calculation bug in the send path for
>> SOCK_SEQPACKET AF_UNIX sockets. The result is that, if the sending
>> and receiving buffer sizes are different, the kernel will drop
>> messages and leak mbufs. A more detailed description is available in
>> the PR.
>>
>> The labyrinthine nature of the networking code makes it difficult to
>> directly fix the space calculation. It's especially hard due to the
>> optimization that AF_UNIX sockets have only a single socket buffer.
>> As implemented, they store data in the receiving sockbuf, but use the
>> transmitting sockbuf for space calculations. That's even true of
>> SOCK_STREAM sockets. They only work due to an accident; they don't
>> end up doing the same space calculation that trips up SOCK_SEQPACKET
>> sockets.
>>
>> Instead, I propose modifying the kernel to force an AF_UNIX socket
>> pair's buffers to always have the same size. That is, if you call
>> setsockopt(s, SOL_SOCKET, SO_SNDBUF, whatever, whatever), the kernel
>> will adjust both s's send buffer and the connected socket's receive
>> buffer. This solution also solves another annoying problem: currently
>> there is no way for a program to effectively change the size of its
>> receiving buffers. If you call setsockopt(s, SOL_SOCKET, SO_RCVBUF,
>> whatever, whatever) on an AF_UNIX socket, it will have no effect on
>> how packets are actually handled.
>>
>> The attached patch implements my suggestion for setsockopt. It's
>> obviously not perfect; it doesn't handle the case where you call
>> setsockopt() before connect() and it introduces an unfortunate
>> #include, but it's a working proof of concept. With this patch, the
>> recently added ATF test case
>> sys/kern/unix_seqpacket_test:pipe_simulator_128k_8k passes. Does this
>> look like the correct approach?
>>
>>
>> Index: uipc_socket.c
>> ===================================================================
>> --- uipc_socket.c (revision 261055)
>> +++ uipc_socket.c (working copy)
>> @@ -133,6 +133,8 @@
>> #include <sys/sx.h>
>> #include <sys/sysctl.h>
>> #include <sys/uio.h>
>> +#include <sys/un.h>
>> +#include <sys/unpcb.h>
>> #include <sys/jail.h>
>> #include <sys/syslog.h>
>> #include <netinet/in.h>
>> @@ -2382,6 +2384,8 @@
>> int
>> sosetopt(struct socket *so, struct sockopt *sopt)
>> {
>> + struct socket* so2;
>> + struct unpcb *unpcb, *unpcb2;
>> int error, optval;
>> struct linger l;
>> struct timeval tv;
>> @@ -2503,6 +2507,32 @@
>> }
>> (sopt->sopt_name == SO_SNDBUF ? &so->so_snd :
>> &so->so_rcv)->sb_flags &= ~SB_AUTOSIZE;
>> + if (so->so_proto->pr_domain->dom_family !=
>> + PF_LOCAL ||
>> + so->so_type != SOCK_SEQPACKET)
>> + break;
>> + /*
>> + * For unix domain seqpacket sockets, we set the
>> + * bufsize on both ends of the socket. PR
>> + * kern/185813
>> + */
>> + unpcb = (struct unpcb*)(so->so_pcb);
>> + if (NULL == unpcb)
>> + break; /* Shouldn't ever happen */
>> + unpcb2 = unpcb->unp_conn;
>> + if (NULL == unpcb2)
>> + break; /* For unconnected sockets */
>> + so2 = unpcb2->unp_socket;
>> + if (NULL == so2)
>> + break; /* Shouldn't ever happen? */
>> + if (sbreserve(sopt->sopt_name == SO_SNDBUF ?
>> + &so2->so_rcv : &so2->so_snd, (u_long)optval,
>> + so, curthread) == 0) {
>> + error = ENOBUFS;
>> + goto bad;
>> + }
>> + (sopt->sopt_name == SO_SNDBUF ? &so2->so_rcv :
>> + &so2->so_snd)->sb_flags &= ~SB_AUTOSIZE;
>> break;
>>
>> /*
>>
>>
>> -Alan
>> _______________________________________________
>> freebsd-net at freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
More information about the freebsd-net
mailing list