kern/185813: SOCK_SEQPACKET AF_UNIX sockets with asymmetrical buffers drop packets

Thu Jan 23 18:27:00 UTC 2014

Well, shouldn't we fix the API/code so it doesn't drop packets,
regardless of the sensibility or non-sensibility of different
transmit/receive buffer sizes?

-a

On 23 January 2014 10:02, Alan Somers <asomers at freebsd.org> wrote:
> There is a buffer space calculation bug in the send path for
> SOCK_SEQPACKET AF_UNIX sockets.  The result is that, if the sending
> and receiving buffer sizes are different, the kernel will drop
> messages and leak mbufs.  A more detailed description is available in
> the PR.
>
> The labyrinthine nature of the networking code makes it difficult to
> directly fix the space calculation.  It's especially hard due to the
> optimization that AF_UNIX sockets have only a single socket buffer.
> As implemented, they store data in the receiving sockbuf, but use the
> transmitting sockbuf for space calculations.  That's even true of
> SOCK_STREAM sockets.  They only work due to an accident; they don't
> end up doing the same space calculation that trips up SOCK_SEQPACKET
> sockets.
>
> Instead, I propose modifying the kernel to force an AF_UNIX socket
> pair's buffers to always have the same size.  That is, if you call
> setsockopt(s, SOL_SOCKET, SO_SNDBUF, whatever, whatever), the kernel
> will adjust both s's send buffer and the connected socket's receive
> buffer.  This solution also solves another annoying problem: currently
> there is no way for a program to effectively change the size of its
> receiving buffers.  If you call setsockopt(s, SOL_SOCKET, SO_RCVBUF,
> whatever, whatever) on an AF_UNIX socket, it will have no effect on
> how packets are actually handled.
>
> The attached patch implements my suggestion for setsockopt.  It's
> obviously not perfect; it doesn't handle the case where you call
> setsockopt() before connect() and it introduces an unfortunate
> #include, but it's a working proof of concept.  With this patch, the
> recently added ATF test case
> sys/kern/unix_seqpacket_test:pipe_simulator_128k_8k passes.  Does this
> look like the correct approach?
>
>
> Index: uipc_socket.c
> ===================================================================
> --- uipc_socket.c    (revision 261055)
> +++ uipc_socket.c    (working copy)
> @@ -133,6 +133,8 @@
>  #include <sys/sx.h>
>  #include <sys/sysctl.h>
>  #include <sys/uio.h>
> +#include <sys/un.h>
> +#include <sys/unpcb.h>
>  #include <sys/jail.h>
>  #include <sys/syslog.h>
>  #include <netinet/in.h>
> @@ -2382,6 +2384,8 @@
>  int
>  sosetopt(struct socket *so, struct sockopt *sopt)
>  {
> +    struct socket* so2;
> +    struct unpcb *unpcb, *unpcb2;
>      int    error, optval;
>      struct    linger l;
>      struct    timeval tv;
> @@ -2503,6 +2507,32 @@
>                  }
>                  (sopt->sopt_name == SO_SNDBUF ? &so->so_snd :
>                      &so->so_rcv)->sb_flags &= ~SB_AUTOSIZE;
> +                if (so->so_proto->pr_domain->dom_family !=
> +                        PF_LOCAL ||
> +                    so->so_type != SOCK_SEQPACKET)
> +                    break;
> +                /*
> +                 * For unix domain seqpacket sockets, we set the
> +                 * bufsize on both ends of the socket.  PR
> +                 * kern/185813
> +                 */
> +                unpcb = (struct unpcb*)(so->so_pcb);
> +                if (NULL == unpcb)
> +                    break;    /* Shouldn't ever happen */
> +                unpcb2 = unpcb->unp_conn;
> +                if (NULL == unpcb2)
> +                    break;    /* For unconnected sockets */
> +                so2 = unpcb2->unp_socket;
> +                if (NULL == so2)
> +                    break;    /* Shouldn't ever happen? */
> +                if (sbreserve(sopt->sopt_name == SO_SNDBUF ?
> +                    &so2->so_rcv : &so2->so_snd, (u_long)optval,
> +                    so, curthread) == 0) {
> +                    error = ENOBUFS;
> +                    goto bad;
> +                }
> +                (sopt->sopt_name == SO_SNDBUF ? &so2->so_rcv :
> +                    &so2->so_snd)->sb_flags &= ~SB_AUTOSIZE;
>                  break;
>
>              /*
>
>
> -Alan
> _______________________________________________
> freebsd-net at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"