Re: AF_UNIX socketpair dgram queue sizes
- In reply to: Jan Schaumann via freebsd-net : "Re: AF_UNIX socketpair dgram queue sizes"
Date: Wed, 10 Nov 2021 15:53:29 UTC
On Wed, Nov 10, 2021 at 12:05:33AM -0500, Jan Schaumann via freebsd-net wrote:
> Mark Johnston <markj@freebsd.org> wrote:
>
> > There is an additional factor: wasted space.  When writing data to a
> > socket, the kernel buffers that data in mbufs.  All mbufs have some
> > amount of embedded storage, and the kernel accounts for that storage,
> > whether or not it's used.  With small byte datagrams there can be a lot
> > of overhead;
>
> I'm observing two mbufs being allocated for each
> datagram for small datagrams, but only one mbuf for
> larger datagrams.
>
> That seems counter-intuitive to me?

From my reading, sbappendaddr_locked_internal() will always allocate an
extra mbuf for the address, so I can't explain this.  What's the
threshold for "larger"?  How are you counting mbuf allocations?

> > The kern.ipc.sockbuf_waste_factor sysctl controls the upper limit on
> > total bytes (used or not) that may be enqueued in a socket buffer.  The
> > default value of 8 means that we'll waste up to 7 bytes per byte of
> > data, I think.  Setting it higher should let you enqueue more messages.
>
> Ah, this looks like something relevant.
>
> Setting kern.ipc.sockbuf_waste_factor=1, I can only
> write 8 1-byte datagrams.  For any increase of the
> waste factor by one, I get another 8 1-byte datagrams,
> up until waste factor > 29, at which point we hit
> recvspace: 30 * 8 = 240, so 240 1-byte datagrams with
> 16 bytes dgram overhead means we get 240*17 = 4080
> bytes, which just fits (well, with room for one empty
> 16-byte dgram) into the recvspace = 4096.
>
> But I still don't get the direct relationship between
> the waste factor and the recvspace / buffer queue:
> with a waste_factor of 1 and a datagram with 1972
> bytes, I'm able to write one dgram with 1972 bytes +
> 1 dgram with 1520 bytes = 3492 bytes (plus 2 * 16
> bytes overhead = 3524 bytes).  There'd still have been
> space for 572 more bytes in the second dgram.

For a datagram of size 1972, we'll allocate one mbuf (256 bytes) and
one mbuf "cluster" (2048 bytes), and then a second 256-byte mbuf for
the address.  So sb_mbcnt will be 2560 bytes, leaving 1536 bytes of
space for a second datagram.

> Likewise, trying to write a single 1973 dgram fills
> the queue and no additional bytes can be written in a
> second dgram, but I can write a single 2048 byte
> dgram.

I suspect that this bit of the unix socket code might be related:
https://cgit.freebsd.org/src/tree/sys/kern/uipc_usrreq.c#n1144

Here we get the amount of space available in the recv buffer (sbcc) and
compare it with the data limit in the _send_ buffer to determine
whether to apply backpressure.  You wrote "SO_SNDBUF = 2048" in your
first email, and if that's the case here then writing ~2000 bytes would
cause the limit to be hit.  I'm not sure why 1973 is the magic value
here.

> Still confused...
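
For anyone who wants to reproduce the experiment described above, here is a
minimal userspace sketch (an illustration of the kind of test being discussed,
not the actual program used in this thread): it fills one side of an AF_UNIX
SOCK_DGRAM socketpair with fixed-size datagrams and reports how many the
kernel accepts before refusing the next send.  Re-running it under different
kern.ipc.sockbuf_waste_factor and net.local.dgram.recvspace settings should
show the counts discussed above.

```c
/*
 * Sketch: queue fixed-size datagrams on an AF_UNIX SOCK_DGRAM socketpair
 * until the kernel applies backpressure, then report how many fit.
 */
#include <sys/socket.h>

#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	size_t dgramsz;
	char *buf;
	int fds[2], n;

	/* Datagram size is the first argument; defaults to 1 byte. */
	dgramsz = (argc > 1) ? (size_t)strtoul(argv[1], NULL, 10) : 1;
	if ((buf = calloc(1, dgramsz)) == NULL)
		err(1, "calloc");

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) != 0)
		err(1, "socketpair");

	/* Non-blocking sender, so a full receive buffer returns an error. */
	if (fcntl(fds[0], F_SETFL, O_NONBLOCK) != 0)
		err(1, "fcntl");

	/* Queue datagrams until the kernel refuses to accept more. */
	for (n = 0; send(fds[0], buf, dgramsz, 0) == (ssize_t)dgramsz; n++)
		;
	printf("queued %d datagrams of %zu bytes before send failed: %s\n",
	    n, dgramsz, strerror(errno));

	close(fds[0]);
	close(fds[1]);
	return (0);
}
```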
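
And here is a back-of-the-envelope model of the accounting that seems to
reproduce the numbers above.  It mimics my understanding of sbspace(): the
available space is the smaller of the data-byte room (hiwat - cc) and the
mbuf-storage room (mbmax - mbcnt), with mbmax being waste_factor * hiwat,
and each datagram is charged a 256-byte mbuf for the address plus a 256-byte
mbuf (and a 2048-byte cluster for larger payloads) for the data.  The
constants and the cluster threshold here are assumptions for illustration,
not copied from the kernel.

```c
/*
 * Rough model (not kernel code) of socket-buffer accounting for AF_UNIX
 * datagrams: count how many datagrams of a given payload size fit before
 * either the data-byte limit or the mbuf-storage limit is reached.
 */
#include <stdio.h>

#define MSIZE		256L	/* assumed bytes charged per mbuf */
#define MCLBYTES	2048L	/* assumed bytes charged per mbuf cluster */
#define ADDRLEN		16L	/* assumed sockaddr bytes per datagram */
#define CLUSTER_THRESH	MSIZE	/* roughly MHLEN; close enough for this model */

int
main(void)
{
	long hiwat = 4096;	/* net.local.dgram.recvspace */
	long waste = 1;		/* kern.ipc.sockbuf_waste_factor */
	long payload = 1;	/* datagram size being tested */
	long mbmax = waste * hiwat;
	long cc = 0, mbcnt = 0, n = 0;

	for (;;) {
		/* sbspace(): smaller of data-byte room and mbuf-byte room. */
		long space = hiwat - cc;

		if (mbmax - mbcnt < space)
			space = mbmax - mbcnt;
		if (payload + ADDRLEN > space)
			break;
		cc += payload + ADDRLEN;
		mbcnt += 2 * MSIZE + (payload > CLUSTER_THRESH ? MCLBYTES : 0);
		n++;
	}
	printf("%ld datagrams of %ld bytes fit (cc %ld, mbcnt %ld)\n",
	    n, payload, cc, mbcnt);
	return (0);
}
```

With the defaults above it reports 8 one-byte datagrams; with waste = 30 it
reports 240; and with payload = 1972 it stops after one datagram with 1536
bytes of space left, which matches the observed 1520-byte second datagram
plus its 16-byte address.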