iSCSI failing, MLX rx_ring errors ?
Ben RUBSON
ben.rubson at gmail.com
Wed Jan 4 13:47:50 UTC 2017
> On 03 Jan 2017, at 07:27, Meny Yossefi <menyy at mellanox.com> wrote:
>
>> From: owner-freebsd-net at freebsd.org On Behalf Of Ben RUBSON
>> Sent: Monday, January 2, 2017 11:09:15 AM (UTC+00:00) Monrovia, Reykjavik
>> To: freebsd-net at freebsd.org
>> Cc: Meny Yossefi; Yuval Bason; Hans Petter Selasky
>> Subject: Re: iSCSI failing, MLX rx_ring errors ?
>>
>> Hi Meny,
>>
>> Thank you very much for your feedback.
>>
>> I think you are right, this could be a mbufs issue.
>> Here are some more numbers :
>>
>> # vmstat -z | grep -v "0, 0$"
>> ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
>> 4 Bucket: 32, 0, 2673, 28327, 88449799, 17317, 0
>> 8 Bucket: 64, 0, 449, 15609, 13926386, 4871, 0
>> 12 Bucket: 96, 0, 335, 5323, 10293892, 142872, 0
>> 16 Bucket: 128, 0, 533, 6070, 7618615, 472647, 0
>> 32 Bucket: 256, 0, 8317, 22133, 36020376, 563479, 0
>> 64 Bucket: 512, 0, 1238, 3298, 20138111, 11430742, 0
>> 128 Bucket: 1024, 0, 1865, 2963, 21162182, 158752, 0
>> 256 Bucket: 2048, 0, 1626, 450, 80253784, 4890164, 0
>> mbuf_jumbo_9k: 9216, 603712, 16400, 8744, 4128521064, 2661, 0
>>
>> # netstat -m
>> 32801/18814/51615 mbufs in use (current/cache/total)
>> 16400/9810/26210/4075058 mbuf clusters in use (current/cache/total/max)
>> 16400/9659 mbuf+clusters out of packet secondary zone in use (current/cache)
>> 0/8647/8647/2037529 4k (page size) jumbo clusters in use (current/cache/total/max)
>> 16400/8744/25144/603712 9k jumbo clusters in use (current/cache/total/max)
>> 0/0/0/339588 16k jumbo clusters in use (current/cache/total/max)
>> 188600K/137607K/326207K bytes allocated to network (current/cache/total)
>> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
>> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
>> 0/2661/0 requests for jumbo clusters denied (4k/9k/16k)
>> 0 sendfile syscalls
>> 0 sendfile syscalls completed without I/O request
>> 0 requests for I/O initiated by sendfile
>> 0 pages read by sendfile as part of a request
>> 0 pages were valid at time of a sendfile request
>> 0 pages were requested for read ahead by applications
>> 0 pages were read ahead by sendfile
>> 0 times sendfile encountered an already busy page
>> 0 requests for sfbufs denied
>> 0 requests for sfbufs delayed
>>
>> I did not perform any mbufs tuning, numbers above are from FreeBSD itself.
>>
>> This server has 64GB of memory.
>> It has a ZFS pool for which I limit ARC memory impact with :
>> vfs.zfs.arc_max=64424509440 #60G
>>
>> The only thing I did is some TCP tuning to improve throughput over high-latency long-distance private links :
>> kern.ipc.maxsockbuf=7372800
>> net.inet.tcp.sendbuf_max=6553600
>> net.inet.tcp.recvbuf_max=6553600
>> net.inet.tcp.sendspace=65536
>> net.inet.tcp.recvspace=65536
>> net.inet.tcp.sendbuf_inc=65536
>> net.inet.tcp.recvbuf_inc=65536
>> net.inet.tcp.cc.algorithm=htcp
>>
>> Here are some graphs of memory & ARC usage when issue occurs.
>> Crosshair (vertical red line) is at the timestamp where I get iSCSI disconnections.
>> https://postimg.org/gallery/1kkekrc4e/
>> What is strange is that each time issue occurs there is around 1GB of free memory.
>> So FreeBSD should still be able to allocate some more mbufs ?
>> Unfortunately I do not have graphs about mbufs.
>>
>> What should I ideally do ?
>
> Have you tried increasing the mbufs limit?
> (sysctl) kern.ipc.nmbufs (Maximum number of mbufs allowed)

Thank you for your suggestion, Meny.
No, I have not tried this yet.
However, from the numbers above (and below), I think I should increase kern.ipc.nmbjumbo9 instead ?

# vmstat -z | grep -E "ITEM|mbuf"
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
mbuf_packet: 256, 26080395, 16400, 10418, 572292683, 0, 0
mbuf: 256, 26080395, 16402, 40525, 20955366061, 0, 0
mbuf_cluster: 2048, 4075060, 26818, 148, 907005, 0, 0
mbuf_jumbo_page: 4096, 2037529, 0, 34262, 5194563127, 0, 0
mbuf_jumbo_9k: 9216, 603712, 16400, 12867, 4362104082, 2676, 0
mbuf_jumbo_16k: 16384, 339588, 0, 0, 0, 0, 0
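
To make the reading of these counters explicit, here is a minimal Python sketch that picks out the zones with a non-zero FAIL column. The sample text is copied from the vmstat -z output above; the function names (parse_vmstat_z, failing_zones) are hypothetical helpers, not part of any FreeBSD tool, and the parser only assumes the "name: size, limit, used, free, req, fail, sleep" layout shown here.

```python
# Sample lines copied from the vmstat -z output in this message.
VMSTAT_Z = """\
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
mbuf_packet: 256, 26080395, 16400, 10418, 572292683, 0, 0
mbuf: 256, 26080395, 16402, 40525, 20955366061, 0, 0
mbuf_cluster: 2048, 4075060, 26818, 148, 907005, 0, 0
mbuf_jumbo_page: 4096, 2037529, 0, 34262, 5194563127, 0, 0
mbuf_jumbo_9k: 9216, 603712, 16400, 12867, 4362104082, 2676, 0
mbuf_jumbo_16k: 16384, 339588, 0, 0, 0, 0, 0
"""

# Column order assumed from the header line above.
FIELDS = ("size", "limit", "used", "free", "req", "fail", "sleep")


def parse_vmstat_z(text):
    """Return {zone_name: {field: int}} for every well-formed zone line."""
    zones = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header or blank line: no "name:" prefix
        values = [v.strip() for v in rest.split(",")]
        if len(values) != len(FIELDS) or not all(v.isdigit() for v in values):
            continue  # skip anything that does not match the assumed layout
        zones[name.strip()] = dict(zip(FIELDS, map(int, values)))
    return zones


def failing_zones(zones):
    """Return {zone_name: fail_count} for zones with at least one failed request."""
    return {name: z["fail"] for name, z in zones.items() if z["fail"] > 0}


zones = parse_vmstat_z(VMSTAT_Z)
failing = failing_zones(zones)
print(failing)
```

On this sample only mbuf_jumbo_9k is reported, which is the zone kern.ipc.nmbjumbo9 relates to. The parsed USED and LIMIT fields can also be compared directly (zones["mbuf_jumbo_9k"]["used"] against zones["mbuf_jumbo_9k"]["limit"]) to see how close the zone is to its limit in this snapshot.
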
# sysctl kern.ipc | grep mb
kern.ipc.nmbufs: 26080380
kern.ipc.nmbclusters: 4075058
kern.ipc.nmbjumbop: 2037529
kern.ipc.nmbjumbo9: 1811136
kern.ipc.nmbjumbo16: 1358352
kern.ipc.maxmbufmem: 33382879232
// Note that I don't understand the difference between the vmstat and sysctl values for nmbjumbo9 / nmbjumbo16 :
// the first one is divided by 3 (1811136/3=603712), the second one by 4 (1358352/4=339588). Strange.
// Interesting post from Robert Watson which helped me understand some of these numbers :
// https://lists.freebsd.org/pipermail/freebsd-net/2008-December/020350.html
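
The /3 and /4 ratios noted above can be checked against one simple piece of arithmetic: the number of memory pages a single cluster of each size occupies. The Python sketch below does only that. It assumes 4096-byte pages (consistent with the "4k (page size)" line in netstat -m), and the function name limit_ratios is a hypothetical helper. That the two numbers coincide is an observation, not a confirmed explanation of how the kernel derives the vmstat limit.

```python
import math

# Assumption: 4 KiB pages, as suggested by "4k (page size) jumbo clusters" in netstat -m.
PAGE_SIZE = 4096

# Values copied from the sysctl and vmstat -z output in this message.
SYSCTL_LIMITS = {"mbuf_jumbo_9k": 1811136, "mbuf_jumbo_16k": 1358352}
VMSTAT_LIMITS = {"mbuf_jumbo_9k": 603712, "mbuf_jumbo_16k": 339588}
CLUSTER_SIZES = {"mbuf_jumbo_9k": 9216, "mbuf_jumbo_16k": 16384}


def limit_ratios():
    """Return {zone: (sysctl/vmstat ratio, pages needed per cluster)}."""
    result = {}
    for zone, sysctl_limit in SYSCTL_LIMITS.items():
        ratio = sysctl_limit / VMSTAT_LIMITS[zone]
        pages = math.ceil(CLUSTER_SIZES[zone] / PAGE_SIZE)
        result[zone] = (ratio, pages)
    return result


for zone, (ratio, pages) in limit_ratios().items():
    print(f"{zone}: sysctl/vmstat = {ratio}, pages per cluster = {pages}")
```

For both zones the ratio equals the page count (3 for 9k clusters, 4 for 16k clusters), so the two tools may simply be counting in different units. This is only a consistency check on the numbers posted here.
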