Help with mbuf exhaustion

Josh Gitlin jgitlin at goboomtown.com
Thu Sep 28 19:30:13 UTC 2017


Hi FreeBSD Gurus!

We're having an issue with mbuf exhaustion on a FreeBSD server which was recently upgraded from 10.3-STABLE to 10.3-RELEASE-p2. In the course of normal operation, mbuf usage increases steadily until it reaches the kern.ipc.nmbufs limit, at which point the machine becomes unresponsive over the network (no mbufs are left for network access) and the console displays:

cxl0: Interface stopped DISTRIBUTING, possible flapping
cxl1: Interface stopped DISTRIBUTING, possible flapping
[zone: mbuf] kern.ipc.nmbufs limit reached
[zone: mbuf] kern.ipc.nmbufs limit reached

The machine runs pf and acts as a packet filter, router, gateway and DHCP/DNS server. It has two Chelsio NICs and is the CARP master, paired with a secondary. The secondary has an identical hardware and software configuration and does not exhibit this issue.

Given the downtime this causes, we set up our Nagios/Check_MK to graph the output of `netstat -m` and alert when the number of mbufs in use approaches `kern.ipc.nmbufs`. The graph shows a steady linear increase in mbuf usage until we reboot:

https://i.stack.imgur.com/8bzAq.png
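
Because the growth looks linear, the time left before the limit is hit can be estimated from a few graph samples. The following is only a rough sketch, not part of our monitoring: the function name `seconds_until_limit` and the sample values in the usage note are hypothetical, and it assumes the limit equals the mbuf zone LIMIT reported by `vmstat -z`.

```python
def seconds_until_limit(samples, limit):
    """Fit a least-squares line to (unix_time, mbufs_in_use) samples and
    return the seconds from the last sample until the line reaches `limit`.
    Returns None if usage is flat or shrinking."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must have distinct timestamps")
    # Slope is in mbufs per second.
    slope = sum((t - mean_t) * (y - mean_y) for t, y in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_t
    t_hit = (limit - intercept) / slope
    return max(0.0, t_hit - samples[-1][0])
```

For example, with made-up samples `[(0, 100), (10, 200), (20, 300)]` and a limit of 1000, the fitted slope is 10 mbufs per second and the function returns 70.0 seconds.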

The number of mbuf *clusters* in use does not change when this happens, and increasing the mbuf cluster limits has no effect:

https://i.stack.imgur.com/7OzdN.png

This looks like a kernel bug of some sort to me. I'm looking for advice on further troubleshooting, or assistance in resolving this!

Helpful (maybe) information:

netstat -m:

679270/3080/682350 mbufs in use (current/cache/total)
10243/1657/11900/985360 mbuf clusters in use (current/cache/total/max)
10243/1648 mbuf+clusters out of packet secondary zone in use (current/cache)
8128/482/8610/124025 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/36748 9k jumbo clusters in use (current/cache/total/max)
128/0/128/20670 16k jumbo clusters in use (current/cache/total/max)
224863K/6012K/230875K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
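
For reference, the kind of check described earlier (alert when mbufs in use approach `kern.ipc.nmbufs`) could be sketched roughly as below. This is not our actual Check_MK plugin: the function names and the 80%/90% thresholds are assumptions, and the limit is passed in by the caller (for example from `sysctl -n kern.ipc.nmbufs`). The return codes follow the usual Nagios convention of 0 = OK, 1 = WARNING, 2 = CRITICAL.

```python
import re

def parse_mbufs_in_use(netstat_m_output):
    """Return (current, cache, total) from the 'mbufs in use' line."""
    m = re.search(r"^(\d+)/(\d+)/(\d+) mbufs in use",
                  netstat_m_output, re.MULTILINE)
    if m is None:
        raise ValueError("no 'mbufs in use' line found")
    current, cache, total = (int(g) for g in m.groups())
    return current, cache, total

def check_mbufs(netstat_m_output, nmbufs_limit, warn=0.80, crit=0.90):
    """Compare current mbufs against the limit; thresholds are assumptions."""
    current, _, _ = parse_mbufs_in_use(netstat_m_output)
    ratio = current / nmbufs_limit
    if ratio >= crit:
        code = 2  # CRITICAL
    elif ratio >= warn:
        code = 1  # WARNING
    else:
        code = 0  # OK
    return code, ratio

# First line of the netstat -m output above.
SAMPLE = "679270/3080/682350 mbufs in use (current/cache/total)\n"
```

With the line above and a limit of 1587540 (the mbuf zone LIMIT shown by `vmstat -z`), the ratio is about 0.43, so this sketch would still report OK.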

vmstat -z|grep -E '^ITEM|mbuf':

ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
mbuf_packet:            256, 1587540,   10239,    1652,84058893,   0,   0
mbuf:                   256, 1587540,  671533,    1206,914478880,   0,   0
mbuf_cluster:          2048, 985360,   11891,       9,   11891,   0,   0
mbuf_jumbo_page:       4096, 124025,    8128,     512,15011847,   0,   0
mbuf_jumbo_9k:         9216,  36748,       0,       0,       0,   0,   0
mbuf_jumbo_16k:       16384,  20670,     128,       0,     128,   0,   0
mbuf_ext_refcnt:          4,      0,       0,       0,       0,   0,   0
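
To put the zone numbers above in perspective, here is a small sketch that parses `vmstat -z` lines and computes how full each zone is and how many bytes its used items occupy. The helper names (`parse_vmstat_z`, `utilization`, `used_bytes`) are hypothetical, and it assumes the seven-column comma-separated layout shown above, with a LIMIT of 0 meaning no limit.

```python
def parse_vmstat_z(text):
    """Parse 'name: size, limit, used, free, req, fail, sleep' lines."""
    zones = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header or blank line
        fields = [f.strip() for f in rest.split(",")]
        if len(fields) != 7 or not all(f.isdigit() for f in fields):
            continue
        size, limit, used, free, req, fail, sleep = (int(f) for f in fields)
        zones[name.strip()] = {"size": size, "limit": limit, "used": used,
                               "free": free, "req": req, "fail": fail,
                               "sleep": sleep}
    return zones

def utilization(zone):
    """USED / LIMIT, or None when the zone has no limit (LIMIT == 0)."""
    if zone["limit"] == 0:
        return None
    return zone["used"] / zone["limit"]

def used_bytes(zone):
    """Bytes held by items currently in use."""
    return zone["used"] * zone["size"]

# A few rows copied from the output above.
SAMPLE = """\
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
mbuf:                   256, 1587540,  671533,    1206,914478880,   0,   0
mbuf_cluster:          2048, 985360,   11891,       9,   11891,   0,   0
mbuf_ext_refcnt:          4,      0,       0,       0,       0,   0,   0
"""
```

On these rows the mbuf zone comes out at roughly 42% of its limit (671533 items of 256 bytes, about 172 MB), while mbuf_cluster sits at roughly 1%, which matches the graphs: plain mbufs grow, clusters do not.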

vmstat -m:

         Type InUse MemUse HighUse Requests  Size(s)
 NFSD lckfile     1     1K       -        1  256
     filedesc   103   383K       -  1134731  16,32,128,2048,4096,8192,16384,65536
        sigio     1     1K       -        1  64
     filecaps     0     0K       -      973  64
      kdtrace   292    59K       -  1099386  64,256
         kenv   121    13K       -      125  16,32,64,128,8192
       kqueue    14    22K       -     5374  256,2048,8192
    proc-args    54     5K       -   578448  16,32,64,128,256
        hhook     2     1K       -        2  256
      ithread   146    24K       -      146  32,128,256
       KTRACE   100    13K       -      100  128
       NFS fh     1     1K       -      584  32
       linker   207  1052K       -      234  16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
        lockf    29     3K       -    20042  64,128
   loginclass     2     1K       -     1192  64
       devbuf 17205 36362K       -    17523  16,32,64,128,256,512,1024,2048,4096,8192,65536
         temp   149    51K       -  1280113  16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
       ip6opt     5     2K       -        6  256
       ip6ndp    27     2K       -       27  64,128
       module   230    29K       -      230  128
     mtx_pool     2    16K       -        2  8192
          osd     3     1K       -        5  16,32,64
     pmchooks     1     1K       -        1  128
         pgrp    30     4K       -     2222  128
      session    29     4K       -     2187  128
         proc     2    32K       -        2  16384
      subproc   211   368K       -  1099014  512,4096
         cred   204    32K       -  6025704  64,256
       plimit    19     5K       -     3985  256
      uidinfo     9     5K       -    11892  128,4096
 NFSD session     1     1K       -        1  1024
       sysctl     0     0K       -    63851  16,32,64
    sysctloid  7196   365K       -     7369  16,32,64,128
    sysctltmp     0     0K       -    17834  16,32,64,128
      tidhash     1    32K       -        1  32768
      callout     5  2184K       -        5  
         umtx   522    66K       -      522  128
     p1003.1b     1     1K       -        1  16
         SWAP     2   549K       -        2  64
          bus   802    86K       -     6536  16,32,64,128,256,1024
       bus-sc    57  1671K       -     2431  16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
    newnfsmnt     1     1K       -        1  1024
      devstat     8    17K       -        8  32,4096
 eventhandler   116    10K       -      116  64,128
         kobj   124   496K       -      296  4096
     acpiintr     1     1K       -        1  64
      Per-cpu     1     1K       -        1  32
       acpica 14355  1420K       -   216546  16,32,64,128,256,512,1024,2048,4096
     pci_link    16     2K       -       16  64,128
    pfs_nodes    21     6K       -       21  256
         rman   316    37K       -      716  16,32,128
         sbuf     1     1K       -    41375  16,32,64,128,256,512,1024,2048,4096,8192,16384
       sglist     8     8K       -        8  1024
         GEOM    88    15K       -     1871  16,32,64,128,256,512,1024,2048,8192,16384
      acpipwr     5     1K       -        5  64
    taskqueue    43     7K       -       43  16,32,256
       Unitno    22     2K       -  1208250  32,64
         vmem     3   144K       -        6  1024,4096,8192
     ioctlops     0     0K       -   185700  256,512,1024,2048,4096
       select    89    12K       -       89  128
          iov     0     0K       - 19808992  16,64,128,256,512,1024
          msg     4    30K       -        4  2048,4096,8192,16384
          sem     4   106K       -        4  2048,4096
          shm     1    32K       -        1  32768
          tty    20    20K       -      499  1024
          pts     1     1K       -      480  256
         accf     2     1K       -        2  64
     mbuf_tag     0     0K       - 291472282  32,64,128
        shmfd     1     8K       -        1  8192
       soname    32     4K       -  1210442  16,32,128
          pcb    36   663K       -    76872  16,32,64,128,1024,2048,8192
      CAM CCB     0     0K       -   182128  2048
          acl     0     0K       -        2  4096
     vfscache     1  2048K       -        1  
   cl_savebuf     0     0K       -      480  64
     vfs_hash     1  1024K       -        1  
       vnodes     1     1K       -        1  256
      entropy  1026    65K       -    49107  32,64,4096
        mount    64     3K       -      140  16,32,64,128,256
  vnodemarker     0     0K       -     4212  512
          BPF   112 20504K       -      131  16,64,128,512,4096
     CAM path    11     1K       -       63  32
        ifnet    29    57K       -       30  128,256,2048
       ifaddr   315   105K       -      315  32,64,128,256,512,2048,4096
  ether_multi   232    13K       -      282  16,32,64
        clone    10     2K       -       10  128
       arpcom    23     1K       -       23  16
          gif     4     1K       -        4  32,256
      lltable   155    53K       -      551  256,512
         UART     6     5K       -        6  16,1024
         vlan    56     5K       -       74  64,128
     acpitask     1    16K       -        1  16384
      acpisem   110    14K       -      110  128
    raid_data     0     0K       -      108  32,128,256
     routetbl   516   136K       -   101735  32,64,128,256,512
         igmp    28     7K       -       28  256
         CARP    76    30K       -       83  16,32,64,128,256,512,1024
         ipid     2    24K       -        2  8192,16384
   in_mfilter   112   112K       -      112  1024
     in_multi    43    11K       -       43  256
  ip_moptions   224    35K       -      224  64,256
   CAM periph     7     2K       -       19  16,32,64,128,256
      acpidev   128     8K       -      128  64
    CAM queue    15     5K       -       39  16,32,512
encap_export_host     4     4K       -        4  1024
    sctp_a_it     0     0K       -       36  16
     sctp_vrf     1     1K       -        1  64
     sctp_ifa   115    15K       -      204  128
     sctp_ifn    21     3K       -       23  128
    sctp_iter     0     0K       -       36  256
    hostcache     1    32K       -        1  32768
     syncache     1    64K       -        1  65536
  in6_mfilter     1     1K       -        1  1024
    in6_multi    15     2K       -       15  32,256
 ip6_moptions     2     1K       -        2  32,256
CAM dev queue     6     1K       -        6  64
       kbdmux     6    22K       -        6  16,512,1024,2048,16384
          mld    26     4K       -       26  128
          LED    20     2K       -       20  16,128
  inpcbpolicy   365    12K       -   119277  32
     secasvar     7     2K       -      214  256
       sahead    10     3K       -       10  256
  ipsecpolicy   748   187K       -   241562  256
 ipsecrequest    18     3K       -       72  128
   ipsec-misc    56     2K       -     1712  16,32,64
    ipsec-saq     0     0K       -       24  128
    ipsec-reg     3     1K       -        3  32
       pfsync     2     2K       -      893  32,256,1024
      pf_temp     0     0K       -       78  128
      pf_hash     3  2880K       -        3  
     pf_ifnet    36    11K       -     9510  256,2048
       pf_tag     7     1K       -        7  128
      pf_altq     5     2K       -      125  256
      pf_rule   964   904K       -    17500  128,1024
      pf_osfp  1130   115K       -    28250  64,128
     pf_table    49    98K       -      948  2048
       crypto    37    11K       -     1072  64,128,256,512,1024
        xform     7     1K       -  1530156  16,32,64,128,256
          rpc    12    20K       -      304  64,128,512,1024,8192
audit_evclass   187     6K       -      231  32
  ufs_dirhash    93    18K       -       93  16,32,64,128,256,512
    ufs_quota     1  1024K       -        1  
    ufs_mount     3    13K       -        3  512,4096,8192
    vm_pgdata     2   513K       -        2  128
      UMAHash     5     6K       -       10  512,1024,2048
      CAM SIM     6     2K       -        6  256
      CAM XPT    30     3K       -     1850  16,32,64,128,256,512,1024,2048,65536
      CAM DEV     9    18K       -       16  2048
  fpukern_ctx     3     6K       -        3  2048
      memdesc     1     4K       -        1  4096
          USB    23    33K       -       24  16,128,256,512,1024,2048,4096
       DEVFS3   136    34K       -     2027  256
       DEVFS1   108    54K       -      594  512
       apmdev     1     1K       -        1  128
   madt_table     0     0K       -        1  4096
   DEVFS_RULE    55    26K       -       55  64,512
        DEVFS    12     1K       -       13  16,128
       DEVFSP    22     2K       -      167  64
      io_apic     1     2K       -        1  2048
       isadev     8     1K       -        8  128
          MCA    15     2K       -       15  32,128
          msi    30     4K       -       30  128
     nexusdev     5     1K       -        5  16
       USBdev    21     8K       -       21  32,64,128,256,512,1024,4096
NFSD V4client     1     1K       -        1  256
         cdev     5     2K       -        5  256
        cxgbe    41   956K       -       44  128,256,512,1024,2048,4096,8192,16384
         ipmi     0     0K       -    20155  128,2048
    htcp data   127     4K       -    13675  32
   aesni_data     3     3K       -        3  1024
      solaris   142 12302K       -     3189  16,32,64,128,512,1024,8192
   kstat_data     6     1K       -        6  64
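
Since the `vmstat -m` listing is long, a helper that ranks malloc types by MemUse can make the largest consumers easier to spot. This is only a sketch: `top_by_memuse` and its regular expression are assumptions based on the column layout above (type names may contain spaces, and MemUse is always given in K).

```python
import re

# Type name, InUse, MemUse (K), HighUse ('-'), Requests, optional sizes.
_ROW = re.compile(r"^\s*(.+?)\s+(\d+)\s+(\d+)K\s+-\s+(\d+)\s*(.*)$")

def top_by_memuse(text, n=5):
    """Return the n largest (type, memuse_kb) pairs, biggest first."""
    rows = []
    for line in text.splitlines():
        m = _ROW.match(line)
        if m is None:
            continue  # header or unrecognized line
        rows.append((m.group(1), int(m.group(3))))
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:n]

# A few rows copied from the output above.
SAMPLE = "\n".join([
    "         Type InUse MemUse HighUse Requests  Size(s)",
    " NFSD lckfile     1     1K       -        1  256",
    "          BPF   112 20504K       -      131  16,64,128,512,4096",
    "      pf_hash     3  2880K       -        3",
    "      solaris   142 12302K       -     3189  16,32,64,128,512,1024,8192",
])
```

Running `top_by_memuse(SAMPLE, 2)` on these rows yields BPF (20504K) followed by solaris (12302K).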

TCP States:

https://i.stack.imgur.com/G7850.png


--
Josh Gitlin
Senior Full Stack Developer
(415) 690-1610 x155




More information about the freebsd-net mailing list