FreeBSD 10 network performance problems
Oliver Pinter
oliver.pntr at gmail.com
Sun Sep 21 22:51:14 UTC 2014
On 9/20/14, Rumen Telbizov <telbizov at gmail.com> wrote:
> Hello everyone,
>
> I am in the process of upgrading our main PF firewalls from 9.2-RC4 to
> 10.1-BETA1 (r271684) and as part of the process I have been testing the
> forwarding capability of FreeBSD 10 (pf firewall disabled) to establish a
> baseline and find any bottlenecks on a 10GbE network.
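>
> For reference, the forwarding-only baseline is plain IP routing with pf taken
> out of the picture; the knobs involved are roughly the following (these are
> the standard settings, not copied from the actual config):
>
> # disable pf entirely for the baseline runs
> pfctl -d
> # route packets between the interfaces, using the fast-forwarding path
> sysctl net.inet.ip.forwarding=1
> sysctl net.inet.ip.fastforwarding=1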
>
> My tests show that for the first 3-4Gbit/s of traffic things are great and
> then the machine simply "hits the wall" at around 4-5Gbit/s of traffic.
> There's no gradual degradation, just a hard drop to 0% idle with the load
> split roughly 50/50 between system and interrupt time. I ran some
> diagnostics and I was hoping
> someone could point me in the right direction as to what is happening and
> what I could do to improve the situation. The details I have collected so
> far are below:
>
> I run multiple multithreaded iperf TCP instances to generate traffic that
> traverses the firewall. As mentioned above, for the first 3-4Gbit/s of
> traffic the machine doesn't even break a sweat.
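>
> The traffic generation itself is nothing special; it is essentially several
> parallel iperf TCP streams crossing the firewall, along these lines (hosts
> and addresses below are placeholders):
>
> # on the hosts behind the firewall
> iperf -s
> # on the sending hosts: several instances, 8 parallel streams each
> iperf -c 10.0.1.10 -P 8 -t 60 &
> iperf -c 10.0.1.11 -P 8 -t 60 &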
>
> *top header* during this load when things are good:
> CPU 0: 0.0% user, 0.0% nice, 0.0% system, 27.6% interrupt, 72.4% idle
> CPU 1: 0.0% user, 0.0% nice, 0.0% system, 27.6% interrupt, 72.4% idle
> CPU 2: 0.0% user, 0.0% nice, 0.0% system, 8.7% interrupt, 91.3% idle
> CPU 3: 0.0% user, 0.0% nice, 0.0% system, 17.3% interrupt, 82.7% idle
> CPU 4: 0.0% user, 0.0% nice, 0.0% system, 12.6% interrupt, 87.4% idle
> CPU 5: 0.0% user, 0.0% nice, 0.0% system, 4.7% interrupt, 95.3% idle
> CPU 6: 0.0% user, 0.0% nice, 0.0% system, 13.4% interrupt, 86.6% idle
> CPU 7: 0.0% user, 0.0% nice, 0.0% system, 26.0% interrupt, 74.0% idle
> CPU 8: 0.0% user, 0.0% nice, 0.0% system, 16.5% interrupt, 83.5% idle
> CPU 9: 0.0% user, 0.0% nice, 0.0% system, 1.6% interrupt, 98.4% idle
> CPU 10: 0.0% user, 0.0% nice, 0.0% system, 19.7% interrupt, 80.3% idle
> CPU 11: 0.0% user, 0.0% nice, 0.0% system, 7.1% interrupt, 92.9% idle
> Full output at http://pastebin.com/gaaisXV8
>
> *bmon* at the same time:
> Interfaces │ RX bps      RX pps   │ TX bps      TX pps
> ix0        │ 240.33MiB   242.20K  │ 221.41MiB   236.68K
> ix1        │ 246.51MiB   248.80K  │ 261.61MiB   250.95K
> lagg0      │ 483.45MiB   492.42K  │ 479.54MiB   488.02K
>
> [bmon rate graphs over the last 60 seconds: RX/TX MiB/s and RX/TX
> Kpackets/s, all steady at roughly 480-500 MiB/s and 490-510 Kpps]
>
> To profile the machine without the extra load pushing it over the edge
> sooner, I reduced traffic to 2Gbit/s and ran dtrace hotkernel and lock
> profiling. Here are the results:
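>
> (Roughly how these were gathered - lock profiling needs a kernel built with
> options LOCK_PROFILING, and good-locks/bad-locks below are just the dump
> file names:)
>
> # sample the kernel for ~60 seconds, then Ctrl-C
> /usr/share/dtrace/toolkit/hotkernel
> # lock profiling over the same interval
> sysctl debug.lock.prof.reset=1
> sysctl debug.lock.prof.enable=1
> sleep 60
> sysctl debug.lock.prof.enable=0
> sysctl debug.lock.prof.stats > good-locks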
>
> */usr/share/dtrace/toolkit/hotkernel* for 60 seconds:
> kernel`__mtx_lock_flags 5812 0.8%
> kernel`__mtx_unlock_flags 7200 1.0%
> kernel`acpi_cpu_c1 7799 1.1%
> kernel`__rw_rlock 11196 1.5%
> kernel`spinlock_exit 14547 2.0%
> kernel`cpu_idle 166700 22.8%
> kernel`sched_idletd 461883 63.1%
> Full output at http://pastebin.com/w7WfFwPG
>
> *lock profiling* for 60 seconds:
> $ head -n 2 good-locks ; cat good-locks | sort -n -k 4 | tail -n6
> debug.lock.prof.stats:
>     max  wait_max     total  wait_total     count  avg  wait_avg  cnt_hold  cnt_lock  name
>      22        24     94378       18481    264549    0         0         0     18639  /usr/src/sys/kern/sched_ule.c:886 (spin mutex:sched lock 10)
>      22        24    112366       20360    219179    0         0         0     17220  /usr/src/sys/kern/sched_ule.c:888 (spin mutex:sched lock 2)
>      18       319      3486       22352      4233    0         5         0      1640  /usr/src/sys/kern/subr_taskqueue.c:344 (sleep mutex:taskqueue)
>      26        66   3219768      204875  14616220    0         0         0    133154  /usr/src/sys/net/route.c:439 (sleep mutex:rtentry)
>      25        90   1923012     2353820  14615738    0         0         0   1562097  /usr/src/sys/netinet/ip_fastfwd.c:593 (sleep mutex:rtentry)
>      26        91   1398443     2391458  14616137    0         0         0   1604332  /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry)
> Full output at http://pastebin.com/qiG3ZuAH
>
> Again, the measurements above show the state of the good/healthy system
> under a moderate traffic load of 3-4Gbit/s with pf disabled and fast
> forwarding enabled. Here are the same measurements after I add an additional
> 1Gbit/s of traffic. The point at which performance tanks varies between
> 4 and 5Gbit/s, but it is always sudden: there is no gradual degradation,
> idle simply drops to 0. Let's take a look:
>
> *top header* during this load when things are bad:
> CPU 0: 0.0% user, 0.0% nice, 44.6% system, 55.4% interrupt, 0.0% idle
> CPU 1: 0.0% user, 0.0% nice, 15.1% system, 84.9% interrupt, 0.0% idle
> CPU 2: 0.0% user, 0.0% nice, 59.0% system, 40.3% interrupt, 0.7% idle
> CPU 3: 0.0% user, 0.0% nice, 57.6% system, 42.4% interrupt, 0.0% idle
> CPU 4: 0.0% user, 0.0% nice, 34.5% system, 63.3% interrupt, 2.2% idle
> CPU 5: 0.0% user, 0.0% nice, 51.8% system, 48.2% interrupt, 0.0% idle
> CPU 6: 0.0% user, 0.0% nice, 33.8% system, 66.2% interrupt, 0.0% idle
> CPU 7: 0.7% user, 0.0% nice, 49.6% system, 49.6% interrupt, 0.0% idle
> CPU 8: 0.0% user, 0.0% nice, 66.2% system, 33.8% interrupt, 0.0% idle
> CPU 9: 0.0% user, 0.0% nice, 35.3% system, 64.7% interrupt, 0.0% idle
> CPU 10: 0.0% user, 0.0% nice, 54.7% system, 45.3% interrupt, 0.0% idle
> CPU 11: 0.0% user, 0.0% nice, 34.5% system, 65.5% interrupt, 0.0% idle
> Full output at http://pastebin.com/9an9ZWv2
>
> *bmon *at the same time shows inconsistent performance with big dips:
> Interfaces │ RX bps      RX pps   │ TX bps      TX pps
> ix0        │ 153.89MiB   151.53K  │ 179.69MiB   159.91K
> ix1        │ 176.56MiB   161.29K  │ 145.17MiB   148.13K
> lagg0      │ 327.23MiB   333.05K  │ 322.14MiB   328.13K
>
> [bmon rate graphs over the last 60 seconds: RX/TX MiB/s and RX/TX
> Kpackets/s, very uneven, mostly hovering around 300-330 MiB/s and 330 Kpps
> with frequent dips and only brief spikes toward ~650 MiB/s and ~670 Kpps]
>
> */usr/share/dtrace/toolkit/hotkernel* for 60 seconds:
> kernel`_rw_runlock_cookie 7709 1.1%
> kernel`__rw_rlock 11182 1.6%
> kernel`ip_fastforward 12231 1.7%
> kernel`__mtx_lock_flags 22004 3.1%
> kernel`__mtx_unlock_flags 35614 5.0%
> kernel`__mtx_lock_sleep 560768 78.5%
> Full output at http://pastebin.com/NurKwkWL
>
> *lock profiling* for 60 seconds:
> $ head -n 2 bad-locks ; cat bad-locks | sort -n -k 4 | tail -n6
> debug.lock.prof.stats:
>     max  wait_max     total  wait_total     count    avg  wait_avg  cnt_hold  cnt_lock  name
>  401766    191987   1857397      194162       179  10376      1084         0         2  /usr/src/sys/kern/kern_sysctl.c:1601 (sx:sysctl mem)
>   21064    207907     62556      249066       396    157       628         0         3  /usr/src/sys/kern/kern_sysctl.c:1499 (sleep mutex:Giant)
>       1    370663        17      372573        86      0      4332         0         2  /usr/src/sys/kern/kern_exit.c:429 (sx:allproc)
>      14       648   8856844    46296098  15513956      0         2         0   1467849  /usr/src/sys/net/route.c:439 (sleep mutex:rtentry)
>      15       958  13107581   252445472  15513486      0        16         0   9444644  /usr/src/sys/netinet/ip_fastfwd.c:593 (sleep mutex:rtentry)
>      12       779  12500816   286324556  15513872      0        18         0   9788497  /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry)
> Full output at http://pastebin.com/d54QmP13
>
> System hardware: 12 x E5-2440 @ 2.40GHz, 24GB RAM, dual-port fiber
> Intel(R) PRO/10GbE in an LACP lagg.
> System configuration details available at http://pastebin.com/tPvs1MeD
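>
> The lagg is the usual LACP aggregation of the two ix ports, i.e. the
> standard rc.conf pattern below (the address is a placeholder; the real
> values are in the pastebin above):
>
> ifconfig_ix0="up"
> ifconfig_ix1="up"
> cloned_interfaces="lagg0"
> ifconfig_lagg0="laggproto lacp laggport ix0 laggport ix1 10.0.0.1 netmask 255.255.255.0"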
>
> It looks to me like heavy lock contention, but I don't understand why it
> sets in so abruptly once a given threshold is crossed.
> Other things I tried:
>
> - Upgrading the ix (ixgbe) driver to the latest from Intel (2.5.25) - for
> some reason I then cannot send any packets out
> - Enabling flowtables - no difference
>
> Any help is highly appreciated. I could provide further details and run
> additional tests upon request.
>
> Regards,
> --
> Rumen Telbizov
> Unix Systems Administrator <http://telbizov.com>
Try re-enabling idle ticks. I observed a similar problem with the em driver.
sysctl kern.eventtimer.idletick=1
sysctl kern.eventtimer.periodic=1
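
If that helps, the same settings can be kept across reboots by adding them to
/etc/sysctl.conf:

kern.eventtimer.idletick=1
kern.eventtimer.periodic=1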