OpenSSH HPN
John-Mark Gurney
jmg at funkthat.com
Sat Nov 14 19:34:28 UTC 2015
Ok, some new testing... nc w/ 8MB sockbuf and {send,recv}buf_max can
get over 50MB/sec using nc and a 100ms RTT.. Guess it's a good day...
The hickups that I see look to be a buffer overrun on the tap or bridge
interface, as I did see some errors, and on the sender side, the TCP
protocol stats showed some retransmits..
A quick recap of the info below at 50ms RTT:
server
client HEAD w/ HPN stock 7.1p1
HEAD w/ HPN 1MB/sec 6.5MB/sec
stock 7.1p1 2MB/sec 7MB/sec
TL;DR: HPN patches hurt no matter how you look at things...
client HEAD w/ HPN, the tcpwinsz is always 66608
the server HEAD w/ HPN, the rcvd adjust is always 49152
the server stock 7.1p1, the rcvd adjust is always 131072
client HPN:
OpenSSH_6.6.1p1, OpenSSL 1.0.1m-freebsd 19 Mar 2015
client stock:
OpenSSH_7.1p1, OpenSSL 1.0.1m-freebsd 19 Mar 2015
server HPN:
OpenSSH_6.6.1p1, OpenSSL 1.0.1o-freebsd 12 Jun 2015
server stock:
OpenSSH_7.1p1, OpenSSL 1.0.1o-freebsd 12 Jun 2015
All tests below were done w/ a 25ms pipe delay, resulting in a 50ms
RTT...
Baseline nc results in ~55MB/sec, but this is cpu limited by bhyve...
The dual stock 7.1p is only using about a quarter of the cpu, so not
at all cpu limited...
Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rexmt persist keep 2msl delack rcvtime
tcp4 0 2851352 192.168.0.21.44972 192.168.0.50.3845 0 1052 0 1043 66608 2851352 1 2048 0 4262912 532864 8388608 0.26 0.00 7199.99 0.00 0.00 0.00
I forgot about -vvv... So, w/ HEAD ssh (w/ HPN) on both sides, the
transmitter, I see lots of:
debug2: channel 0: rcvd adjust 49152
debug2: tcpwinsz: 66608 for connection: 3
debug2: tcpwinsz: 66608 for connection: 3
debug2: tcpwinsz: 66608 for connection: 3
debug2: tcpwinsz: 66608 for connection: 3
debug2: tcpwinsz: 66608 for connection: 3
And the send-q is 49312:
Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rexmt persist keep 2msl delack rcvtime
tcp4 0 49312 192.168.0.21.51135 192.168.0.50.22 0 15 0 13 66608 98840 1 2048 0 55040 532864 790720 0.21 0.00 7199.95 0.00 0.00 0.04
And adjusting the delay to 0 shows ssh getting 50MB/sec, so it should
be possible... adjusting the delay shows a linear change in bandwidth
(200ms -> 500KB/sec, 100ms -> 1MB/sec, 50ms -> 2MB/sec), so looks like
ssh is artifically limitting the window...
If I run w/ stock, compiled from source, OpenSSH 7.1p1 on the server
side (receiver side) and HEAD ssh on client side, I get ~6.5MB/sec..
w/ -vvv I see:
debug2: channel 0: rcvd adjust 131072
debug2: tcpwinsz: 66608 for connection: 3
debug2: tcpwinsz: 66608 for connection: 3
debug2: tcpwinsz: 66608 for connection: 3
[this line repeated tons of times]
In this case, I do see the send-q stay at around 229912:
Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rexmt persist keep 2msl delack rcvtime
tcp4 0 229912 192.168.0.21.55975 192.168.0.50.22 0 77 0 77 66608 229912 1 2048 0 316672 532864 1839296 0.24 0.00 7199.99 0.00 0.00 0.00
With compiled from source, OpenSSH 7.1.p1 on both sides, I get around
7MB/sec and -vvv gives:
debug2: channel 0: rcvd adjust 131072
And the send-q stays around 229912:
Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rexmt persist keep 2msl delack rcvtime
tcp4 0 229912 192.168.0.21.10699 192.168.0.50.22 0 80 0 80 66608 229912 1 2048 0 339968 532864 1839296 0.25 0.00 7199.99 0.00 0.00 0.00
With compiled from source on the client, and HEAD OpenSSH w/ HPN on
the server side, 2MB/sec, -vvv shows:
debug2: channel 0: rcvd adjust 49152
and the send-q stays around 49296:
Proto Recv-Q Send-Q Local Address Foreign Address R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX rexmt persist keep 2msl delack rcvtime
tcp4 0 49296 192.168.0.21.43918 192.168.0.50.22 0 15 0 13 66608 98840 1 2048 0 55040 532864 790720 0.20 0.00 7199.95 0.00 0.00 0.04
/etc/ssh/ssh{,d}_config are all stock (except PermitRootLogin is yes
on the VM), and the client side's .ssh/config only contains a single
commented out line...
P.S. I've had two panics on the dummynet box (running nc) the last being:
Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xffffffff80963615
stack pointer = 0x28:0xfffffe001de65580
frame pointer = 0x28:0xfffffe001de655d0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (dummynet)
[ thread pid 0 tid 100084 ]
Stopped at __mtx_lock_flags+0x55: movq (%r13),%rax
db> bt
Tracing pid 0 tid 100084 td 0xfffff8000505c000
__mtx_lock_flags() at __mtx_lock_flags+0x55/frame 0xfffffe001de655d0
doselwakeup() at doselwakeup+0xb5/frame 0xfffffe001de65610
sowakeup() at sowakeup+0x3b/frame 0xfffffe001de65640
tcp_do_segment() at tcp_do_segment+0x26b9/frame 0xfffffe001de65730
tcp_input() at tcp_input+0xfb7/frame 0xfffffe001de65880
ip_input() at ip_input+0x175/frame 0xfffffe001de658e0
netisr_dispatch_src() at netisr_dispatch_src+0x86/frame 0xfffffe001de65950
dummynet_send() at dummynet_send+0x153/frame 0xfffffe001de65980
dummynet_task() at dummynet_task+0x2e3/frame 0xfffffe001de659e0
taskqueue_run_locked() at taskqueue_run_locked+0xf0/frame 0xfffffe001de65a40
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe001de65a70
fork_exit() at fork_exit+0x84/frame 0xfffffe001de65ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001de65ab0
--- trap 0, rip = 0, rsp = 0xfffffe001de65b70, rbp = 0 ---
db>
I wasn't able to get a dump because my swap was misconfigured...
--
John-Mark Gurney Voice: +1 415 225 5579
"All that I will do, has been done, All that I have, has not."
More information about the freebsd-current
mailing list