tcp_output starving -- is due to mbuf get delay?

Borje Josefsson bj at
Sat Apr 12 03:37:30 PDT 2003

On Fri, 11 Apr 2003 15:07:59 PDT Terry Lambert wrote:

> Borje Josefsson wrote:

> > I did a quick test with some combinations of the OIDs you sent, except I
> > didn't reboot between each test:
> The reboots were intended to keep the statistics counters relatively
> accurate between the FreeBSD and NetBSD sender runs.  By doing that,
> you can tell if what's happening on the receiver is the same for
> both sender machines.  If you don't reboot, then the statistics are
> polluted with other traffic, and can't be compared.

OK. I found the -z flag to netstat, which clears the counters.
Unfortunately NetBSD lacks this flag, so I rebooted that host several
times :-( Just to be sure I didn't have anything "old" lying around, I
rebooted the FreeBSD host before I started.
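
Where a counter reset is not available, an alternative to rebooting is to save the "netstat -s" output before and after a run and subtract. A minimal sketch in Python, assuming every statistic line starts with an integer count followed by its label; parse_counters and counter_deltas are hypothetical helper names, not part of any tool used in this thread. The sample lines are the ack-only counters quoted further down in this message.

```python
import re

# A statistic line starts with a count, e.g. "  305178 packets sent".
_STAT_LINE = re.compile(r"^\s*\d+\s+\S")
_DIGITS = re.compile(r"\d+")


def parse_counters(text):
    """Map each label (numbers replaced by '#') to the integers on that line.

    Assumption: labels are unique within the text; a repeated label
    overwrites the earlier entry.
    """
    counters = {}
    for line in text.splitlines():
        if not _STAT_LINE.match(line):
            continue
        label = _DIGITS.sub("#", line.strip())
        counters[label] = [int(n) for n in _DIGITS.findall(line)]
    return counters


def counter_deltas(before, after):
    """Return {label: [after - before, ...]} for counters that changed."""
    deltas = {}
    for label, new in after.items():
        old = before.get(label)
        if old is None or len(old) != len(new):
            continue
        diff = [n - o for n, o in zip(new, old)]
        if any(diff):
            deltas[label] = diff
    return deltas


before = "    6446084 ack-only packets (199 delayed)\n"
after = "    6446155 ack-only packets (207 delayed)\n"

deltas = counter_deltas(parse_counters(before), parse_counters(after))
for label, diff in deltas.items():
    print(label, diff)
```

Keying on the label with the numbers masked out lets the before and after snapshots line up even though the counts themselves differ.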

> You should also start clean on each sender, and get the same stats
> on the sender.  I would add "vmstat -i", to look at interrupt overhead.
> Note that FreeBSD jumbograms are in external mbufs allocated to the
> cards on receive.  On transmit, they are scatter/gathered.  NetBSD
> might not have this overhead.  The copy overhead there could account
> for a lot of CPU time.

I think NetBSD-current claims to do zero-copy transfers.
I have added Anders Magnusson to the CC: of this mail; he knows a great
deal about NetBSD networking internals and can surely fill in more
details on this.

> > Netstat -m (tcp and ip portion) when I started and after the trials:
> Side-by-side/interleaved is more useful.  I will do it manually for
> the ones that change; if we continue this discussion, you get to do
> the work in the future (b=before, A=after)

> b>                 0 resends initiated by MTU discovery
> b>                 6446084 ack-only packets (199 delayed)
> A>                 6446155 ack-only packets (207 delayed)
> 		        71                     8
> This is odd.  You must be sending data in both directions.  Thus
> the lower bandwidth could be the result of negotiated options; you
> may want to try turning _on_ rfc1644.

Did that. No difference in performance.

> The delayed ACKs are bad.  Can you either set "PUSH" on the socket,
> or turn off delayed ACK entirely?

Did that (tcp.delayed_ack=0). No apparent difference.

> All in all, there's not a lot of weird stuff going on; now you need
> to look at the NetBSD vs. the FreeBSD transmitters, in a similar
> way, get the deltas for both, and then compare them to each other.
> A really important thing to look at is the "vmstat -i" I asked for
> earlier, in order to get interrupt counts on the transmitter.  Most
> likely, there is a driver difference causing the "problem"; you
> should be able to see this in a differential for the transmit
> interrupt overhead being higher on the FreeBSD box.
> It would also be very interesting to compare the netstat numbers
> between the transmitters, as suggested above; the numbers should
> tell you about differences in implementation on the driver side.

OK, here goes, as a first attempt to match sender and receiver data.

Apologies for the long lines - I have tried to "match" the appropriate
sender and receiver lines below. *Note* that there are no "before"
and "after" in the netstat figures; these are net values accumulated
during the test. In some cases there might be some odd packets
that have nothing to do with my ttcp test (since I access the hosts
remotely), but I ran everything from a shell script to file, so the
difference should be minor.

I'll await comments on the data below before doing anything more.


    sender=FreeBSD                      receiver=NetBSD

305178 packets sent                     305179 received
 305175 data packets (1249996800 bytes) 305176 packets (1249996800 bytes) in seq.
 0 data packets (0 bytes) retransmitted
 0 resends initiated by MTU discovery
 1 ack-only packet (0 delayed)
 0 URG only packets
 0 window probe packets
 0 window update packets                0 window update packets received
 2 control packets
205911 packets received                 206052 sent
 168215 acks (for 1249148976 bytes)     136850 ack-only packets (168328 delayed) sent
 0 duplicate acks
 0 acks for unsent data
 0 packets (0 bytes) received in-seq
 0 completely duplicate packets
 0 old duplicate packets
 0 packets with some dup. data
 0 out-of-order packets (0 bytes)
 0 packets of data after window
 0 window probes
 37696 window update packets            69201 window update packets sent
 0 packets received after close
 0 discarded for bad checksums
 0 discarded for bad header offset f.
 0 discarded because packet too short
168215 segments updated rtt (of 59609)  1 segments updated rtt (of 1 attempts)
9795 correct ACK header predictions     1 correct ACK header predictions
0 correct data packet header predict.   305175 correct data packet header predict.

205915 total packets received           206052 packets sent from this host
305179 packets sent from this host      305185 packets for this host
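
One quick sanity check on the figures above is the average payload per data packet (bytes divided by data packets) and the number of data packets per ACK seen by the sender. A small sketch using the sender-side numbers from this run; the function name is made up for illustration, and the result is only an average over the whole run.

```python
def avg_payload(total_bytes, data_packets):
    """Average TCP payload per data packet, in bytes."""
    return total_bytes / data_packets


# Sender-side figures from the FreeBSD -> NetBSD run above
freebsd_bytes = 1249996800
freebsd_data_packets = 305175
freebsd_acks_received = 168215

payload = avg_payload(freebsd_bytes, freebsd_data_packets)
packets_per_ack = freebsd_data_packets / freebsd_acks_received

print(f"average payload:  {payload:.1f} bytes")   # 4096.0 bytes
print(f"data packets/ACK: {packets_per_ack:.2f}")
```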

vmstat -i on *sender*     ===before===          ===after===
interrupt                total    rate        total     rate
ata0 irq14                   4       0            4        0
bge1 irq7                   48       0           48        0
mux irq11               372597     325       459967      396
mux irq10                   15       0           15        0
fdc0 irq6                    2       0            2        0
atkbd0 irq1                  1       0            1        0
clk irq0                114364      99       115893       99
rtc irq8                146388     127       148346      127
Total                   633419     553       724276      624

vmstat -i on *receiver*    ==before==           ==after==
interrupt                total     rate       total     rate
cpu0 softclock           16738       99       18687       99
cpu0 softnet               170        1       89848      480
cpu0 softserial              1        0           1        0
pic0 pin 11                264        1       90106      481
pic0 pin 14               1528        9        1564        8
pic0 pin 3                   1        0           1        0
pic0 pin 0               16910      100       18831      100
Total                    35612      211      219038     1171
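
The before/after columns of the "vmstat -i" tables can be reduced to per-run interrupt counts by subtracting the totals. A minimal sketch, assuming rows shaped like the ones above (name, then total and rate before, then total and rate after); interrupt_deltas is a hypothetical helper, and only some of the sender rows are repeated here to keep it short.

```python
def interrupt_deltas(table):
    """Return {source: after_total - before_total} for each data row."""
    deltas = {}
    for line in table.splitlines():
        fields = line.split()
        # Last four fields: before total, before rate, after total, after rate.
        # Header rows fail the digit check and are skipped.
        if len(fields) < 5 or not all(f.isdigit() for f in fields[-4:]):
            continue
        name = " ".join(fields[:-4])
        deltas[name] = int(fields[-2]) - int(fields[-4])
    return deltas


sender = """\
interrupt                total    rate        total     rate
bge1 irq7                   48       0           48        0
mux irq11               372597     325       459967      396
clk irq0                114364      99       115893       99
rtc irq8                146388     127       148346      127
Total                   633419     553       724276      624
"""

for name, delta in interrupt_deltas(sender).items():
    print(f"{name:12} {delta}")
```

Taking the name as everything before the last four fields keeps multi-word sources such as "pic0 pin 11" intact.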


    sender=NetBSD                          receiver=FreeBSD

*** tcp:
282935 packets sent                        282936 packets received
 282933 data packets (1249996800 bytes)    282933 packets (1249996800 bytes) received in-sequ
 0 data packets (0 bytes) retransmitted
 1 ack-only packets (32 delayed)           2 acks (for 49 bytes) received
 0 window probe packets
 0 window update packet
 1 control packet
 0 send attempts resulted in self-quench
187507 packets received                    187947 packets sent
 187131 acks (for 1247077744 bytes)        95364 ack-only packets (0 delayed) sent
 0 duplicate acks
 0 acks for unsent data
 0 packets (0 bytes) received in-sequence
 0 completely duplicate packets (0 bytes)
 0 old duplicate packets
 0 packets with some dup. data
 0 out-of-order packets (0 bytes)
 0 packets (0 bytes) of data after window
 0 window probes
 374 window update packets                 92582 window update packets sent
 0 packets received after close
 0 discarded for bad checksums
 0 discarded for bad header offset fields
 0 discarded because packet too short
1 connection request
0 connection accept
1 connections established (incl. accepts)  1 connection established (including accepts)
0 connection closed (including 0 drops)
0 embryonic connections dropped
182455 segments updated rtt (of 78677)     2 segments updated rtt (of 1 attempt)
0 retransmit timeouts
 0 connections dropped by rexmit timeout
0 persist timeouts
0 keepalive timeouts
 0 keepalive probes sent
 0 connections dropped by keepalive
14 correct ACK header predictions          1 correct ACK header prediction
                                           282931 correct data packet header predictions
0 correct data packet header pred.
0 PCB hash misses
0 dropped due to no socket
0 connections drained due to memory shortage
0 PMTUD blackholes detected
0 bad connection attempts
0 SYN cache entries added
        0 hash collisions
        0 completed
        0 aborted (no space to build PCB)
        0 timed out
        0 dropped due to overflow
        0 dropped due to bucket overflow
        0 dropped due to RST
        0 dropped due to ICMP unreachable
0 SYN,ACKs retransmitted
0 duplicate SYNs received for entries already in the cache
0 SYNs dropped (no route or no space)

*** ip:
187503 total packets received
0 bad header checksums                     0 bad header checksums
0 with size smaller than minimum           0 with size smaller than minimum
0 with data size < data length             0 with data size < data length
0 with length > max ip packet size         0 with ip length > max ip packet size
0 with header length < data size           0 with header length < data size
0 with data length < header length         0 with data length < header length
0 with bad options                         0 with bad options
0 with incorrect version number            0 with incorrect version number
0 fragments received                       0 fragments received
0 fragments dropped (dup or out of space)  0 fragments dropped (dup or out of space)
0 malformed fragments dropped
0 fragments dropped after timeout          0 fragments dropped after timeout
0 packets reassembled ok                   0 packets reassembled ok
187503 packets for this host               187947 packets sent from this host
0 packets for unknown/unsupported protocol
0 packets forwarded
0 packets not forwardable
0 redirects sent
282936 packets sent from this host         282936 total packets received
0 packets sent with fabricated ip header
0 output packets dropped due to no bufs, etc.
0 output packets discarded due to no route
0 output datagrams fragmented              0 output datagrams fragmented
0 fragments created                        0 fragments created
0 datagrams that can't be fragmented       0 datagrams that can't be fragmented
0 datagrams with bad address in header     0 datagrams with bad address in header

vmstat -i on *sender*     ==before===           === after ===
interrupt               total     rate       total     rate
cpu0 softclock           4737       98        5777       99
cpu0 softnet               79        1       41426      714
cpu0 softserial             1        0           1        0
pic0 pin 11               146        3       41777      720
pic0 pin 14              1516       31        1537       26
pic0 pin 3                  1        0           1        0
pic0 pin 0               4905      102        5928      102
Total                   11385      237       96447     1662

vmstat -i on *receiver*  === before===           === after ===
interrupt               total     rate       total     rate
ata0 irq14                  4        0           4        0
bge1 irq7                  48        0          48        0
mux irq11              744037      564     1027879      771
mux irq10                  15        0          15        0
fdc0 irq6                   2        0           2        0
atkbd0 irq1                 1        0           1        0
clk irq0               131831      100      133175       99
rtc irq8               168746      128      170467      127
Total                 1044684      792     1331591      999
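
To compare the transmit interrupt overhead of the two senders, the interrupt deltas can be related to the packet counts. The sketch below is rough and rests on assumptions: it treats "mux irq11" (FreeBSD) and "pic0 pin 11" (NetBSD) as the network interface's interrupt line, since those are the only lines that change noticeably during the tests, and it counts TCP packets in both directions. Other devices sharing the line, or unrelated traffic, would skew the result. The function name is made up for illustration.

```python
def packets_per_interrupt(sent, received, irq_before, irq_after):
    """Packets handled (both directions) per interrupt during one run."""
    return (sent + received) / (irq_after - irq_before)


# FreeBSD as sender: TCP packet counts and the "mux irq11" totals
freebsd_tx = packets_per_interrupt(305178, 205911, 372597, 459967)

# NetBSD as sender: TCP packet counts and the "pic0 pin 11" totals
netbsd_tx = packets_per_interrupt(282935, 187507, 146, 41777)

print(f"FreeBSD sender: {freebsd_tx:.2f} packets/interrupt")
print(f"NetBSD sender:  {netbsd_tx:.2f} packets/interrupt")
```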

More information about the freebsd-hackers mailing list