vmx: strange issue, related to tso?
Patrick Kelsey
pkelsey at freebsd.org
Sat Dec 28 13:33:47 UTC 2019
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236999 appears to be converging to the same issue. If you want to create a review for the proposed patch that would be great.
It would be good to have corroborating test results for the proposed patch, something I will probably not be able to try to obtain for at least a couple of days.
-Patrick
> On Dec 28, 2019, at 3:17 AM, Vincenzo Maffione <vmaffione at freebsd.org> wrote:
>
>
> I think you are correct. Good catch!
> We should file a bug and/or create a review on Phabricator (if you are busy, I could do that).
>
> Thanks,
> Vincenzo
>
>> On Sat, Dec 28, 2019 at 05:44, Patrick Kelsey <pkelsey at freebsd.org> wrote:
>>
>>> On Fri, Dec 27, 2019 at 5:01 PM Andriy Gapon <avg at freebsd.org> wrote:
>>> On 27/12/2019 15:34, Vincenzo Maffione wrote:
>>> > It may be useful to check what happens if you replace the vmx0 interface with an
>>> > em0.
>>> > In this way you would know if the issue is vmx-specific or not.
>>>
>>> I'll put this on my to-do, can't test right now.
>>>
>>> But one thing I noticed when comparing the TCP control block of the connection
>>> before and after the "TSO dance" is that TF_TSO gets cleared after any outgoing
>>> traffic while TSO is disabled on the interface. And the flag does not come back
>>> after TSO is reenabled. Any new connections get the flag, of course.
>>>
>>> So, I indeed suspect that there is a problem with vmx TSO.
>>> As another data point, an older system from before vmx->iflib conversion does
>>> not exhibit the problem.
>>>
>>> > On Thu, Dec 26, 2019 at 20:04, Andriy Gapon <avg at freebsd.org
>>> > <mailto:avg at freebsd.org>> wrote:
>>> >
>>> >
>>> > Maybe someone would have any pointers for me with the following problem.
>>> > This happens with CURRENT as of the beginning of September.
>>> > I connect via ssh to a VM running on VMware, it has a single vmx0 interface.
>>> > The problem is that when I print a moderately large amount of text to the
>>> > terminal (e.g., tail -100 /var/log/messages) I literally see it printed in
>>> > chunks with noticeable pauses between chunks. It takes several seconds for all
>>> > lines to get shown. This happens every time I do it.
>>> > There is an interesting twist. If I disable TSO with ifconfig vmx0 -tso and
>>> > print the same output in the same ssh session, then the output is smooth and
>>> > fast as I would expect it. The lines scroll by almost instantly.
>>> > If I then re-enable TSO and produce the same output again in the same ssh session,
>>> > it is still fast.
>>> >
>>> > It appears that the TCP connection gets tuned to some very sub-optimal
>>> > parameters when TSO is enabled. When I disable TSO, the parameters get re-tuned
>>> > to better values and the values stick when I re-enable TSO.
>>> > This is just a conjecture, of course.
>>> >
>>> > I have some tcpdump captures, but I do not see anything that would really stand
>>> > out. One difference is that in the slow case only "full sized" packets are sent
>>> > while in the fast case there are shorter packets with push flag.
>>> >
>>> > Some packets for the slow case:
>>> > 00:00:00.453202 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq
>>> > 37:1485, ack 36, win 128, options [nop,nop,TS val 1403195134 ecr 4966311],
>>> > length 1448
>>> > 00:00:00.096859 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack 1485,
>>> > win 1026, options [nop,nop,TS val 4966864 ecr 1403195134], length 0
>>> > 00:00:00.442963 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq
>>> > 1485:2933, ack 36, win 128, options [nop,nop,TS val 1403195664 ecr 4966864],
>>> > length 1448
>>> > 00:00:00.092677 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack 2933,
>>> > win 1026, options [nop,nop,TS val 4967400 ecr 1403195664], length 0
>>> > 00:00:00.437336 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq
>>> > 2933:4381, ack 36, win 128, options [nop,nop,TS val 1403196194 ecr 4967400],
>>> > length 1448
>>> > 00:00:00.097190 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack 4381,
>>> > win 1026, options [nop,nop,TS val 4967934 ecr 1403196194], length 0
>>> >
>>> > Some packets after the TSO dance:
>>> > 00:00:00.000450 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq
>>> > 4077:5525, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>>> > length 1448
>>> > 00:00:00.000016 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq
>>> > 5525:6097, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>>> > length 572
>>> > 00:00:00.000009 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack 5525,
>>> > win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
>>> > 00:00:00.000303 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq
>>> > 6097:7545, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>>> > length 1448
>>> > 00:00:00.000019 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq
>>> > 7545:8117, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>>> > length 572
>>> > 00:00:00.000013 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack 7545,
>>> > win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
>>> > 00:00:00.000162 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq
>>> > 8117:9565, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>>> > length 1448
>>> > 00:00:00.000012 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq
>>> > 9565:10137, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>>> > length 572
>>> > 00:00:00.000007 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack 9565,
>>> > win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
>>> >
>>> > What else can I examine to debug the problem further?
>>> > Thank you!
>>> > --
>>> > Andriy Gapon
>>> >
>>>
>>
>> I am not able to test this at the moment, nor likely in the very near future, but I did have a few minutes to do some code reading and now believe that the following is part of the problem, if not the entire problem. Using r353803 as a reference, I believe line 1323 in sys/dev/vmware/vmxnet3/if_vmx.c (in vmxnet3_isc_txd_encap()) should be:
>>
>> sop->hlen = hdrlen + ipi->ipi_tcp_hlen;
>>
>> instead of the current:
>>
>> sop->hlen = hdrlen;
>>
>> This can be seen by going back to r333813 and examining the CSUM_TSO case of vmxnet3_txq_offload_ctx(). The final increment of *start in that case is what was literally lost in translation when converting the driver to iflib.
>>
>> -Patrick
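
For readers following the proposed one-line change quoted above, here is a minimal sketch of the arithmetic involved. It is not the driver code: struct pkt_info and both helper functions are hypothetical stand-ins, and it assumes that hdrlen in vmxnet3_isc_txd_encap() covers the Ethernet and IP headers only, which is how the quoted analysis reads. Under that assumption, the header length handed to the device for a TSO packet stops short of the TCP header unless ipi_tcp_hlen is added.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the packet-info fields relevant here.
 * Only the TCP header length corresponds to a name quoted in the
 * thread (ipi_tcp_hlen); the other two fields are assumptions.
 */
struct pkt_info {
	uint8_t ehdrlen;   /* Ethernet header length, e.g. 14 without a VLAN tag */
	uint8_t ip_hlen;   /* IP header length, e.g. 20 for IPv4 without options */
	uint8_t tcp_hlen;  /* TCP header length, including TCP options */
};

/* Header length as set before the proposed fix: L2 + L3 headers only (assumed). */
static inline unsigned
tso_hlen_before(const struct pkt_info *pi)
{
	return (pi->ehdrlen + pi->ip_hlen);
}

/* Header length with the proposed fix: the TCP header is included as well. */
static inline unsigned
tso_hlen_after(const struct pkt_info *pi)
{
	return (tso_hlen_before(pi) + pi->tcp_hlen);
}
```

With the segments from the captures quoted above (IPv4, TCP timestamp option, so a 32-byte TCP header in front of 1448 bytes of payload), the two helpers give 14 + 20 = 34 and 14 + 20 + 32 = 66 bytes respectively. The 32-byte difference is the "final increment" that the analysis says was lost in the iflib conversion.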