read() returns ETIMEDOUT on steady TCP connection
Andre Oppermann
andre at freebsd.org
Tue Apr 22 23:47:13 UTC 2008
Andre Oppermann wrote:
> Mark Hills wrote:
>> On Mon, 21 Apr 2008, Andre Oppermann wrote:
>>
>>> Mark Hills wrote:
>>>> On Sun, 20 Apr 2008, Peter Jeremy wrote:
>>
>>>>> I can't explain the problem but it definitely looks like a resource
>>>>> starvation issue within the kernel.
>>>>
>>>> I've traced the source of the ETIMEDOUT within the kernel to
>>>> tcp_timer_rexmt() in tcp_timer.c:
>>>>
>>>>     if (++tp->t_rxtshift > TCP_MAXRXTSHIFT) {
>>>>             tp->t_rxtshift = TCP_MAXRXTSHIFT;
>>>>             tcpstat.tcps_timeoutdrop++;
>>>>             tp = tcp_drop(tp, tp->t_softerror ?
>>>>                 tp->t_softerror : ETIMEDOUT);
>>>>             goto out;
>>>>     }
>>>
>>> Yes, this is related to either lack of mbufs to create a segment
>>> or a problem in sending it. That may be a full interface queue, a
>>> bandwidth manager (dummynet) or some firewall internally rejecting
>>> the segment (ipfw, pf). Do you run any firewall in stateful mode?
>>
>> There's no firewall running.
>>
>>>> I'm new to FreeBSD, but it seems to imply that it's reaching a
>>>> limit of a number of retransmits of sending ACKs on the TCP
>>>> connection receiving the inbound data? But I checked this using
>>>> tcpdump on the server and could see no retransmissions.
>>>
>>> When you have internal problems the segment never makes it to the
>>> wire and thus you won't see it in tcpdump.
>>>
>>> Please report the output of 'netstat -s -p tcp' and 'netstat -m'.
>>
>> Posted below. You can see it in there: "131 connections dropped by
>> rexmit timeout"
>>
>>>> As a test, I ran a simulation with the necessary changes to increase
>>>> TCP_MAXRXTSHIFT (including adding appropriate entries to
>>>> tcp_syn_backoff[] and tcp_backoff[]) and it appeared I was able to
>>>> reduce the frequency of the problem occurring, but not to a usable
>>>> level.
>>>
>>> Possible causes are timers that fire too early. Resource starvation
>>> (you are doing a lot of traffic). Or of course some bug in the code.
>>
>> As I said in my original email, the data transfer doesn't stop or
>> splutter, it's simply cut mid-flow. Sounds like something happening
>> prematurely.
>>
>> Thanks for the help,
>
> The output doesn't show any obvious problems. I have to write some
> debug code to run on your system. I'll do that later today if time
> permits. Otherwise tomorrow.

http://people.freebsd.org/~andre/tcp_output-error-log.diff
Please apply this patch and enable the sysctl net.inet.tcp.log_debug=1
and report any output. You will likely get some (normal) noise from syncache.
What we are looking for is reports from tcp_output.
--
Andre