seq# of RST in tcp_dropwithreset

Thu Jun 14 00:56:51 UTC 2012

On 06/13/12 06:30, Andre Oppermann wrote:
> On 07.06.2012 22:28, George Neville-Neil wrote:
>>
>> On Mar 27, 2012, at 18:13 , Navdeep Parhar wrote:
>>
>>> When the kernel decides to respond with a RST to an incoming TCP
>>> segment, it uses its ack# (if valid) as the seq# of the RST.  See this
>>> in tcp_dropwithreset:
>>>
>>>     if (th->th_flags & TH_ACK) {
>>>         tcp_respond(tp, mtod(m, void *), th, m, (tcp_seq)0,
>>>             th->th_ack, TH_RST);
>>>     } else {
>>>         if (th->th_flags & TH_SYN)
>>>             tlen++;
>>>         tcp_respond(tp, mtod(m, void *), th, m, th->th_seq+tlen,
>>>             (tcp_seq)0, TH_RST|TH_ACK);
>>>     }
>>>
>>> This can have some unexpected results.  I observed this on a link with
>>> a very high delay (B is FreeBSD, A could be anything).
>>>
>>> 1. There is a segment in flight from A to B.  The ack# is X (all tx
>>> from B to A is up to date and acknowledged).
>>> 2. socket is closed on B.  B sends a FIN with seq# X.
>>> 3. The segment from A arrives and elicits a RST from B.  The seq# of
>>> this RST will again be X.  A receives the FIN and then the RST with
>>> identical sequence numbers.  The situation resolves itself eventually,
>>> when A retransmits and the retransmitted segment ACKs the FIN too and
>>> so the next time around B sends a RST with the "correct" seq# (one
>>> after the FIN).
>>>
>>> If there is a local tcpcb for the connection with state >=
>>> ESTABLISHED, wouldn't it be more accurate to use its snd_max as the
>>> seq# of the RST?
>>>
>>
>> Hi Navdeep,
>>
>> Sorry I missed this so many months ago, but jhb@ was kind enough to
>> point this query out to me.  My understanding of correct operation in
>> this case, is that we do not want to move the sequence number until we
>> have received the ACK of our FIN, as any other value would indicate to
>> the TCP on A that we have received their ACK of our FIN, which, in this
>> case, we have not.  The fact that there isn't a better way to indicate
>> the error is a tad annoying, but, and others can correct me if they
>> think I'm wrong, this is the correct way for the stacks to come to
>> eventual agreement on the closing of the connection.
>
> In this case Navdeep is correct.  As long as a tcpcb is around no RST
> should be generated in step 3 if we are in FIN_WAIT_1, FIN_WAIT_2, CLOSING
> or TIME_WAIT.

Why not?  Generating a RST is the right thing to do when faced with 
excess data that cannot be delivered because the local socket has 
closed.  My question/concern was about the seq# that the kernel's TCP 
chose for this RST, not the RST itself.
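
To make the seq# concern concrete, here is a minimal standalone sketch
(not kernel code) of the two choices for the ACK-bearing case.  The
struct mini_tcpcb and the helpers send_fin(), rst_seq_current() and
rst_seq_proposed() are hypothetical names made up for illustration; only
snd_max and th_ack correspond to fields discussed in this thread.  The
state >= ESTABLISHED check from the original suggestion is left out to
keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Hypothetical, trimmed-down stand-in for the send side of a tcpcb. */
struct mini_tcpcb {
	tcp_seq snd_max;	/* highest sequence number sent, plus one */
};

/* B closes its socket: the FIN consumes one sequence number. */
static void
send_fin(struct mini_tcpcb *tp)
{
	assert(tp != NULL);
	tp->snd_max++;
}

/* Behaviour of the quoted branch: the RST seq# is the peer's ack#. */
static tcp_seq
rst_seq_current(tcp_seq th_ack)
{
	return (th_ack);
}

/* Suggested alternative: prefer snd_max when a tcpcb is available. */
static tcp_seq
rst_seq_proposed(const struct mini_tcpcb *tp, tcp_seq th_ack)
{
	return (tp != NULL ? tp->snd_max : th_ack);
}
```

With snd_max == X before the close, the FIN goes out with seq# X and
send_fin() advances snd_max to X+1.  A late segment from A still carries
ack# X, so rst_seq_current() yields X (the same seq# as the FIN), while
rst_seq_proposed() yields X+1.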

> What is the code path leading to tcp_dropwithreset()?  Normally this should
> only be reached if no tcpcb is found.

I don't remember it off the top of my head, but a quick look at 
tcp_input.c suggests it must have been this piece of code in 
tcp_do_segment():

	/*
	 * If new data are received on a connection after the
	 * user processes are gone, then RST the other end.
	 */
	if ((so->so_state & SS_NOFDREF) &&
	    tp->t_state > TCPS_CLOSE_WAIT && tlen) {
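
For reference, the condition above can be exercised outside the kernel
with a small sketch like the following.  The constant values are written
from memory of the FreeBSD headers and should be treated as assumptions,
and rst_on_orphan_data() is a hypothetical helper name, not something in
tcp_input.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, mirroring sys/socketvar.h and netinet/tcp_fsm.h. */
#define	SS_NOFDREF	0x0001	/* no file table reference any more */

enum {
	TCPS_CLOSED = 0, TCPS_LISTEN, TCPS_SYN_SENT, TCPS_SYN_RECEIVED,
	TCPS_ESTABLISHED, TCPS_CLOSE_WAIT, TCPS_FIN_WAIT_1, TCPS_CLOSING,
	TCPS_LAST_ACK, TCPS_FIN_WAIT_2, TCPS_TIME_WAIT
};

/*
 * Hypothetical helper: true when a segment carrying data arrives for a
 * socket that has no file descriptor reference left and whose
 * connection state is past CLOSE_WAIT.
 */
static bool
rst_on_orphan_data(int so_state, int t_state, int tlen)
{
	assert(tlen >= 0);
	return ((so_state & SS_NOFDREF) != 0 &&
	    t_state > TCPS_CLOSE_WAIT && tlen != 0);
}
```

In the scenario from my original mail B has closed the socket and sent
its FIN, so it is in FIN_WAIT_1 when A's data segment arrives and the
check is true.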

Regards,
Navdeep