TCP question: Is this simultaneous close handling broken?

John-Mark Gurney jmg at funkthat.com
Tue Jan 7 20:45:10 UTC 2014


Peter Wemm wrote this message on Tue, Jan 07, 2014 at 12:10 -0800:
> On 1/6/14, 3:23 PM, Peter Wemm wrote:
> > We've hit a weird problem at work when dealing with simultaneous closes.
> > In this particular case, it's a FreeBSD-7.4 machine talking to some random
> > Linux host.
> > 
> > There is a client/server protocol in use, and both ends are doing a close
> > at the same time.  It might be a shutdown, I haven't seen all the code yet.
> [..]
> > A packet capture, with relative timestamps:
> > 
> > 000050 freebsd.28411 > linux.14001: F 6486:6486(0) ack 232
> > 000031 linux.14001 > freebsd.28411: F 232:232(0) ack 6486
> > 000333 linux.14001 > freebsd.28411: . ack 6487
> > [200ms retransmit timer fires on linux]
> > 200490 linux.14001 > freebsd.28411: F 232:232(0) ack 6487
> > 000011 freebsd.28411 > linux.14001: . ack 233
> [..]
> > What am I looking at?  Who's at fault?  It looks like we're failing to
> > recognize the ack for our fin.
> 
> It definitely looks like FreeBSD at fault.  We've simply not acked their FIN
> until they retransmitted it.
> 
> I've looked at the commit logs and I don't see anything obvious that stands
> out to me for a fix for this.  Most of the changes seem to be lock structure
> changes rather than protocol fixes.  I see things like ECN and other protocol
> features being added as well.
> 
> Where should I look in the code?

I've been looking in tcp_input.c.  When we send the FIN, we are in
FIN_WAIT_1, and then upon receiving the FIN, we should transition to
CLOSING.  This happens in tcp_do_segment when we receive a packet w/
the _FIN bit set while in FIN_WAIT_1.  The next question is: if we are
hitting this code (a printf would confirm it), why isn't the ACK being
sent out?  Only a page or so down from this, you see:
        /*
         * Return any desired output.
         */
        if (needoutput || (tp->t_flags & TF_ACKNOW))
                (void) tcp_output(tp);

And the only way TF_ACKNOW isn't set is if for some reason the TF_NEEDSYN
flag is still set (from just above the previous code)...
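
To make the logic above concrete, here is a rough user-space model of it.
This is only a sketch, not the kernel code: the toy_ names, the struct
layout and the flag values are all made up for illustration, and it only
covers a FIN we haven't seen before.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy user-space model, NOT the kernel code.  Names mirror the kernel's,
 * but the struct layout and numeric values are made up for illustration.
 */
enum toy_state {
    TOY_ESTABLISHED,
    TOY_CLOSE_WAIT,
    TOY_FIN_WAIT_1,
    TOY_FIN_WAIT_2,
    TOY_CLOSING,
    TOY_TIME_WAIT
};

#define TOY_TF_ACKNOW   0x1u    /* ack the peer immediately */
#define TOY_TF_DELACK   0x2u    /* leave the ack for the delayed-ack timer */
#define TOY_TF_NEEDSYN  0x4u    /* connection is only half-synchronized */

struct toy_tcpcb {
    enum toy_state t_state;
    unsigned int   t_flags;
    unsigned int   rcv_nxt;
};

/*
 * Process a first-time FIN from the peer.  Returns 1 if the model would
 * call tcp_output() right away, 0 if the ACK would be left for later.
 */
static int
toy_recv_fin(struct toy_tcpcb *tp, int needoutput)
{
    assert(tp != NULL);

    /* NEEDSYN delays the ACK; otherwise the FIN is acked immediately. */
    if (tp->t_flags & TOY_TF_NEEDSYN)
        tp->t_flags |= TOY_TF_DELACK;
    else
        tp->t_flags |= TOY_TF_ACKNOW;
    tp->rcv_nxt++;              /* the FIN consumes one sequence number */

    switch (tp->t_state) {
    case TOY_ESTABLISHED:
        tp->t_state = TOY_CLOSE_WAIT;
        break;
    case TOY_FIN_WAIT_1:        /* simultaneous close: our FIN is unacked */
        tp->t_state = TOY_CLOSING;
        break;
    case TOY_FIN_WAIT_2:
        tp->t_state = TOY_TIME_WAIT;
        break;
    default:
        break;
    }

    /* Same shape as the "Return any desired output" check. */
    return (needoutput || (tp->t_flags & TOY_TF_ACKNOW)) ? 1 : 0;
}
```

In this model, a FIN arriving in FIN_WAIT_1 with NEEDSYN clear moves us to
CLOSING and asks for output right away; with NEEDSYN set we still move to
CLOSING but no ACK goes out, which is the kind of silence the capture shows.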

So, maybe a printf on the transition to _CLOSING to make sure it's hit,
plus a print of t_flags at the same location to make sure _NEEDSYN isn't
set would help us understand what is wrong...

If we don't get the printf, then there is other weird stuff going on...
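
Something along these lines is the kind of trace I have in mind.  Again
just a sketch you can compile in userland: toy_trace_closing and the flag
value are hypothetical, and in the kernel this would simply be a printf at
the FIN_WAIT_1 -> CLOSING transition that prints tp->t_flags.

```c
#include <assert.h>
#include <stdio.h>

#define DBG_TF_NEEDSYN 0x4u     /* made-up value, stands in for TF_NEEDSYN */

/*
 * Format the trace line for the FIN_WAIT_1 -> CLOSING transition into buf.
 * Returns 1 if the NEEDSYN bit is set in t_flags (the suspicious case),
 * 0 otherwise.
 */
static int
toy_trace_closing(char *buf, size_t len, unsigned int t_flags)
{
    int needsyn;

    assert(buf != NULL && len > 0);

    needsyn = (t_flags & DBG_TF_NEEDSYN) != 0;
    snprintf(buf, len,
        "tcp: FIN_WAIT_1 -> CLOSING t_flags=0x%08x needsyn=%d\n",
        t_flags, needsyn);
    return needsyn;
}
```

If the line shows up with needsyn=1, that explains the missing ACK; if it
shows up with needsyn=0, the problem is further down, around tcp_output.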

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."

