pf state disappearing [ adaptive timeout bug ]
Nick Rogers
ncrogers at gmail.com
Fri Jan 22 21:35:02 UTC 2016
On Thu, Jan 21, 2016 at 11:44 AM, Matthew Grooms <mgrooms at shrew.net> wrote:
> On 1/21/2016 11:04 AM, Nick Rogers wrote:
>
>> On Wed, Jan 20, 2016 at 2:01 PM, Matthew Grooms <mgrooms at shrew.net>
>> wrote:
>>
>>> All,
>>>
>>> I have a curious problem with a lightly loaded pair of pf firewalls
>>> running on FreeBSD 10.2-RELEASE. I'm noticing TCP entries are disappearing from
>>> the state table for no good reason that I can see. The entry limit is set
>>> to 100000 and I never see the system go over about 70000 entries, so we
>>> shouldn't be hitting the configured limit ...
>>>
>> In my experience if you hit the state limit, new connections/states are
>> dropped and existing states are unaffected.
>>
>
> Aha! You shook something out of the dusty depths of my slow brain :) I
> believe that what you say is true as long as adaptive timeouts are
> disabled, which by default they are not ...
>
>     Timeout values can be reduced adaptively as the number of state table
>     entries grows.
>
>     adaptive.start
>             When the number of state entries exceeds this value, adaptive
>             scaling begins.  All timeout values are scaled linearly with
>             factor (adaptive.end - number of states) / (adaptive.end -
>             adaptive.start).
>
>     adaptive.end
>             When reaching this number of state entries, all timeout values
>             become zero, effectively purging all state entries immediately.
>             This value is used to define the scale factor, it should not
>             actually be reached (set a lower state limit, see below).
>
>     Adaptive timeouts are enabled by default, with an adaptive.start value
>     equal to 60% of the state limit, and an adaptive.end value equal to
>     120% of the state limit.  They can be disabled by setting both
>     adaptive.start and adaptive.end to 0.
>
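To make that scaling concrete, here is a rough back-of-the-envelope sketch of
how I read that paragraph. This is only my approximation from the man page
text, not the actual pf code, and the numbers are just the ones from this
thread:

/*
 * Rough sketch of the linear scaling described in pf.conf(5) above --
 * my reading of the man page, not the actual pf implementation.
 */
#include <stdio.h>

static unsigned int
scaled_timeout(unsigned int base, unsigned int states,
    unsigned int adaptive_start, unsigned int adaptive_end)
{
	if (states <= adaptive_start)
		return (base);		/* below adaptive.start: no scaling */
	if (states >= adaptive_end)
		return (0);		/* at adaptive.end: purge immediately */
	/* factor = (adaptive.end - states) / (adaptive.end - adaptive.start) */
	return ((unsigned int)((double)base *
	    (adaptive_end - states) / (adaptive_end - adaptive_start)));
}

int
main(void)
{
	/* defaults for a 100k state limit: start = 60000, end = 120000 */
	printf("%u\n", scaled_timeout(86400, 68000, 60000, 120000));
	/* prints 74880 -- tcp.established would still be about 20.8 hours */
	return (0);
}

With the default 60k/120k thresholds and roughly 68k states, tcp.established
should still come out around 20 hours, which is why a state vanishing after a
few idle minutes looks so wrong.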
>>> # pfctl -sm
>>> states hard limit 100000
>>> src-nodes hard limit 100000
>>> frags hard limit 50000
>>> table-entries hard limit 200000
>>>
>>> # pfctl -si
>>> Status: Enabled for 78 days 14:24:18 Debug: Urgent
>>>
>>> State Table Total Rate
>>> current entries 67829
>>> searches 113412118733 16700.2/s
>>> inserts 386313496 56.9/s
>>> removals 386245667 56.9/s
>>> Counters
>>> match 441731678 65.0/s
>>> bad-offset 0 0.0/s
>>> fragment 1090 0.0/s
>>> short 220 0.0/s
>>> normalize 761 0.0/s
>>> memory 0 0.0/s
>>> bad-timestamp 0 0.0/s
>>> congestion 0 0.0/s
>>> ip-option 4366487 0.6/s
>>> proto-cksum 0 0.0/s
>>> state-mismatch 50334 0.0/s
>>> state-insert 10 0.0/s
>>> state-limit 0 0.0/s
>>> src-limit 0 0.0/s
>>> synproxy 0 0.0/s
>>>
>>> This problem is easy to reproduce by establishing an SSH connection to the
>>> firewall itself, letting it sit for a while and then examining the state
>>> table. After a connection is made, I can see the entry with an
>>> established:established state ...
>>>
>>> # pfctl -ss | grep X.X.X.X | grep 63446
>>> all tcp Y.Y.Y.Y:22 <- X.X.X.X:63446 ESTABLISHED:ESTABLISHED
>>>
>>> If I let the SSH session sit for a while and then try to type into the
>>> terminal on the client end, the connection stalls and produces a network
>>> error message. When I look at the pf state table again, the state entry for
>>> the connection is no longer visible. However, the ssh process is still
>>> running and I still see the TCP connection established in the output of
>>> netstat ...
>>>
>>> # netstat -na | grep 63446
>>> tcp4 0 0 Y.Y.Y.Y.22 X.X.X.X.63446 ESTABLISHED
>>>
>>> When I observe the packet flow with tcpdump while a connection stalls,
>>> packets sent from the client are visible on the physical interface but
>>> show up as blocked on the pflog0 interface.
>>>
>> Does this happen with non-SSH connections? It sounds like your SSH
>> client/server interaction is not performing a keep-alive frequently enough
>> to keep the PF state established. If no packets are sent over the
>> connection (state) for some time, then PF will time out (remove) the state.
>> At this point your SSH client still believes it has a working connection,
>> so it tries to send packets when you resume typing, but they are blocked by
>> your PF rules, which likely specify "flags S/SA keep state", either
>> explicitly or implicitly (it is the filter rule default). That means
>> packets which don't match an existing state and are not part of the
>> initial SYN handshake of a TCP connection get blocked.
>>
>
> It happened with UDP SIP and long-running HTTP sessions that sit idle as
> well. The SSH connection was just the easiest to test. Besides that, the
> default TCP timeout value for established connections is quite high at
> 86400s. An established TCP connection should be able to sit for a full day
> with no traffic before the related state table entry gets evicted.
>
>> Look at your settings in pf.conf for "timeout tcp.established", which
>> affects how long an idle ESTABLISHED state lives before it times out. Also
>> look into ClientAliveInterval in the sshd configuration, which I believe is
>> 0 (disabled) by default, meaning sshd never probes an idle client with a
>> keep-alive. If you don't want PF to time out an idle SSH connection, then
>> ideally ClientAliveInterval is less than or equal to (i.e., at least as
>> frequent as) PF's tcp.established timeout value.
>>
>
> Thanks for the suggestion! I completely forgot about the adaptive timeout
> options until I double checked the settings based on your reply :) My values
> are set to default for TCP and extended a bit for UDP. The adaptive.start
> value was calculated at 60k for the 100k state limit. That in particular
> looked way too relevant to be a coincidence. After increasing the value to
> 90k, my total state count started increasing and leveled out around 75k.
> It's always hovered around 65k up until now, so 10k state entries were being
> discarded on a regular basis ...
>
Glad I could help somehow. I forgot about adaptive timeout as well and I've
been using a much shorter tcp.established timeout than the default for
years, so I'm familiar with the "disappearing states" problem.
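For anyone who finds this thread later, the combination I was getting at looks
roughly like this. These are illustrative placeholder values (the interface
macro and keep-alive interval are just examples), not a drop-in config:

# pf.conf -- "flags S/SA keep state" is already the implicit default on
# pass rules, so an idle connection breaks once its state expires
ext_if = "em0"                      # placeholder, use your outside interface
set timeout tcp.established 86400
pass in on $ext_if proto tcp to port 22 flags S/SA keep state

# sshd_config -- have sshd probe idle clients so the pf state stays fresh
# (ClientAliveInterval defaults to 0, i.e. no probes at all)
ClientAliveInterval 300
ClientAliveCountMax 3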
> # pfctl -si
> Status: Enabled for 0 days 02:25:41 Debug: Urgent
>
> State Table Total Rate
> current entries 77759
> searches 483831701 55352.0/s
> inserts 825821 94.5/s
> removals 748060 85.6/s
> Counters
> match 27118754 3102.5/s
> bad-offset 0 0.0/s
> fragment 0 0.0/s
> short 0 0.0/s
> normalize 0 0.0/s
> memory 0 0.0/s
> bad-timestamp 0 0.0/s
> congestion 0 0.0/s
> ip-option 6655 0.8/s
> proto-cksum 0 0.0/s
> state-mismatch 0 0.0/s
> state-insert 0 0.0/s
> state-limit 0 0.0/s
> src-limit 0 0.0/s
> synproxy 0 0.0/s
>
> # pfctl -st
> tcp.first 120s
> tcp.opening 30s
> tcp.established 86400s
> tcp.closing 900s
> tcp.finwait 45s
> tcp.closed 90s
> tcp.tsdiff 30s
> udp.first 600s
> udp.single 600s
> udp.multiple 900s
> icmp.first 20s
> icmp.error 10s
> other.first 60s
> other.single 30s
> other.multiple 60s
> frag 30s
> interval 10s
> adaptive.start 90000 states
> adaptive.end 120000 states
> src.track 0s
>
> I think there may be a problem with the code that calculates adaptive
> timeout values that is making it way too aggressive. If by default it's
> supposed to decrease linearly between 60% and 120% of the state table max,
> I shouldn't be losing TCP connections that are only idle for a few minutes
> when the state table is < 70% full. Unfortunately that appears to be the
> case. At most this should have decreased the 86400s timeout by about 17% to
> 72000s for established TCP connections.
>
That doesn't make sense to me either. Even if the math were off by a factor
of 10, the state should still live for a couple of hours, not a few minutes.
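Going by the formula in pf.conf(5), with the default thresholds and your ~70k
states the numbers work out to roughly:

    factor          = (adaptive.end - states) / (adaptive.end - adaptive.start)
                    = (120000 - 70000) / (120000 - 60000) = ~0.83
    tcp.established = 86400s * 0.83 = ~72000s (about 20 hours)

so even a tenfold error would only bring that down to ~7200s, nowhere near the
couple of minutes you were seeing.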
> I've tested this for a few hours now and all my idle SSH sessions have
> been rock solid. If anyone else is scratching their head over a problem
> like this, I would suggest disabling the adaptive timeout feature or raising
> adaptive.start to a much higher value. Maybe one of the pf maintainers can
> chime in and shed some light on why this is happening. If not, I'm going to
> file a bug report as this certainly feels like one.
>
Did you go with making adaptive timeout less aggressive, or did you disable it
entirely? I would think that if adaptive timeout were really that broken, more
people would notice this problem, especially me, since I have many servers
running a very short tcp.established timeout. Still, the fact that you are
seeing this kind of weirdness has me concerned about how the adaptive setting
is affecting my environment.
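Either way, for the archives, both options are one-liners in pf.conf, straight
from the man page excerpt above (untested on my side):

# disable adaptive timeouts entirely
set timeout { adaptive.start 0, adaptive.end 0 }

# or just push the thresholds up, as you did
set timeout { adaptive.start 90000, adaptive.end 120000 }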
> Thanks again,
>
> -Matthew