pf & NAT issue
Bakul Shah
bakul at bitblocks.com
Fri Jan 20 20:31:08 UTC 2017
On Fri, 20 Jan 2017 08:47:43 MST Alan Somers <asomers at freebsd.org> wrote:
> On Fri, Jan 20, 2017 at 3:48 AM, Kristof Provost <kp at freebsd.org> wrote:
> > On 20 Jan 2017, at 9:35, Bakul Shah wrote:
> >>
> >> pf seems to drop NAT connections quite a bit. This seems to
> >> happen much more frequently if there are delays involved (slow
> >> server or interactive use). Almost seems like pf losing
> >> track of NATted connections due to an uninitialized
> >> variable.... Often a retry or two works. Connecting from
> >> outside to forwarded connections to NATTED hosts works fine.
> >>
> >> This problem started after upgrading to freebsd-10. Is there a
> >> bug fix in the works or a known workaround (other than using ipfw
> >> or reverting to 9, which I don't want to do)?
> >>
> > The problem you describe doesn't immediately ring a bell.
> >
> > We'll have to gather a bit more information:
> >
> > * What FreeBSD version are you running exactly?
> > * What's your pf.conf?
> > * Can you perform a network capture of rejected/failed connections? Ideally
> > both on LAN and WAN on the gateway machine. Please capture full packets
> > as pcap files (so tcpdump -s0 -w lan.pcap).
> > * What networking cards are you using?
> >
> > Regards,
> > Kristof
>
> Under heavy load, pf can drop information from its state table. You
> can try increasing state table limits to see if it helps the problem.
> Read the "set limit" section of the pf.conf(5) man page.
>
> -Alan
Thanks for the suggestions. Here's some info. My inline
comments are indented.
$ uname -rm
10.3-RELEASE-p4 i386
$ netstat -n | grep tcp | wc -l
13
    So the machine is lightly loaded.
$ grep -v ^# /etc/pf.conf|uniq
ext_if="rl0"
int_if="em0"
nat on $ext_if inet from ! ($ext_if) to any -> ($ext_if)
    I took out rdr entries during testing. They don't seem
    to affect this issue. I had changed src.track timeout
    to 30 seconds but that didn't seem to change anything.
$ pfctl -s memory
states        hard limit    10000
src-nodes     hard limit    10000
frags         hard limit     5000
table-entries hard limit   200000
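    For reference, raising these limits as Alan suggests might
    look like the fragment below. The numbers are arbitrary
    values picked for illustration, not something I have
    tested here. The line would go after the macros and
    before the nat rule.

```
# /etc/pf.conf fragment (illustrative values, assumed for this sketch)
set limit { states 20000, src-nodes 20000 }
```

    After reloading with "pfctl -f /etc/pf.conf", the new
    values should show up in "pfctl -s memory".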
$ pfctl -s info
Status: Enabled for 167 days 13:40:11         Debug: Urgent
State Table                          Total             Rate
  current entries                        0
  searches                      2870986757          198.3/s   # this seems high...
  inserts                          3428240            0.2/s
  removals                         3428240            0.2/s
Counters
  match                         1482741914          102.4/s
  bad-offset                             0            0.0/s
  fragment                               1            0.0/s
  short                                  0            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                             31            0.0/s
  proto-cksum                            0            0.0/s
  state-mismatch                     28931            0.0/s
  state-insert                           1            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s
$ tcpdump -ni rl0 host ftp4.freebsd.org # in one window
$ tcpdump -ni em0 host 192.168.125.7 # in another
    On an internal machine I did "telnet ftp4.freebsd.org
    ftp", waited for a while and then typed something.
    The following trace is interspersed in the correct
    sequence. Traffic on rl0 (external) is prefixed with <
    and traffic on em0 (internal) with >.
> 11:56:05.743745 IP 192.168.125.7.65042 > 149.20.1.200.21: Flags [S], seq 3080825146, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 176000 ecr 0], length 0
< 11:56:05.743776 IP 173.228.5.8.63716 > 149.20.1.200.21: Flags [S], seq 3080825146, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 176000 ecr 0], length 0
< 11:56:05.763294 IP 149.20.1.200.21 > 173.228.5.8.63716: Flags [S.], seq 3912707359, ack 3080825147, win 65535, options [mss 1460,nop,wscale 11,sackOK,TS val 1468113699 ecr 176000], length 0
> 11:56:05.763313 IP 149.20.1.200.21 > 192.168.125.7.65042: Flags [S.], seq 3912707359, ack 3080825147, win 65535, options [mss 1460,nop,wscale 11,sackOK,TS val 1468113699 ecr 176000], length 0
> 11:56:05.764106 IP 192.168.125.7.65042 > 149.20.1.200.21: Flags [.], ack 1, win 1026, options [nop,nop,TS val 176021 ecr 1468113699], length 0
< 11:56:05.764121 IP 173.228.5.8.63716 > 149.20.1.200.21: Flags [.], ack 1, win 1026, options [nop,nop,TS val 176021 ecr 1468113699], length 0
< 11:56:05.789192 IP 149.20.1.200.21 > 173.228.5.8.63716: Flags [P.], seq 1:55, ack 1, win 32, options [nop,nop,TS val 1468113725 ecr 176021], length 54
> 11:56:05.789204 IP 149.20.1.200.21 > 192.168.125.7.65042: Flags [P.], seq 1:55, ack 1, win 32, options [nop,nop,TS val 1468113725 ecr 176021], length 54
> 11:56:05.895660 IP 192.168.125.7.65042 > 149.20.1.200.21: Flags [.], ack 55, win 1026, options [nop,nop,TS val 176152 ecr 1468113725], length 0
< 11:56:05.895675 IP 173.228.5.8.63716 > 149.20.1.200.21: Flags [.], ack 55, win 1026, options [nop,nop,TS val 176152 ecr 1468113725], length 0
> 11:56:28.168693 IP 192.168.125.7.65042 > 149.20.1.200.21: Flags [P.], seq 1:10, ack 55, win 1026, options [nop,nop,TS val 198426 ecr 1468113725], length 9
< 11:56:28.168712 IP 173.228.5.8.52015 > 149.20.1.200.21: Flags [P.], seq 3080825147:3080825156, ack 3912707414, win 1026, options [nop,nop,TS val 198426 ecr 1468113725], length 9
    Right here we see the problem. NAT mapping for the
    port changed from 63716 to 52015.
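    To spot this kind of change without reading the trace by
    eye, something like the following sketch could be used. It
    assumes the interleaved format shown above (">" for the
    internal interface, "<" for the external one) and pairs
    each outbound internal packet with the next external
    packet to the same destination. The function name
    nat_port_changes and its argument (the internal address
    prefix) are made up for this example.

```shell
#!/bin/sh
# nat_port_changes PREFIX
# Hypothetical helper: reads an interleaved tcpdump trace on stdin
# ("> " = internal interface, "< " = external interface) and prints a
# line whenever the translated source ip.port seen for a given internal
# ip.port differs from the one seen earlier.
# PREFIX is the internal address prefix, e.g. "192.168.125."
nat_port_changes() {
    awk -v lan="$1" '
    # Outbound packet on the internal side: remember source and destination.
    $1 == ">" && index($4, lan) == 1 {
        pending = $4
        dst = $6
        next
    }
    # Next external packet to the same destination: its source is the
    # translated address for the pending internal source.
    $1 == "<" && pending != "" && $6 == dst {
        if ((pending in map) && map[pending] != $4)
            print pending ": " map[pending] " -> " $4 " at " $2
        map[pending] = $4
        pending = ""
    }
    '
}
```

    With the trace above saved to a file (trace.txt is just a
    placeholder name), "nat_port_changes 192.168.125. < trace.txt"
    should print one line for the 11:56:28.168712 packet showing
    173.228.5.8.63716 -> 173.228.5.8.52015. The pairing is
    naive and can mispair packets on a busy link, so treat it
    as a rough aid only.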
Bakul