Application layer classifier for ipfw
Patrick Tracanelli
eksffa at freebsdbrasil.com.br
Tue Aug 5 19:51:32 UTC 2008
Patrick Tracanelli escreveu:
> Mike Makonnen escreveu:
>> Mike Makonnen wrote:
>>> Patrick Tracanelli wrote:
>>>>
>>>> To let you know of my current (real world) tests:
>>>>
>>>> - Wireless Internet Provider 1:
>>>> - 4Mbit/s of Internet Traffic
>>>> - Classifying default protocols + soulseek + ssh
>>>> - Classifying 100Mbit/s of dump over ssh
>>>>
>>>> Results in:
>>>> No latency added, very low CPU usage, no packets dropping.
>>>>
>>>> - Wireless ISP 2:
>>>> - 21 Mbit/s of Internet Traffic
>>>> - Classifying default protocols + soulseek + ssh
>>>>
>>>> Results in:
>>>> No tcp or udp traffic at all; everything that gets diverted
>>>> never comes out of the divert socket, and ipfw-classifyd logs
>>>>
>>>> Aug 1 12:07:35 ourofino last message repeated 58 times
>>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: bittorrent
>>>> (rule 50000)
>>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: edonkey
>>>> (rule 50000)
>>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: fasttrack
>>>> (rule 50000)
>>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: gnutella
>>>> (rule 1000)
>>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: soulseek
>>>> (rule 50000)
>>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: ssh
>>>> (rule 50000)
>>>> Aug 1 12:18:28 ourofino ipfw-classifyd: unable to write to divert
>>>> socket: Operation not permitted
>>>> Aug 1 12:18:50 ourofino last message repeated 90 times
>>>
>>> Hmmm... this part means that the call to sendto(2) to write the
>>> packet back into network stack failed. This explains why you are not
>>> seein g any traffic comming back out of the divert socket, but I
>>> don't see why it would suddenly fail with a permission error. Could
>>> this be a kernel bug?
>>>> Aug 1 12:18:51 ourofino ipfw-classifyd: packet dropped: input queue
>>>> full
>>>> Aug 1 12:19:11 ourofino last message repeated 94 times
>>>>
>>>> Raised queue len a lot (up to 40960), when the application starts it
>>>> uses up to 25% CPU and a second after that, CPU usage gets lower the
>>>> 0.1%.
>>>
>>> This looks like a deadlock. If it weren't able to process packets
>>> fast enough the cpu usage should be high even as it's spewing "packet
>>> dropped" messages. Can you send me some more information like memory
>>> usage and the firewall script you are using? How much of the
>>> 21Mbits/s of traffic is P2P? If you reduce the number of protocols
>>> you are trying to match against does the behavior change? Using
>>> netstat -w1 -I<interface> can you tell me how many packets per second
>>> we're talking about for 4Mbits/s and 21Mbit/s? Also, the timestamps
>>> from the log file seem to show that the daemon is running for approx.
>>> 34 sec. before the first "unable to write to write to divert socket"
>>> message. Is it passing traffic during this time? Thanks.
>>>
>>> I've uploaded a newer version. Can you try that also please. It
>>> includes:
>>> o SIGHUP forces it to re-read its configuration file
>>> o rc.d script
>>> o minor optimization (calls pthread_cond_signal with the mutex
>>> unlocked)
>>> o code cleanup
>>>
>>> Also, for your convenience I have attached a patch against the
>>> earlier version that removes a debugging printf that should remove
>>> spammage to your log files (the current version has it removed already).
>>>
>>
>> Ooops, a few minutes after I sent this email I found a couple of bugs
>> (one major, and one minor). They were in the original tarball as well
>> as the newer one I uploaded earlier today. I've uploaded a fixed
>> version of the code. Can you try that instead please.
>>
>> Also, to help track down performance issues I've modified the Makefile
>> to build a profiled version of the application so you can use gprof(1)
>> to figure out where any problems lie.
>>
>> Cheers.
>>
>
> On gcc 3.4.6, -Wno-pointer-sign compiler switch is unknown and
> therefore compiling fails:
>
> cc -O2 -fno-strict-aliasing -pipe -O2 -pipe -pg -g -Wsystem-headers
> -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter
> -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type
> -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wcast-align
> -Wunused-parameter -Wchar-subscripts -Winline -Wnested-externs
> -Wredundant-decls -Wno-pointer-sign -c classifyd.c
> cc1: error: unrecognized command line option "-Wno-pointer-sign"
> *** Error code 1
>
> Stop in /usr/home/setup/ipfw-classifyd.
>
> If removed, does compile fine.
>
> On amd64 does not compile:
>
> cc -O2 -fno-strict-aliasing -pipe -O2 -pipe -pg -g -Wsystem-headers
> -Werror -Wall -Wno-
> format-y2k -W -Wno-unused-parameter -Wstrict-prototypes
> -Wmissing-prototypes -Wpointer
> -arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow
> -Wcast-align -Wunus
> ed-parameter -Wchar-subscripts -Winline -Wnested-externs
> -Wredundant-decls -Wno-pointer
> -sign -fPIC -pipe -c classifyd.c
> cc1: warnings being treated as errors
> classifyd.c: In function 'write_pthread':
> classifyd.c:813: warning: format '%d' expects type 'int', but argument 4
> has type 'size_
> t'
> *** Error code 1
>
> The rules Im running:
>
> 00001 1379 320702 fwd 192.168.100.11 ip from any to {
> 201.91.17.200/29 or dst-ip 189.44.216.168/29 } out via fxp0
> 00002 0 0 fwd 192.168.100.11 ip from any to {
> 201.91.17.200/29 or dst-ip 189.44.216.168/29 } out via fxp2
> 00003 39137 7760042 fwd 189.52.140.1 ip from 189.52.140.10 to any out
> via fxp0
> 00004 34462 10801535 fwd 189.52.140.3 ip from 189.52.140.11 to any out
> via fxp0
> 00005 68780 29374414 fwd 189.52.140.1 ip from 189.52.140.18 to any out
> via fxp0
> 00006 12154 2003678 fwd 189.52.140.3 ip from 189.52.140.19 to any out
> via fxp0
> 00500 4741 770166 divert 7777 tcp from any to any via fxp0
> 00501 172 38599 divert 7777 udp from any to any via fxp0
> 49999 823215 453482966 allow ip from any to any
> 65535 0 0 allow ip from any to any
>
> Memory usage while running classifyd:
>
> Mem: 67M Active, 27M Inact, 44M Wired, 24K Cache, 111M Buf, 859M Free
> Swap: 1024M Total, 1024M Free
>
> classifyd's CPU usage:
>
> 5198 root 3 20 0 6132K 5420K kserel 1:14 24.52%
> ipfw-classifyd
>
> System load averga for 5 minutes is 0.2 while not running classifyd and
> 0.8 while running.
>
> Logs with the latest (downloaded a few minutes ago) code:
>
> Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: bittorrent
> (rule 50000)
> Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: edonkey (rule
> 50000)
> Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: fasttrack
> (rule 50000)
> Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: gnutella (rule
> 50000)
> Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: soulseek (rule
> 50000)
> Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: ssh (rule 50000)
> Aug 4 10:15:36 ourofino ipfw-classifyd: packet dropped: input queue full
> Aug 4 10:16:07 ourofino last message repeated 1594 times
> Aug 4 10:16:19 ourofino last message repeated 766 times
>
> I miss the debug lines! ;D
>
> # netstat -w1 -I fxp0
> input (fxp0) output
> packets errs bytes packets errs bytes colls
> 535 0 286009 515 0 165965 0
> 416 0 347448 456 0 153585 0
> 444 0 309008 449 0 185087 0
> 493 0 190090 474 0 202637 0
> 547 0 197882 562 0 233751 0
> 409 0 288318 461 0 249421 0
> 434 0 173193 464 0 261512 0
> 437 0 246516 450 0 258908 0
> 458 0 200348 481 0 157800 0
> 423 0 250006 427 0 97723 0
> 367 0 261323 349 0 96405 0
> 592 0 178843 556 0 91032 0
> 450 0 248174 432 0 112650 0
> 390 0 244258 349 0 106532 0
> 442 0 303159 392 0 125664 0
> 407 0 237804 350 0 84367 0
> 460 0 207586 456 0 86633 0
> 364 0 200758 366 0 99798 0
> 313 0 241031 325 0 85746 0
> 463 0 248284 470 0 113385 0
>
> With above protocols, wont matter if I run with the default queue len,
> with -q 4096 or -q 40960. Its always never enough and I get the "input
> queue full" logs.
>
> However, if I remove add only 1 or 2 protocols (tried bittorrend only
> and later, edonkey only), things do work fine, for a few seconds, but than:
>
> Aug 4 10:22:58 ourofino ipfw-classifyd: Loaded Protocol: bittorrent
> (rule 50000)
> Aug 4 10:24:32 ourofino ipfw-classifyd: Loaded Protocol: bittorrent
> (rule 50000)
> Aug 4 10:28:02 ourofino ipfw-classifyd: unable to write to divert
> socket: Invalid argument
> Aug 4 10:28:33 ourofino last message repeated 20 times
> Aug 4 10:29:03 ourofino last message repeated 18 times
> Aug 4 10:29:05 ourofino kernel: pid 5654 (ipfw-classifyd), uid 0:
> exited on signal 6 (core dumped)
> Aug 4 10:30:06 ourofino ipfw-classifyd: Loaded Protocol: bittorrent
> (rule 50000)
> Aug 4 10:30:08 ourofino ipfw-classifyd: packet dropped: input queue full
> Aug 4 10:30:09 ourofino last message repeated 186 times
> Aug 4 10:30:17 ourofino ipfw-classifyd: unable to write to divert
> socket: Invalid argument
> Aug 4 10:30:27 ourofino last message repeated 2 times
> Aug 4 10:32:43 ourofino ipfw-classifyd: Loaded Protocol: edonkey (rule
> 50000)
> Aug 4 10:33:04 ourofino ipfw-classifyd: unable to write to divert
> socket: Invalid argument
> Aug 4 10:33:22 ourofino last message repeated 11 times
> Aug 4 10:33:23 ourofino kernel: pid 5901 (ipfw-classifyd), uid 0:
> exited on signal 6 (core dumped)
>
> I get the " unable to write to divert socket: Invalid argument"
> constantly and when classifyd dies (core dump with signal 6) I get the
> following debug output:
>
> Assertion failed: (datalen >= 0), function classify_pthread, file
> classifyd.c, line 646.
>
> What other useful output I can send you?
>
I tried figuring out myself some extra clues, but still stucked where I
began.
--
Patrick Tracanelli
FreeBSD Brasil LTDA.
Tel.: (31) 3516-0800
316601 at sip.freebsdbrasil.com.br
http://www.freebsdbrasil.com.br
"Long live Hanin Elias, Kim Deal!"
More information about the freebsd-net
mailing list