[PATCH] ng_tag - new netgraph node,
please test (L7 filtering possibility)
Alexander V. Chernikov
admin at su29.net
Wed Jun 14 14:30:32 UTC 2006
Hello Vadim.
You wrote on June 12, 2006, 16:48:50:
Vadim Goncharov> 12.06.06 @ 05:34 Eduardo Meyer wrote:
>> I read the messages and man page but did not understand. Maybe it is
>> my lack of knowledge regarding netgraph? Well, in man page it seems
>> that you looked at ipfw source code (.h in fact) to find out the tag
>> number. Can you explain this?
Vadim Goncharov> Yes, netgraph has always been more or less a system for semi-programmers,
Vadim Goncharov> and this is especially true of ng_tag, as it tries to be a generalized
Vadim Goncharov> interface for manipulating mbuf_tags(9), and those are kernel internals.
Vadim Goncharov> For simple use, however, you don't need to bother with all those details:
Vadim Goncharov> just remember the magic number and where to place it, and ng_tag is then
Vadim Goncharov> simple to use with ipfw tags.
>> A practical example, how could I, for example, block Kazaa or
>> bittorrent based on L7 with ng_tag? Can you please explain the steps
>> on how to do this?
Unfortunately I have never analyzed P2P protocols, but for ICQ the
solution is very simple. Of course, this is not what you asked for,
but it is a practical example of using the ng_tag node.
At the beginning of an ICQ session the server always sends us a packet
carrying 10 bytes of data:
2A 01 XX XX 00 04 00 00 00 01
^  ^  ^  ^  ^  ^
^  ^   SEQ  Length
^  Channel
ICQ flap ID
So we can easily match and block this packet with iplen 50 in ipfw
(20-byte IP header + 20-byte TCP header + 10 bytes of data) and with an
exact match on 8 bytes in ng_bpf.
The following line goes into the ng_bpf(4) script from the man page:
PATTERN="ether[40:2] = 0x2A01 and ether[44:4] = 0x00040000 and
ether[48:2] = 0x0001"
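A minimal sketch, not part of the original script, of how this pattern can be checked offline before it is loaded into ng_bpf. It assumes the packet starts at the IP header and that the IP and TCP headers carry no options. The helper names ether() and matches_icq_hello() are made up for this illustration; ether() only mimics the big-endian read done by the "ether[offset:size]" accessor in tcpdump expressions.

```python
def ether(packet: bytes, offset: int, size: int = 1) -> int:
    """Mimic tcpdump's ether[offset:size]: a big-endian unsigned read."""
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4")
    if offset < 0 or offset + size > len(packet):
        raise IndexError("read past the end of the packet")
    return int.from_bytes(packet[offset:offset + size], "big")


def matches_icq_hello(packet: bytes) -> bool:
    """Evaluate the ICQ pattern on a raw IP packet (no Ethernet header)."""
    # Stand-in for the "iplen 50" check that ipfw does in the real setup.
    if len(packet) != 50:
        return False
    return (ether(packet, 40, 2) == 0x2A01            # FLAP ID and channel
            and ether(packet, 44, 4) == 0x00040000    # length 4, first data bytes
            and ether(packet, 48, 2) == 0x0001)       # last data bytes


# Synthetic packet: 40 bytes of (zeroed) headers plus the 10-byte payload.
hello = bytes(40) + bytes.fromhex("2A 01 12 34 00 04 00 00 00 01")
print(matches_icq_hello(hello))
```

The sequence number bytes at offsets 42 and 43 are deliberately not compared, which is why only 8 of the 10 payload bytes take part in the match.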
Vadim Goncharov> The truth is that ng_tag itself does no traffic analysis at all. It
Vadim Goncharov> merely provides an easy way to distinguish different packets after
Vadim Goncharov> they return to ipfw. Currently the only analyzing node in the FreeBSD
Vadim Goncharov> src tree is ng_bpf(4), and it merely splits incoming packets into two
Vadim Goncharov> streams, matched and not matched. There are reasons for this: netgraph
Vadim Goncharov> needs to be modular, so each node does one small thing, but does it
Vadim Goncharov> well. For a long time ng_bpf was used for other purposes in the kernel,
Vadim Goncharov> and now that new ipfw features have appeared, ng_tag came up for easy
Vadim Goncharov> integration.
Vadim Goncharov> So this is merely a framework that lets you create custom filters. If
Vadim Goncharov> you need to match some kind of traffic, you have to sit down, work out
Vadim Goncharov> which patterns that traffic has, and then program ng_bpf(4) with an
Vadim Goncharov> appropriate filter. In fact, the filter can be generated from
Vadim Goncharov> tcpdump(1) expressions, so you don't need to be a C programmer, and
Vadim Goncharov> that's good, isn't it? :)
>> I don't run -CURRENT but I need this kind of feature very much, I am
>> downloading a 7.0 snapshot just to test this with ipfw tag.
Vadim Goncharov> You'll be able to do this with RELENG_6 in about two weeks. I simply
Vadim Goncharov> couldn't wait a month for the MFC and wrote it earlier :)
>> How does this address the problem of system-level L7 filtering? I always
>> thought that someone would show up with a userland application that
>> tags packets and returns the tag to ipfw filtering, but you came up
>> with a kernel approach. How much better is it, and why, compared to evil
>> regexp evaluation in the kernel, and how efficient is it compared to
>> Linux L7, which is known to fail a lot (let a number of packets pass)?
Vadim Goncharov> Yes, in the general case you are right - the correct way is to have a
Vadim Goncharov> userland application do the analysis; this is easier, simpler and safer
Vadim Goncharov> (imagine a security flaw inside a kernel matcher?). Like snort. But the
Vadim Goncharov> main disadvantage is that it is SLOW. And for many kinds of traffic you
Vadim Goncharov> do not need complete flow analysis, as it is enough to do per-packet
Vadim Goncharov> matching and then say "Huh.. I found such a packet, so the entire
Vadim Goncharov> connection must be of that type". Actually, I found a Linux iptables
Vadim Goncharov> P2P matching module named ipp2p at http://www.ipp2p.org/ which is said
Vadim Goncharov> to work reasonably well, looked at the code and found that a one-packet
Vadim Goncharov> match is enough for this job. So per-packet matching can be implemented
Vadim Goncharov> in the kernel.
Vadim Goncharov> After that I discovered that FreeBSD has already had an in-kernel
Vadim Goncharov> packet matcher for a long time, since 4.0. A brief inspection of the
Vadim Goncharov> ipp2p code showed that most recognized P2P types can be matched by
Vadim Goncharov> tcpdump expressions and thus can be programmed into ng_bpf(4). For some
Vadim Goncharov> patterns that is still not enough, as bpf can't search for a substring
Vadim Goncharov> at a variable (not fixed) offset. We can then imagine another netgraph
Vadim Goncharov> node that does substring search (like iptables --string), so that with
Vadim Goncharov> both bpf and string matching all P2P traffic can be caught.
Vadim Goncharov> Anyway, that work is yet to be done. The main benefit of ng_tag at the
Vadim Goncharov> moment is that anyone who wants to do this no longer faces fundamental
Vadim Goncharov> barriers, such as needing the skills to write a kernel module or even a
Vadim Goncharov> userland matching daemon.
>> Sorry for all these questions, but I am just an average end user,
>> so I cannot understand it only by reading the code.
>>
>> Thank you for your work and help. It seems that I will have a 7.0
>> snapshot doing this job for me until the ipfw tag MFC happens, if I
>> can understand this approach.
Vadim Goncharov> I hope my explanation was clear enough :) Also, if you are going to
Vadim Goncharov> use 7.0, include BPF_JITTER in your kernel config, as this enables
Vadim Goncharov> compilation of filters to native code for bpf and ng_bpf, which will
Vadim Goncharov> speed things up.
Vadim Goncharov> ==========================================================================
Vadim Goncharov> P.S. Here is a quick-and-dirty primer on how to convert ipp2p functions
Vadim Goncharov> into tcpdump(1) expressions as input for ng_bpf(4). Go to
Vadim Goncharov> http://www.ipp2p.org/, download the source, unpack it, open the file
Vadim Goncharov> ipt_ipp2p.c and find the function for your P2P type; let it be
Vadim Goncharov> BitTorrent for our example. So look (I've reformatted that bad Linux
Vadim Goncharov> code a little to be more style(9)'ish):
Vadim Goncharov> int
Vadim Goncharov> search_bittorrent (const unsigned char *payload, const u16 plen)
Vadim Goncharov> {
Vadim Goncharov>     if (plen > 20) {
Vadim Goncharov>         /* test for match 0x13+"BitTorrent protocol" */
Vadim Goncharov>         if (payload[0] == 0x13)
Vadim Goncharov>             if (memcmp(payload+1, "BitTorrent protocol", 19) == 0)
Vadim Goncharov>                 return (IPP2P_BIT * 100);
Vadim Goncharov>         /* get tracker commandos, all starts with GET /
Vadim Goncharov>          * then it can follow: scrape| announce
Vadim Goncharov>          * and then ?hash_info=
Vadim Goncharov>          */
Vadim Goncharov>         if (memcmp(payload,"GET /",5) == 0) {
Vadim Goncharov>             /* message scrape */
Vadim Goncharov>             if (memcmp(payload+5, "scrape?info_hash=", 17)==0)
Vadim Goncharov>                 return (IPP2P_BIT * 100 + 1);
Vadim Goncharov>             /* message announce */
Vadim Goncharov>             if (memcmp(payload+5, "announce?info_hash=", 19)==0)
Vadim Goncharov>                 return (IPP2P_BIT * 100 + 2);
Vadim Goncharov>         }
Vadim Goncharov>     } else {
Vadim Goncharov>         /*
Vadim Goncharov>          * bitcomet encryptes the first packet, so we have to detect another
Vadim Goncharov>          * one later in the flow
Vadim Goncharov>          */
Vadim Goncharov>         /* first try failed, too many missdetections */
Vadim Goncharov>         //if (size == 5 && get_u32(t,0) == __constant_htonl(1) && t[4] < 3)
Vadim Goncharov>         //    return (IPP2P_BIT * 100 + 3);
Vadim Goncharov>
Vadim Goncharov>         /* second try: block request packets */
Vadim Goncharov>         if ((plen == 17) &&
Vadim Goncharov>             (get_u32(payload,0) == __constant_htonl(0x0d)) &&
Vadim Goncharov>             (payload[4] == 0x06) &&
Vadim Goncharov>             (get_u32(payload,13) == __constant_htonl(0x4000)))
Vadim Goncharov>             return (IPP2P_BIT * 100 + 3);
Vadim Goncharov>     }
Vadim Goncharov>     return 0;
Vadim Goncharov> }
Vadim Goncharov> So, what do we see? A BitTorrent packet can start with one of three
Vadim Goncharov> fixed strings (we see the memcmp() checks for them). The author of
Vadim Goncharov> ipp2p employs one more check, but as we can see from the comments, he's
Vadim Goncharov> not sure about it.
Vadim Goncharov> Let's find out the byte sequences for these strings:
Vadim Goncharov> $ echo -n "BitTorrent protocol" | hd
Vadim Goncharov> 00000000  42 69 74 54 6f 72 72 65  6e 74 20 70 72 6f 74 6f  |BitTorrent proto|
Vadim Goncharov> 00000010  63 6f 6c                                          |col|
Vadim Goncharov> 00000013
Vadim Goncharov> $ echo -n "GET /scrape?info_hash=" | hd
Vadim Goncharov> 00000000  47 45 54 20 2f 73 63 72  61 70 65 3f 69 6e 66 6f  |GET /scrape?info|
Vadim Goncharov> 00000010  5f 68 61 73 68 3d                                 |_hash=|
Vadim Goncharov> 00000016
Vadim Goncharov> $ echo -n "GET /announce?info_hash=" | hd
Vadim Goncharov> 00000000  47 45 54 20 2f 61 6e 6e  6f 75 6e 63 65 3f 69 6e  |GET /announce?in|
Vadim Goncharov> 00000010  66 6f 5f 68 61 73 68 3d                           |fo_hash=|
Vadim Goncharov> 00000018
Vadim Goncharov> We can give tcpdump 1, 2 or 4 bytes to compare at a time. The
Vadim Goncharov> "payload" variable in the source points to the beginning of the data in
Vadim Goncharov> the TCP packet. Remember from man ng_tag that tcpdump assumes packets
Vadim Goncharov> have a 14-byte Ethernet header for its arrays like "tcp[]", but packets
Vadim Goncharov> come from ipfw to ng_bpf without this header, and that affects our
Vadim Goncharov> offset calculations. So we must give offsets from the very beginning of
Vadim Goncharov> the packet, which is done through tcpdump's "ether[]" primitive, and
Vadim Goncharov> parse the headers manually. Let's assume, however (for simplicity and
Vadim Goncharov> speed), that the IP and TCP headers carry no options and thus are always
Vadim Goncharov> 20 bytes long each; then ipp2p's "payload[0]" is tcpdump's "ether[40]".
Vadim Goncharov> Also, let's assume that ipfw has checked the packet length for us, so
Vadim Goncharov> we don't do that in netgraph too.
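To make the "no options" assumption concrete, here is a small sketch (an illustration only, not something from ng_tag or ipp2p) that computes where the TCP payload really starts in a raw IPv4 packet. The function names are hypothetical. When it returns 40, the fixed "ether[40]" offsets are valid for that packet; with IP or TCP options present they are not.

```python
def payload_offset(packet: bytes) -> int:
    """Return the offset of the TCP payload in a raw IPv4 packet.

    The packet is assumed to start at the IP header (no Ethernet header),
    as described for packets passed from ipfw to ng_bpf.
    """
    if len(packet) < 20 or packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    if packet[9] != 6:
        raise ValueError("not a TCP packet")
    ip_header_len = (packet[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    if len(packet) < ip_header_len + 20:
        raise ValueError("truncated TCP header")
    # The TCP data offset lives in the high nibble of byte 12 of the TCP header.
    tcp_header_len = (packet[ip_header_len + 12] >> 4) * 4
    return ip_header_len + tcp_header_len


def fixed_offsets_are_valid(packet: bytes) -> bool:
    """True when payload[0] really sits at ether[40] for this packet."""
    return payload_offset(packet) == 40


# Option-less headers: version 4, IHL 5, protocol TCP, TCP data offset 5.
plain = bytearray(40)
plain[0], plain[9], plain[32] = 0x45, 6, 0x50
print(payload_offset(bytes(plain)))
```

Packets for which the check fails are simply missed by a fixed-offset pattern, which is the price of the simplicity and speed mentioned above.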
Vadim Goncharov> Then we simply take the hex bytes in the order hd(1) showed them, as
Vadim Goncharov> this is also network byte order, and write them as tcpdump expressions
Vadim Goncharov> (remember that the first string ("...protocol") actually has 0x13
Vadim Goncharov> prepended to it). So we write the following in the ng_bpf(4) script:
Vadim Goncharov> PATTERN="(ether[40:4]=0x13426974 &&
Vadim Goncharov> ether[44:4]=0x546f7272 &&
Vadim Goncharov> ether[48:4]=0x656e7420 &&
Vadim Goncharov> ether[52:4]=0x70726f74 &&
Vadim Goncharov> ether[56:4]=0x6f636f6c
Vadim Goncharov> ) ||
Vadim Goncharov> (ether[40:4]=0x47455420 &&
Vadim Goncharov> (ether[44:4]=0x2f736372 &&
Vadim Goncharov> ether[48:4]=0x6170653f &&
Vadim Goncharov> ether[52:4]=0x696e666f &&
Vadim Goncharov> ether[56:4]=0x5f686173 &&
Vadim Goncharov> ether[60:2]=0x683d
Vadim Goncharov> ) ||
Vadim Goncharov> (ether[44:4]=0x2f616e6e &&
Vadim Goncharov> ether[48:4]=0x6f756e63 &&
Vadim Goncharov> ether[52:4]=0x653f696e &&
Vadim Goncharov> ether[56:4]=0x666f5f68 &&
Vadim Goncharov> ether[60:4]=0x6173683d)
Vadim Goncharov> ) ||
Vadim Goncharov> (ether[2:2]=57 &&
Vadim Goncharov> ether[40:4]=0x0000000d &&
Vadim Goncharov> ether[44]=0x06 &&
Vadim Goncharov> ether[53:4]=0x00004000)"
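The hand translation above can be cross-checked mechanically. The following is a rough sketch, assuming the same fixed payload offset of 40; the helper name ether_terms() is invented for this example. It splits a byte string into 4-, 2- and 1-byte pieces and prints the corresponding "ether[]" comparisons.

```python
def ether_terms(data: bytes, offset: int) -> list:
    """Split data into 4/2/1-byte chunks and emit tcpdump ether[] terms."""
    terms = []
    pos = 0
    while pos < len(data):
        remaining = len(data) - pos
        # tcpdump can compare only 1, 2 or 4 bytes at a time.
        size = 4 if remaining >= 4 else 2 if remaining >= 2 else 1
        value = data[pos:pos + size].hex()  # bytes are already in network order
        if size == 1:
            index = "ether[%d]" % (offset + pos)
        else:
            index = "ether[%d:%d]" % (offset + pos, size)
        terms.append("%s=0x%s" % (index, value))
        pos += size
    return terms


for text in (b"\x13BitTorrent protocol",
             b"GET /scrape?info_hash=",
             b"GET /announce?info_hash="):
    print(" && ".join(ether_terms(text, 40)))
```

For the three strings used here the generated terms should agree with the hand-written pattern, including the 2-byte tail "ether[60:2]" of the scrape request.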
Vadim Goncharov> Note the last OR block in the expression - this is the translation of
Vadim Goncharov> that "not sure" check for request packets. I've written the packet
Vadim Goncharov> length explicitly: plen=17 + 20-byte IP header + 20-byte TCP header = 57,
Vadim Goncharov> checked at offset 2 in the IP header (the Total Length field),
Vadim Goncharov> according to RFC 791. The construction "get_u32 ==
Vadim Goncharov> __constant_htonl()" means comparing 4-byte values in network byte order
Vadim Goncharov> at the given offset.
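As an illustration of that last block, here is a sketch that evaluates the same four comparisons on a synthetic packet. It again assumes option-less 20-byte IP and TCP headers; the function name matches_bt_request() is made up for the example. Reading the fields big-endian with struct corresponds to comparing against __constant_htonl() values in the C code.

```python
import struct


def matches_bt_request(packet: bytes) -> bool:
    """Evaluate the last OR block of the pattern on a raw IP packet."""
    if len(packet) < 57:
        return False  # a BPF read past the end would reject the packet too
    (total_len,) = struct.unpack_from(">H", packet, 2)   # ether[2:2]
    (first,) = struct.unpack_from(">I", packet, 40)      # ether[40:4]
    (last,) = struct.unpack_from(">I", packet, 53)       # ether[53:4]
    return (total_len == 17 + 20 + 20     # plen + IP header + TCP header = 57
            and first == 0x0000000D
            and packet[44] == 0x06        # ether[44]
            and last == 0x00004000)


# Synthetic packet: IP total length 57, zeroed TCP header, 17-byte payload.
headers = bytearray(40)
headers[0] = 0x45
struct.pack_into(">H", headers, 2, 57)
payload = struct.pack(">IB", 0x0D, 0x06) + bytes(8) + struct.pack(">I", 0x4000)
request = bytes(headers) + payload
print(matches_bt_request(request))
```

The 8 bytes between offsets 45 and 52 are not compared, just as in the ipp2p check.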
Vadim Goncharov> P.P.S. I have not tested this pattern on real packets, as I have no
Vadim Goncharov> BitTorrent traffic at hand today, but it should work.
--
Cheers,
Alexander V. Chernikov mailto:admin at su29.net